From shameer at ncbs.res.in  Wed Aug  1 01:45:45 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Wed, 1 Aug 2007 11:15:45 +0530 (IST)
Subject: [Bioperl-l] Perl 3D OpenGL
In-Reply-To: <04BCAD9E-CC25-4F0A-85B1-FBA91C64CE7D@uiuc.edu>
References: <152401c7d224$8e2455b0$6e4e7c0a@HPONE>
	<25A5F0A3-1CC3-46B5-8976-A24C451204E7@jays.net>
	<04BCAD9E-CC25-4F0A-85B1-FBA91C64CE7D@uiuc.edu>
Message-ID: <49637.192.168.1.1.1185947145.squirrel@mail.ncbs.res.in>

Hi,
OpenGL/3D contributions are always welcome!
What about a Perl-OpenGL/3D implementation of a web-based 3D viewer like Jmol?

 http://jmol.sourceforge.net/

(So we don't need to worry about Java installation and the like :) Develop it
and deploy it in Perl - eternal happiness!)
-- 
SK
>
> On Jul 31, 2007, at 7:00 AM, Jay Hannah wrote:
>
>> On Jul 29, 2007, at 4:08 PM, Grafman Productions wrote:
>>> If this posting is inappropriate, please let me know - my apologies.
>>
>> Not at all. AFAIK this is the perfect place to discuss any
>> contributions you're motivated to make to the BioPerl project.
>>
>>> I recently came across an article on BioPerl, and it occurred to me
>>> that
>>> there might be some need for 3D rendering within your BioPerl
>>> project.
>>>
>>> I released a number of new/updated Perl OpenGL (POGL) modules this
>>> year,
>>> along with benchmarks that demonstrate that it performs comparably
>>> to C.
>>>
>>> If there's a need for 3D features within BioPerl, and if I can be
>>> of any
>>> assistance in helping to add such features, I would enjoy the
>>> opportunity.
>>
>> I know nothing about 3D modeling in biology, nor do I hang out with
>> any protein structure folks, but 3D always sounds sexy. -grin-
>>
>> If you're new to bioinformatics (I certainly am) you might want to
>> read this:
>>
>>    http://en.wikipedia.org/wiki/Protein_structure
>>
>> Because that's probably where your 3D work would be used. Especially
>> note the "Software" section, where you'll find some of the
>> "competition".  :)
>>
>> There's some cool stuff out there. I don't know what all would or
>> wouldn't be time well spent in Perl / BioPerl.
>>
>> HTH,
>>
>> Jay Hannah
>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah
>
> I agree that protein structure is the best place for something like
> this.
>
> It's a wide open area as far as I'm concerned; in fact I would say
> that Bio::Structure is getting pretty dated, so if anyone wants to
> take it over, refactor the code, and so on I don't have a problem.
>
> chris
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From Alicia.Amadoz at uv.es  Wed Aug  1 03:13:11 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Wed, 1 Aug 2007 09:13:11 +0200 (CEST)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
Message-ID: <1664224328amadoz@uv.es>

Hi, I would like to save the hit sequences from a blast result in a fasta
file. I have tried a few things, but I am having problems using Bio::SearchIO
and Bio::SeqIO. I hope someone can help me with this. Here is my current
code:

# my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" => "fasta");
my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format" => "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
         my $hseq = $hsp->hit_string();
         # $seq_out->write_seq($hseq);
         $seq_out->write_result($hseq);
      }
   }
}

Here is the error:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: ResultWriter not defined.

I couldn't find any kind of documentation about ResultWriter.
Thanks in advance,
Alicia


From xianranli78 at yahoo.com.cn  Wed Aug  1 04:11:53 2007
From: xianranli78 at yahoo.com.cn (Xianran Li)
Date: Wed, 1 Aug 2007 16:11:53 +0800
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
References: <1664224328amadoz@uv.es>
Message-ID: <001101c7d413$a0d79aa0$ed07a8c0@BGI.LOCAL>

$hsp->hit_string() returns the hit sequence as a plain string rather than a Bio::Seq object. So you should first construct a Bio::Seq object; then you can use $seq_out->write_seq($hseq_obj) to write the sequence to a fasta file.

# my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>"fasta");
  my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
         my $hseq = $hsp->hit_string(); 
            $hseq =~ s/-//g; #### remove the gap within the aligment
         my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 
         # $seq_out->write_seq($hseq);
         $seq_out->write_result($hseq_obj);
      }
   }
}

Xianran
----- Original Message ----- 
From: "Alicia Amadoz" <Alicia.Amadoz at uv.es>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, August 01, 2007 3:13 PM
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file


> Hi, I would like to save my hit sequences from a blast result in a fasta
> file. I am trying some things but I have problems using Bio::SearchIO
> and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> code:
> 
> # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> "fasta");
> my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> => "fasta");
> while(my $result = $blast_report->next_result()) {
>    while(my $hit = $result->next_hit()) {
>       while(my $hsp = $hit->next_hsp()) {
>          my $hseq = $hsp->hit_string();
>          # $seq_out->write_seq($hseq);
>          $seq_out->write_result($hseq);
>       }
>    }
> }
> 
> Here the error is,
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: ResultWriter not defined.
> 
> I couldn't find any kind of documentation about ResultWriter.
> Thanks in advance,
> Alicia
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From Alicia.Amadoz at uv.es  Wed Aug  1 06:25:29 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Wed, 1 Aug 2007 12:25:29 +0200 (CEST)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
Message-ID: <5927683277amadoz@uv.es>

Hi, I have tried what you suggested and I also get some errors.
With this code,

my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
	my $hseq = $hsp->hit_string(); 
        $hseq =~ s/-//g; #### remove the gap within the aligment
        my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 
        $seq_out->write_seq($hseq_obj);
      }
   }				
}

I have the following error:

Can't locate object method "write_seq" via package "Bio::SearchIO::fasta"

And using the write_result method with this code,

my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
	my $hseq = $hsp->hit_string(); 
        $hseq =~ s/-//g; #### remove the gap within the aligment
        my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 
        $seq_out->write_result($hseq_obj);
      }
   }				
}

I have again this kind of error:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: ResultWriter not defined.
STACK: Error::throw

So, what else can I try?? Thanks in advance,
Alicia


From neetisomaiya at gmail.com  Wed Aug  1 07:28:40 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Wed, 1 Aug 2007 16:58:40 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
Message-ID: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>

I have downloaded the omim.txt file from the NCBI ftp site and I am running my
attached parser on this file; the parser stops partway through with this error:

------------- EXCEPTION  -------------
MSG: a part/organism must be assigned
STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
STACK toplevel parse_omim_original.pl:47

--------------------------------------

What is the reason for this?
Can anyone guide me please.

-- 
-Neeti
Even my blood says, B positive

From jay at jays.net  Wed Aug  1 09:30:50 2007
From: jay at jays.net (Jay Hannah)
Date: Wed, 1 Aug 2007 09:30:50 -0400 (EDT)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <5927683277amadoz@uv.es>
References: <5927683277amadoz@uv.es>
Message-ID: <Pine.LNX.4.64.0708010926370.3555@ferret.jays.net>

On Wed, 1 Aug 2007, Alicia Amadoz wrote:
> Hi, I have tried what you suggested and I get also some errors.
> With this code,
>
> my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> => "fasta");
> while(my $result = $blast_report->next_result()) {
>   while(my $hit = $result->next_hit()) {
>      while(my $hsp = $hit->next_hsp()) {
> 	my $hseq = $hsp->hit_string();
>        $hseq =~ s/-//g; #### remove the gap within the aligment
>        my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
>        $seq_out->write_seq($hseq_obj);
>      }
>   }
> }
>
> I have the following error:
>
> Can't locate object method "write_seq" via package "Bio::SearchIO::fasta"

You don't want to write_seq() to a SearchIO, you want to write_seq() to a 
SeqIO. Try this:

my $seq_out = Bio::SeqIO->new(-file => ">$fasfilename", -format => "fasta");
while(my $result = $blast_report->next_result()) {
    while(my $hit = $result->next_hit()) {
       while(my $hsp = $hit->next_hsp()) {
         my $hseq = $hsp->hit_string();
         $hseq =~ s/-//g; #### remove the gaps within the alignment
         my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
         $seq_out->write_seq($hseq_obj);
       }
    }
}

(Untested.)

HTH,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah

From cjfields at uiuc.edu  Wed Aug  1 11:02:07 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 1 Aug 2007 10:02:07 -0500
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
Message-ID: <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>

Neeti,

Only post to one list email address, namely the one I'm responding to  
and the one shown here:

http://bioperl.org/mailman/listinfo/bioperl-l

The others are aliases so you essentially posted three times.  As for  
your question: there was no attached script or any additional  
information (bioperl version would have also been nice), so we can't  
help you until we have something more to work with.

chris

On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:

> I have downloaded the omim.txt file from NCBI ftp site and I am  
> running my
> attached parser on this file, the parser run stops in between with  
> this :-
>
> ------------- EXCEPTION  -------------
> MSG: a part/organism must be assigned
> STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> STACK toplevel parse_omim_original.pl:47
>
> --------------------------------------
>
> What is the reason for this?
> Can anyone guide me please.
>
> -- 
> -Neeti
> Even my blood says, B positive
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From torsten.seemann at infotech.monash.edu.au  Wed Aug  1 20:50:06 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Thu, 2 Aug 2007 10:50:06 +1000
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <1664224328amadoz@uv.es>
References: <1664224328amadoz@uv.es>
Message-ID: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>

Alicia,

> Hi, I would like to save my hit sequences from a blast result in a fasta
> file. I am trying some things but I have problems using Bio::SearchIO
> and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> code:
> # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> "fasta");
> my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> => "fasta");
> ...
>        my $hseq = $hsp->hit_string();
>          # $seq_out->write_seq($hseq);
>          $seq_out->write_result($hseq);

You have encountered two common problems for BioPerl beginners:

1. "fasta" means two different things! In SearchIO it refers to the
output format of the "fasta" sequence alignment software. In SeqIO it
refers to a file format that stores just sequences. Confusing, I know.
You need SeqIO and write_seq, not SearchIO and write_result.

2. $hseq is a STRING which has the raw sequence letters in it.
However, the write_seq() method needs a Bio::Seq object (which has
extra details like the name and ID) not a raw string.

The example code Jay Hannah supplied in his reply looks pretty good,
you should try it.
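
A minimal sketch of point 2, assuming BioPerl is installed. The id 'hit_1',
the sequence, and the helper name string_to_fasta are made up for
illustration, and the in-memory filehandle is only there so that the snippet
does not touch the disk:

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqIO;

# Wrap a raw sequence string in a Bio::Seq object and return it as FASTA text.
# (Hypothetical helper; not part of BioPerl.)
sub string_to_fasta {
    my ($id, $raw) = @_;

    # Write into a scalar instead of a file on disk.
    my $buffer = '';
    open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";

    # SeqIO (not SearchIO) is the writer for sequence files.
    my $seq_out = Bio::SeqIO->new(-fh => $fh, -format => 'fasta');

    # write_seq() wants an object, so give the raw string an id first.
    my $seq_obj = Bio::Seq->new(-display_id => $id, -seq => $raw);
    $seq_out->write_seq($seq_obj);
    $seq_out->close;

    return $buffer;
}

print string_to_fasta('hit_1', 'ACGTACGT');
```

With a real file you would pass -file => ">$fasfilename" instead of -fh, as
in Jay's code.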

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University

From Alicia.Amadoz at uv.es  Thu Aug  2 03:06:54 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Thu, 2 Aug 2007 09:06:54 +0200 (CEST)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
Message-ID: <3579584634amadoz@uv.es>

Hi, thanks for your help and suggestions. I have tried the example code
of Jay Hannah and it works perfectly. But what I need to save in fasta
format is the whole database sequence that is similar to my query
sequence. I don't fully understand the difference between
hit_string() and query_string(). Is hit_string() the whole similar
sequence, or just the part of it that is aligned to my query string?

With the previous code I get sequences of different lengths,
all with the same id as my query string, so I am not sure that I am doing
what I need to do. Can anyone shed light on this point?

Thank you very much for your help.
Alicia

> Alicia,
> 
> > Hi, I would like to save my hit sequences from a blast result in a fasta
> > file. I am trying some things but I have problems using Bio::SearchIO
> > and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> > code:
> > # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> > "fasta");
> > my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> > => "fasta");
> > ...
> >        my $hseq = $hsp->hit_string();
> >          # $seq_out->write_seq($hseq);
> >          $seq_out->write_result($hseq);
> 
> You have encountered two common problems for BioPerl beginners:
> 
> 1. "fasta" means two different things! In SearchIO it refers to the
> output format of the "fasta" sequence alignment software. In SeqIO it
> refers to a file format that stores just sequences. Confusing, I know.
> You need SeqIO and write_seq, not SearchIO and write_result.
> 
> 2. $hseq is a STRING which has the raw sequence letters in it.
> However, the write_seq() method needs a Bio::Seq object (which has
> extra details like the name and ID) not a raw string.
> 
> The example code Jay Hannah supplied in his reply looks pretty good,
> you should try it.
> 
> -- 
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> 
> 


From xianranli78 at yahoo.com.cn  Thu Aug  2 04:56:04 2007
From: xianranli78 at yahoo.com.cn (Xianran Li)
Date: Thu, 2 Aug 2007 16:56:04 +0800
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
	<3579584634amadoz@uv.es>
Message-ID: <003701c7d4e2$f7a34bc0$ed07a8c0@BGI.LOCAL>

----- Original Message ----- 
From: "Alicia Amadoz" <Alicia.Amadoz at uv.es>
To: "Torsten Seemann" <torsten.seemann at infotech.monash.edu.au>; <bioperl-l at bioperl.org>
Cc: <jay at jays.net>
Sent: Thursday, August 02, 2007 3:06 PM
Subject: Re: [Bioperl-l] trying to save blast hit sequences to fasta file


> Hi, thanks for your help and suggestions. I have tried the example code
> of Jay Hannah and it works perfectly. But what I need to save in fasta
> format is the whole sequence in the database that is similar to my query
> sequence. I don't understand very well the difference between
> hit_string() and query_string(), are they the whole sequence that is
> similiar (about hit_string), a part of the whole sequence or just the
> part that is aligned to my query string? 

hit_string() returns the aligned part of the subject sequence from your database, and query_string() returns the aligned part of the query. The two strings will be identical unless there are mutations (mismatches) and/or gaps within the alignment.
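
To illustrate with made-up data (a sketch only; the two strings below are invented, not taken from a real BLAST report), the aligned strings of one HSP have the same length and use '-' for gaps, and stripping the gaps gives back only the residues covered by the alignment:

```perl
use strict;
use warnings;

# Invented aligned strings, shaped like what hit_string() and query_string()
# return for one HSP: equal length, with '-' marking alignment gaps.
my $query_aln = 'ATGC-GTTA';
my $hit_aln   = 'ATGCAGT-A';

# Remove gap characters to recover the ungapped residues of the aligned region.
sub strip_gaps {
    my ($aligned) = @_;
    (my $ungapped = $aligned) =~ tr/-//d;
    return $ungapped;
}

printf "query: %s\n", strip_gaps($query_aln);   # prints "query: ATGCGTTA"
printf "hit:   %s\n", strip_gaps($hit_aln);     # prints "hit:   ATGCAGTA"
```

Either way, what you get is only the aligned region of each sequence, which is why the lengths differ from one HSP to the next.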

> 
> With the previous code what I have are different sequences in length
> with the same id as my query string, so I am not sure that I am doing
> what I need to do. Any light on this point?

Did you set $id before this line?
  
my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 

If you didn't, then all the sequences retrieved will get the same id. The following is a simple way to avoid this problem.

my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" => "fasta");
my $i = 0;
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
         $i++;
         my $hseq = $hsp->hit_string();
         $hseq =~ s/-//g; #### remove the gaps within the alignment
         my $id = $i; ###### specify the id
         my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
         # write_seq() is the SeqIO method; write_result() belongs to SearchIO
         $seq_out->write_seq($hseq_obj);
      }
   }
}
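
A plain counter gives unique but uninformative ids. As a variation (a sketch only, not tested against a real report), the id could combine the hit name with a per-hit HSP number. The helper next_hsp_id and the sample names are hypothetical; in the loop above the name would presumably come from $hit->name():

```perl
use strict;
use warnings;

# Build an id such as "seqA_hsp2" from a hit name and a per-hit HSP counter.
# $counts is a hash reference that remembers how many HSPs each hit has had.
sub next_hsp_id {
    my ($counts, $hit_name) = @_;
    my $n = ++$counts->{$hit_name};
    return sprintf '%s_hsp%d', $hit_name, $n;
}

my %hsp_counts;
print next_hsp_id(\%hsp_counts, 'seqA'), "\n";   # prints "seqA_hsp1"
print next_hsp_id(\%hsp_counts, 'seqA'), "\n";   # prints "seqA_hsp2"
print next_hsp_id(\%hsp_counts, 'seqB'), "\n";   # prints "seqB_hsp1"
```

The returned string would then be passed as -display_id in place of $i.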


Xianran 

> 
> Thank you very much for your help.
> Alicia
> 
> > Alicia,
> > 
> > > Hi, I would like to save my hit sequences from a blast result in a fasta
> > > file. I am trying some things but I have problems using Bio::SearchIO
> > > and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> > > code:
> > > # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> > > "fasta");
> > > my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> > > => "fasta");
> > > ...
> > >        my $hseq = $hsp->hit_string();
> > >          # $seq_out->write_seq($hseq);
> > >          $seq_out->write_result($hseq);
> > 
> > You have encountered two common problems for BioPerl beginners:
> > 
> > 1. "fasta" means two different things! In SearchIO it refers to the
> > output format of the "fasta" sequence alignment software. In SeqIO it
> > refers to a file format that stores just sequences. Confusing, I know.
> > You need SeqIO and write_seq, not SearchIO and write_result.
> > 
> > 2. $hseq is a STRING which has the raw sequence letters in it.
> > However, the write_seq() method needs a Bio::Seq object (which has
> > extra details like the name and ID) not a raw string.
> > 
> > The example code Jay Hannah supplied in his reply looks pretty good,
> > you should try it.
> > 
> > -- 
> > --Torsten Seemann
> > --Victorian Bioinformatics Consortium, Monash University
> > 
> > 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From neetisomaiya at gmail.com  Thu Aug  2 02:20:33 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 2 Aug 2007 11:50:33 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
Message-ID: <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>

Hi,

The script is attached with this mail.
I am using bioperl-1.4.

Regards,
Neeti.

On 8/1/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> Neeti,
>
> Only post to one list email address, namely the one I'm responding to
> and the one shown here:
>
> http://bioperl.org/mailman/listinfo/bioperl-l
>
> The others are aliases so you essentially posted three times.  As for
> your question: there was no attached script or any additional
> information (bioperl version would have also been nice), so we can't
> help you until we have something more to work with.
>
> chris
>
> On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:
>
> > I have downloaded the omim.txt file from NCBI ftp site and I am
> > running my
> > attached parser on this file, the parser run stops in between with
> > this :-
> >
> > ------------- EXCEPTION  -------------
> > MSG: a part/organism must be assigned
> > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> > STACK toplevel parse_omim_original.pl:47
> >
> > --------------------------------------
> >
> > What is the reason for this?
> > Can anyone guide me please.
> >
> > --
> > -Neeti
> > Even my blood says, B positive
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


-- 
-Neeti
Even my blood says, B positive
-------------- next part --------------
A non-text attachment was scrubbed...
Name: parse_omim_original.pl
Type: application/x-perl
Size: 5998 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070802/fbbee8db/attachment.bin 

From neetisomaiya at gmail.com  Thu Aug  2 09:00:33 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 2 Aug 2007 18:30:33 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
Message-ID: <764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com>

Also,
As per the following link, we can fetch data from the genemap file as well:
http://search.cpan.org/~birney/bioperl-1.2.3/Bio/Phenotype/OMIM/OMIMparser.pm

But when I try to do so exactly as described in the above link, I get no
genemap data. There are OMIM ids that are present in both the omim.txt and
genemap files; when I parse such entries, data from both files should be
returned, but I am not getting it.

For example, when running the attached script for OMIM id 100790, I get all
the data from omim.txt, but the cytoposition, gene symbol, etc. from genemap
do not come through, though they are present in the genemap file.

Please help me find what could be going wrong.

On 8/2/07, neeti somaiya <neetisomaiya at gmail.com> wrote:
>
> Hi,
>
> The script is attached with this mail.
> I am using bioperl-1.4.
>
> Regards,
> Neeti.
>
> On 8/1/07, Chris Fields < cjfields at uiuc.edu> wrote:
> >
> > Neeti,
> >
> > Only post to one list email address, namely the one I'm responding to
> > and the one shown here:
> >
> > http://bioperl.org/mailman/listinfo/bioperl-l
> >
> > The others are aliases so you essentially posted three times.  As for
> > your question: there was no attached script or any additional
> > information (bioperl version would have also been nice), so we can't
> > help you until we have something more to work with.
> >
> > chris
> >
> > On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:
> >
> > > I have downloaded the omim.txt file from NCBI ftp site and I am
> > > running my
> > > attached parser on this file, the parser run stops in between with
> > > this :-
> > >
> > > ------------- EXCEPTION  -------------
> > > MSG: a part/organism must be assigned
> > > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> > > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> > > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> > > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> > > STACK toplevel parse_omim_original.pl:47
> > >
> > > --------------------------------------
> > >
> > > What is the reason for this?
> > > Can anyone guide me please.
> > >
> > > --
> > > -Neeti
> > > Even my blood says, B positive
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> >
>
>
> --
> -Neeti
> Even my blood says, B positive
>
>


-- 
-Neeti
Even my blood says, B positive
-------------- next part --------------
A non-text attachment was scrubbed...
Name: parse_omim_original.pl
Type: application/x-perl
Size: 8750 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070802/6bdb009c/attachment.bin 

From cjfields at uiuc.edu  Thu Aug  2 13:05:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 2 Aug 2007 12:05:55 -0500
Subject: [Bioperl-l] Fwd: nonstop repeated output from Remote_blast with xml
References: <38B65B2C-A36D-41FB-83C9-7D7B55156CCD@uiuc.edu>
Message-ID: <EF284983-9A37-4F0F-BF92-04C7804275A0@uiuc.edu>

For archiving purposes; of course I forgot to cc the list!

-c

Begin forwarded message:

> From: Chris Fields <cjfields at uiuc.edu>
> Date: August 2, 2007 12:04:59 PM CDT
> To: gyang at plantbio.uga.edu
> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
> with xml
>
> Guojun,
>
> Make sure to keep this on the mail list for archiving purposes.
>
> It could be that the RID is not being removed properly (if it isn't  
> removed then you will repeatedly retrieve your BLAST report).  The  
> new error you are seeing may be coming from whatever XML::SAX  
> backend parser is being used (XML::SAX::ExpatXS, XML::SAX::Expat,  
> etc); it doesn't look bioperl-related and there is an eval which  
> catches this stuff in SearchIO::blastxml.  Does text parsing work?
>
> Could you directly send me your script or add it to a new bug  
> report as an attachment?
>
> http://www.bioperl.org/wiki/Bugs
>
> chris
>
> On Aug 2, 2007, at 11:07 AM, Guojun Yang wrote:
>
>> Hi,Chris,
>> I installed the latest version of bioperl, in addition to the  
>> repeated output problem, there are new problems with parsing:
>>
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>  No close tag marker [Ln: 4126, Col: 0]
>>
>> ---------------------------------------------------
>>
>> Would you please kindly give me a hint on this,
>> Thanks a lot,
>> Guojun
>>
>>
>> ----- Original Message -----
>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>> To: gyang at plantbio.uga.edu
>> Cc: bioperl-l List [mailto:bioperl-l at lists.open-bio.org]
>> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
>> with xml
>>
>>
>>> Make sure to keep responses on the mail list.
>>>> You might want to run a full install, just in case.  If I remember
>>> correctly Sendu made some changes a while back in the BLAST-related
>>> modules which may be related to this.  At the very least install/
>>> upgrade all modules in Bio::Tools::Run.
>>>> chris
>>>> On Jul 31, 2007, at 9:40 AM, Guojun Yang wrote:
>>>>> Thanks, Chris,
>>>> But when I replaced the old RemoteBlast.pm with the new one, I got
>>>> "can't locate the object method "retrieve_parameter"". Does this
>>>> mean I need to install something else?
>>>> Guojun
>>>>
>>>> ----- Original Message -----
>>>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>>>> To: gyang at plantbio.uga.edu
>>>> Cc: bioperl-l at bioperl.org
>>>> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast
>>>> with xml
>>>>
>>>>
>>>>>> On Jul 30, 2007, at 3:58 PM, Guojun Yang wrote:
>>>>>>> I am running remoteblast and using readmethod "xml", I  
>>>>>>> noticed that
>>>>>> it is printing the output repeatedly nonstop. It's like in a  
>>>>>> loop.
>>>>>> Did anybody notice this before? Can anybody help me getting  
>>>>>> out of
>>>>>> this?
>>>>>> Thanks a lot,
>>>>>>
>>>>>>
>>>>>> Guojun Yang
>>>>>> University of Georgia
>>>>>> Not seeing that using bioperl-live; you may need to update
>>>>> RemoteBlast.pm as this sounds similar to an issue that popped up
>>>>> earlier in the spring.
>>>>>> chris
>>>>>
>>>> Christopher Fields
>>> Postdoctoral Researcher
>>> Lab of Dr. Robert Switzer
>>> Dept of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>>>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug  2 13:51:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 2 Aug 2007 12:51:27 -0500
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
	<764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com>
Message-ID: <921F31D6-3CA9-483A-8AFF-B3555E9768C4@uiuc.edu>

Neeti,

The genemap wasn't loaded in all cases; don't know what the reasoning  
for it was, but it is fixed in CVS now  
(Bio::Phenotype::OMIM::OMIMparser, specifically).  I would recommend  
that you install a full upgrade to at least bioperl 1.5.2 before  
using this; I can't guarantee it will work with bioperl 1.4.

chris

On Aug 2, 2007, at 8:00 AM, neeti somaiya wrote:

> Also,
> As per the following links we can fetch data from the genemap file  
> as well
> :-
> http://search.cpan.org/~birney/bioperl-1.2.3/Bio/Phenotype/OMIM/ 
> OMIMparser.pm
>
> But when I am trying to do so in the exact manner as given in the  
> above
> link, I get no data. As in there are OMIM ids which are present in  
> both the
> omim.txt and genemap files, and for such cases when I parse and  
> fetch data,
> data from both files should be obtained, but I aint getting it.
>
> For eg. while running the attached script, for OMIM id 100790, I  
> get all
> data from omim.txt but the cytoposition, gene symbol etc from  
> genemap is not
> coming, though it is present in the genemap file.
>
> Please help me find what could be going wrong.
>
> On 8/2/07, neeti somaiya <neetisomaiya at gmail.com> wrote:
>>
>> Hi,
>>
>> The script is attached with this mail.
>> I am using bioperl-1.4.
>>
>> Regards,
>> Neeti.
>>
>> On 8/1/07, Chris Fields < cjfields at uiuc.edu> wrote:
>>>
>>> Neeti,
>>>
>>> Only post to one list email address, namely the one I'm  
>>> responding to
>>> and the one shown here:
>>>
>>> http://bioperl.org/mailman/listinfo/bioperl-l
>>>
>>> The others are aliases so you essentially posted three times.  As  
>>> for
>>> your question: there was no attached script or any additional
>>> information (bioperl version would have also been nice), so we can't
>>> help you until we have something more to work with.
>>>
>>> chris
>>>
>>> On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:
>>>
>>>> I have downloaded the omim.txt file from NCBI ftp site and I am
>>>> running my
>>>> attached parser on this file, the parser run stops in between with
>>>> this :-
>>>>
>>>> ------------- EXCEPTION  -------------
>>>> MSG: a part/organism must be assigned
>>>> STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
>>>> STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
>>>> STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
>>>> STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
>>>> STACK toplevel parse_omim_original.pl:47
>>>>
>>>> --------------------------------------
>>>>
>>>> What is the reason for this?
>>>> Can anyone guide me please.
>>>>
>>>> --
>>>> -Neeti
>>>> Even my blood says, B positive
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>> Christopher Fields
>>> Postdoctoral Researcher
>>> Lab of Dr. Robert Switzer
>>> Dept of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>
>>>
>>>
>>>
>>
>>
>> --
>> -Neeti
>> Even my blood says, B positive
>>
>>
>
>
> -- 
> -Neeti
> Even my blood says, B positive
> <parse_omim_original.pl>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug  2 14:16:56 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 2 Aug 2007 13:16:56 -0500
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
	<764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com>
Message-ID: <9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu>

Neeti,

Keep this on the list please.  I am unable to reproduce this using  
your script with or without using the optional genemap file.  You  
really should upgrade bioperl to 1.5.2 and try the fix first; this is  
something that may have been fixed post-bioperl 1.4.

chris

On Aug 2, 2007, at 12:57 PM, neeti somaiya wrote:

> Waiting for your reply on the exception I had mentioned in my first  
> mail.
>
> Thanks.
>
> ---------- Forwarded message ----------
> From: neeti somaiya < neetisomaiya at gmail.com>
> Date: Aug 2, 2007 11:50 AM
> Subject: Re: [Bioperl-l] URGENT : Problem in OMIM parser
> To: bioperl-l at lists.open-bio.org
>
> Hi,
>
> The script is attached with this mail.
> I am using bioperl-1.4.
>
> Regards,
> Neeti.
>
>
> On 8/1/07, Chris Fields < cjfields at uiuc.edu> wrote:
>
> Neeti,
>
> Only post to one list email address, namely the one I'm responding to
> and the one shown here:
>
> http://bioperl.org/mailman/listinfo/bioperl-l
>
> The others are aliases so you essentially posted three times.  As for
> your question: there was no attached script or any additional
> information (bioperl version would have also been nice), so we can't
> help you until we have something more to work with.
>
> chris
>
> On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:
>
> > I have downloaded the omim.txt file from NCBI ftp site and I am
> > running my
> > attached parser on this file, the parser run stops in between with
> > this :-
> >
> > ------------- EXCEPTION  -------------
> > MSG: a part/organism must be assigned
> > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> > STACK toplevel parse_omim_original.pl:47
> >
> > --------------------------------------
> >
> > What is the reason for this?
> > Can anyone guide me please.
> >
> > --
> > -Neeti
> > Even my blood says, B positive
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>
>
>
> -- 
> -Neeti
> Even my blood says, B positive
>
>
>
> -- 
> -Neeti
> Even my blood says, B positive
> <parse_omim_original.pl>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From torsten.seemann at infotech.monash.edu.au  Thu Aug  2 21:03:36 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Fri, 3 Aug 2007 11:03:36 +1000
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <3579584634amadoz@uv.es>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
	<3579584634amadoz@uv.es>
Message-ID: <a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>

Alicia,

> Hi, thanks for your help and suggestions. I have tried the example code
> of Jay Hannah and it works perfectly. But what I need to save in fasta
> format is the whole sequence in the database that is similar to my query
> sequence.

Unfortunately the hit_string is only that part of the sequence in the
database that was similar enough to your query sequence. The BLAST
report does not have the whole hit sequence in it, only the locally
aligned part. SearchIO can only give you what it can get from the
BLAST report.

You will need to record the IDs of the database sequences you are
interested in, and write extra code to retrieve the WHOLE hit sequence
from your database.
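
A minimal sketch of the first half of that suggestion, assuming a
plain-text BLAST report on disk: walk the report with Bio::SearchIO and
record each hit name once. The helper names unique_in_order and
hit_ids_from_report and the report file name are made up for
illustration; only the Bio::SearchIO calls come from BioPerl.

```perl
use strict;
use warnings;

# Keep the first occurrence of each ID, preserving report order.
sub unique_in_order {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Collect the hit names from a BLAST report (needs BioPerl installed).
sub hit_ids_from_report {
    my ($report_file) = @_;
    require Bio::SearchIO;
    my $in = Bio::SearchIO->new(-format => 'blast', -file => $report_file);
    my @ids;
    while (my $result = $in->next_result) {
        while (my $hit = $result->next_hit) {
            push @ids, $hit->name;
        }
    }
    return unique_in_order(@ids);
}

# Usage (hypothetical file name): perl hit_ids.pl report.bls
if (@ARGV) {
    print "$_\n" for hit_ids_from_report($ARGV[0]);
}
```

The resulting ID list is what you would then feed to whatever lookup
your database supports.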

--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University

From neetisomaiya at gmail.com  Fri Aug  3 01:46:32 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 3 Aug 2007 11:16:32 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
	<764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com>
	<9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu>
Message-ID: <764978cf0708022246v98abed6ue41233f6b27c5674@mail.gmail.com>

Hi,

Thanks a lot.
The exception is not coming after upgrade to bioperl-1.5.2
But the genemap data is still a problem.

You had mentioned that I should take Bio::Phenotype::OMIM::OMIMparser,
specifically from cvs. Where exactly can I get it?

Thanks,
Neeti.

On 8/2/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> Neeti,
>
> Keep this on the list please.  I am unable to reproduce this using
> your script with or without using the optional genemap file.  You
> really should upgrade bioperl to 1.5.2 and try the fix first; this is
> something that may have been fixed post-bioperl 1.4.
>
> chris
>
> On Aug 2, 2007, at 12:57 PM, neeti somaiya wrote:
>
> > Waiting for your reply on the exception I had mentioned in my first
> > mail.
> >
> > Thanks.
> >
> > ---------- Forwarded message ----------
> > From: neeti somaiya < neetisomaiya at gmail.com>
> > Date: Aug 2, 2007 11:50 AM
> > Subject: Re: [Bioperl-l] URGENT : Problem in OMIM parser
> > To: bioperl-l at lists.open-bio.org
> >
> > Hi,
> >
> > The script is attached with this mail.
> > I am using bioperl-1.4.
> >
> > Regards,
> > Neeti.
> >
> >
> > On 8/1/07, Chris Fields < cjfields at uiuc.edu> wrote:
> >
> > Neeti,
> >
> > Only post to one list email address, namely the one I'm responding to
> > and the one shown here:
> >
> > http://bioperl.org/mailman/listinfo/bioperl-l
> >
> > The others are aliases so you essentially posted three times.  As for
> > your question: there was no attached script or any additional
> > information (bioperl version would have also been nice), so we can't
> > help you until we have something more to work with.
> >
> > chris
> >
> > On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:
> >
> > > I have downloaded the omim.txt file from NCBI ftp site and I am
> > > running my
> > > attached parser on this file, the parser run stops in between with
> > > this :-
> > >
> > > ------------- EXCEPTION  -------------
> > > MSG: a part/organism must be assigned
> > > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> > > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> > > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> > > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> > > STACK toplevel parse_omim_original.pl:47
> > >
> > > --------------------------------------
> > >
> > > What is the reason for this?
> > > Can anyone guide me please.
> > >
> > > --
> > > -Neeti
> > > Even my blood says, B positive
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> >
> >
> >
> > --
> > -Neeti
> > Even my blood says, B positive
> >
> >
> >
> > --
> > -Neeti
> > Even my blood says, B positive
> > <parse_omim_original.pl>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


-- 
-Neeti
Even my blood says, B positive

From jay at jays.net  Fri Aug  3 10:23:11 2007
From: jay at jays.net (Jay Hannah)
Date: Fri, 03 Aug 2007 09:23:11 -0500
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>	<3579584634amadoz@uv.es>
	<a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>
Message-ID: <46B33A4F.2010403@jays.net>

Torsten Seemann wrote:
>> Hi, thanks for your help and suggestions. I have tried the example code
>> of Jay Hannah and it works perfectly. But what I need to save in fasta
>> format is the whole sequence in the database that is similar to my query
>> sequence.
>>     
>
> Unfortunately the hit_string is only that part of the sequence in the
> database that was similar enough to your query sequence. The BLAST
> report does not have the whole hit sequence in it, only the locally
> aligned part. SearchIO can only give you what it can get from the
> BLAST report.
>
> You will need to record the IDs of the database sequences you are
> interested in, and write extra code to retrieve the WHOLE hit sequence
> from your database.
>   
This probably won't help, but my (extremely poorly documented) 
"SeqLab.net" project

   http://seqlab.net

is a framework that sits on top of BioPerl. The current cross_blast() 
stuff (http://seqlab.net/pods2html/tutorial.html) does this:

   GenBank -> FASTA -> formatdb -> "stand alone" NCBI BLAST -> reports

When the reports run they have simultaneous access to both the original 
Bio::Seq objects from the GenBank file and the Bio::SearchIO objects 
from the BLAST results, so it can kick out reports that understand the 
relationships between (and details of) the original sequences and HSPs 
simultaneously...

If you get stuck trying to do what Torsten suggests and have questions 
about SeqLab.net you could open a ticket with my group

   http://clab.ist.unomaha.edu/CLAB/index.php/RT

and I'll try to help.

Cheers,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From mbasu at mail.nih.gov  Fri Aug  3 14:55:57 2007
From: mbasu at mail.nih.gov (Malay)
Date: Fri, 03 Aug 2007 14:55:57 -0400
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <46B33A4F.2010403@jays.net>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>	<3579584634amadoz@uv.es>	<a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>
	<46B33A4F.2010403@jays.net>
Message-ID: <46B37A3D.4070606@mail.nih.gov>

Jay Hannah wrote:
> Torsten Seemann wrote:
>>> Hi, thanks for your help and suggestions. I have tried the example code
>>> of Jay Hannah and it works perfectly. But what I need to save in fasta
>>> format is the whole sequence in the database that is similar to my query
>>> sequence.
>>>     
>> Unfortunately the hit_string is only that part of the sequence in the
>> database that was similar enough to your query sequence. The BLAST
>> report does not have the whole hit sequence in it, only the locally
>> aligned part. SearchIO can only give you what it can get from the
>> BLAST report.
>>
>> You will need to record the IDs of the database sequences you are
>> interested in, and write extra code to retrieve the WHOLE hit sequence
>> from your database.

I am not sure whether it has already been suggested or not but you can 
retrieve the full sequence from any blast database using "fastacmd", 
which is part of NCBI toolbox. Parse the "description" string from 
the BLAST report and run:

fastacmd -d <database file> -s <description>

where, the argument of -s can be any unique string for the database.

-Malay
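
For anyone who wants to script this, here is a rough sketch that runs
the fastacmd call above once per ID from Perl. It assumes fastacmd is
on the PATH and the database was built with formatdb; the helper names
fastacmd_args and fetch_fasta are invented for the example. Only the -d
and -s options shown above are used.

```perl
use strict;
use warnings;

# Build the argument list for one fastacmd lookup (no shell quoting needed).
sub fastacmd_args {
    my ($database, $id) = @_;
    return ('fastacmd', '-d', $database, '-s', $id);
}

# Fetch one sequence as FASTA text; returns undef if fastacmd fails.
sub fetch_fasta {
    my ($database, $id) = @_;
    my @cmd = fastacmd_args($database, $id);
    open(my $fh, '-|', @cmd) or return;
    local $/;                      # slurp the whole output at once
    my $fasta = <$fh>;
    close($fh) or return;          # non-zero exit status counts as failure
    return $fasta;
}

# Usage: perl fetch_hits.pl <database file> <id> [<id> ...]
if (@ARGV >= 2) {
    my ($database, @ids) = @ARGV;
    for my $id (@ids) {
        my $fasta = fetch_fasta($database, $id);
        if (defined $fasta) {
            print $fasta;
        }
        else {
            warn "no sequence retrieved for $id\n";
        }
    }
}
```

Passing the command as a list avoids the shell, so IDs containing "|"
(as in gi|12345|...) do not need escaping.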

From cjfields at uiuc.edu  Mon Aug  6 13:49:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 6 Aug 2007 12:49:08 -0500
Subject: [Bioperl-l] Fwd: nonstop repeated output from Remote_blast with xml
References: <1FE846F1-CB20-41FD-929D-8D14E5695B59@uiuc.edu>
Message-ID: <B97BD1F9-05FE-4225-810F-5EA10AB2728B@uiuc.edu>

Wasn't paying attention! Forwarding this to the mail list in case  
anyone wanted the answer...

chris

Begin forwarded message:

> From: Chris Fields <cjfields at uiuc.edu>
> Date: August 6, 2007 12:10:37 PM CDT
> To: gyang at plantbio.uga.edu
> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
> with xml
>
> Guojun,
>
> Sorry about the long wait on this.  At this time RemoteBlast  
> doesn't automatically set the retrieval header to return XML when  
> setting the -reporttype parameter to 'xml' or 'blastxml'.  The  
> default is text output, so you are retrieving regular text BLAST  
> reports instead of XML, hence the reported XML parser failure (BTW,  
> you can see the plain text being returned in the debugging  
> output).  I'll look into a fix for that.
>
> In the meantime, you can do this manually by setting the following  
> key prior to submitting the BLAST run:
>
> $Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML';
>
> When I run your example with the above line added it works fine.   
> As an additional note, the CVS version of Bio::SearchIO::blastxml  
> now supports newer versions of XML::SAX::Expat; the problem there  
> was a bug in XML::SAX::Expat that killed parsing.
>
> Additional rant before I go back to work (you can skip this if  
> needed):  RemoteBlast is one of the most used modules in BioPerl,  
> but it is also the most problematic as NCBI keeps changing things  
> on their end (BLAST text output, prompts when returning RIDs,  
> etc).  It frankly isn't as well-maintained as we would like; this  
> is partly due to plans we have (but unfortunately haven't acted  
> upon) to merge RemoteBlast/StandAloneBlast so they have a similar  
> API and can be used for any BLAST program, including netblast.  If  
> someone wants to take this on at some point then they are more than  
> welcome!
>
> chris
>
> On Aug 3, 2007, at 10:08 AM, Guojun Yang wrote:
>
>> Thanks, Chris,
>> Attached are my script and the query file. I suspected that we  
>> need to add "remove RID... in the code", I tried putting the RID  
>> removal at the end of the parsing code, but it seemed it removed it  
>> even before the output was processed.   I installed  
>> XML::SAX::Expat, the error became "XML::SAX::Expat is no longer  
>> supported...", so I installed ExpatXS, the error message becomes:
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>  no element found at line 4126, column 1, byte 186628 at /usr/lib/ 
>> perl5/site_perl/5.8.3/Bio/SearchIO/blastxml.pm line 304
>>
>>
>> Would you please try the script with the query file with the  
>> following input parameters, to see what happens on your machine (I  
>> want to make sure there is no installation problem on my machine).  
>> The search subroutine is where blast is performed, I did not  
>> include a remove RID there. Thanks again!
>>
>> master:/home/guojun # perl makcgi07.txt
>> Query file name:
>> kiddo.txt
>> Select a function: 1.member;2.RES; 3, long; 4.Anchor; 5.Associator.
>> 1
>> Type in the name of an organism, e.g. Oryza sativa.
>> Oryza sativa
>> Type in the organism to search for RES:
>> Your E_value:
>> 0.001
>> Size limit for ancestor element:
>> 4000
>> Flanking size for retrieved members:
>> 50
>> Tolerance for end mismatch:
>> 0
>>
>>
>>
>> Guojun
>>
>> ----- Original Message -----
>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>> To: gyang at plantbio.uga.edu
>> Sent: Thu, 02 Aug 2007 13:04:59 -0400
>> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
>> with xml
>>
>> Guojun,
>>
>> Make sure to keep this on the mail list for archiving purposes.
>>
>> It could be that the RID is not being removed properly (if it isn't
>> removed then you will repeatedly retrieve your BLAST report). The
>> new error you are seeing may be coming from whatever XML::SAX backend
>> parser is being used (XML::SAX::ExpatXS, XML::SAX::Expat, etc); it
>> doesn't look bioperl-related and there is an eval which catches this
>> stuff in SearchIO::blastxml. Does text parsing work?
>>
>> Could you directly send me your script or add it to a new bug report
>> as an attachment?
>>
>> http://www.bioperl.org/wiki/Bugs
>>
>> chris
>>
>> On Aug 2, 2007, at 11:07 AM, Guojun Yang wrote:
>>
>> > Hi,Chris,
>> > I installed the latest version of bioperl, in addition to the
>> > repeated output problem, there are new problems with parsing:
>> >
>> >
>> > -------------------- WARNING ---------------------
>> > MSG: error in parsing a report:
>> > No close tag marker [Ln: 4126, Col: 0]
>> >
>> > ---------------------------------------------------
>> >
>> > Would you please kindly give me a hint on this,
>> > Thanks a lot,
>> > Guojun
>> >
>> >
>> > ----- Original Message -----
>> > From: Chris Fields [mailto:cjfields at uiuc.edu]
>> > To: gyang at plantbio.uga.edu
>> > Cc: bioperl-l List [mailto:bioperl-l at lists.open-bio.org]
>> > Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast
>> > with xml
>> >
>> >
>> >> Make sure to keep responses on the mail list.
>> >>> You might want to run a full install, just in case. If I remember
>> >> correctly Sendu made some changes a while back in the BLAST- 
>> related
>> >> modules which may be related to this. At the very least install/
>> >> upgrade all modules in Bio::Tools::Run.
>> >>> chris
>> >>> On Jul 31, 2007, at 9:40 AM, Guojun Yang wrote:
>> >>>> Thanks, Chris,
>> >>> But when I replaced the old RemoteBlast.pm with the new one, I  
>> got
>> >>> "can't locate the object method "retrieve_parameter"". Does this
>> >>> mean I need to install something else?
>> >>> Guojun
>> >>>
>> >>> ----- Original Message -----
>> >>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>> >>> To: gyang at plantbio.uga.edu
>> >>> Cc: bioperl-l at bioperl.org
>> >>> Subject: Re: [Bioperl-l] nonstop repeated output from  
>> Remote_blast
>> >>> with xml
>> >>>
>> >>>
>> >>>>> On Jul 30, 2007, at 3:58 PM, Guojun Yang wrote:
>> >>>>>> I am running remoteblast and using readmethod "xml", I noticed
>> >>>>>> that
>> >>>>> it is printing the output repeatedly nonstop. It's like in a  
>> loop.
>> >>>>> Did anybody notice this before? Can anybody help me getting  
>> out of
>> >>>>> this?
>> >>>>> Thanks a lot,
>> >>>>>
>> >>>>>
>> >>>>> Guojun Yang
>> >>>>> University of Georgia
>> >>>>> Not seeing that using bioperl-live; you may need to update
>> >>>> RemoteBlast.pm as this sounds similar to an issue that popped up
>> >>>> earlier in the spring.
>> >>>>> chris
>> >>>>
>> >>> Christopher Fields
>> >> Postdoctoral Researcher
>> >> Lab of Dr. Robert Switzer
>> >> Dept of Biochemistry
>> >> University of Illinois Urbana-Champaign
>> >>>>>
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>>
>>
>> <makcgi07.txt>
>> <kiddo.txt>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
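
To put the two points from this thread together (request XML
explicitly, and remove each RID once its report has been handled), a
sketch along the lines of the RemoteBlast synopsis might look like the
following. The program, database and e-value are placeholders, the
-readmethod value 'xml' is taken from the original question, and the
wrapper name fetch_xml_reports is invented; treat it as an outline, not
as the fix that went into CVS.

```perl
use strict;
use warnings;

sub fetch_xml_reports {
    my ($query_file) = @_;
    require Bio::Tools::Run::RemoteBlast;

    # Ask NCBI for XML. Set this after the module is loaded so the
    # module's own defaults do not overwrite it.
    {
        no warnings 'once';
        $Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML';
    }

    my $factory = Bio::Tools::Run::RemoteBlast->new(
        -prog       => 'blastn',   # placeholder program
        -data       => 'nr',       # placeholder database
        -expect     => '1e-3',     # placeholder e-value
        -readmethod => 'xml',      # value used in the original question
    );
    $factory->submit_blast($query_file);

    my @results;
    while (my @rids = $factory->each_rid) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if (!ref $rc) {
                # Not a parser object: still running, or failed if negative.
                $factory->remove_rid($rid) if $rc < 0;
                sleep 5;
                next;
            }
            # Report is ready: remove the RID so it is not fetched again.
            $factory->remove_rid($rid);
            while (my $result = $rc->next_result) {
                push @results, $result;
            }
        }
    }
    return @results;
}
```

Something like my @results = fetch_xml_reports('kiddo.txt'); would then
return the parsed results once, since every RID leaves the queue as
soon as its report has been read.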


From Alicia.Amadoz at uv.es  Tue Aug  7 04:20:12 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Tue, 7 Aug 2007 10:20:12 +0200 (CEST)
Subject: [Bioperl-l] error using standaloneblast through webserver, part II
Message-ID: <1387114447amadoz@uv.es>

Hi again, i'm trying to run a bioperl script in linux with
standaloneblast from a webserver but i now have another error. It is the
following:

[blastall] WARNING: Unable to open outfile_allseq.nin
[blastall] WARNING: 101: Unable to open outfile_allseq.nin

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: blastall call crashed: 256 /usr/local/blast-2.2.16/bin/blastall -d
 "/outfile_allseq"  -e  10  -i 
/tmp//alicia_2007_07_20/result_search_alicia_12_03_40.fasta  -o 
/tmp//alicia_2007_08_07/101_result_Local_Blast_alicia_09_56_47.out  -p 
blastn

My perl code is the following:

my $blastdatadir = $ARGV[9];    # here the value of the variable is ok

BEGIN { 
	# path where blastall bin is located
	$ENV{PATH} .= ':/usr/local/blast-2.2.16/bin';
	# path where blastall bin is located
	$ENV{BLASTDIR} = '/usr/local/blast-2.2.16/bin';
	# path where formatted local databases are located
	# (here the value is empty)
	$ENV{BLASTDATADIR} = $blastdatadir;
}   

I have tried without BEGIN { } so $ENV var has a correct value for
$blastdatadir but i get the same error. I have checked that formatdb was
done and all the files are correct.

Any idea or help to solve this problem? 

Thanks in advance. Regards,
Alicia
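
On the empty value inside BEGIN: a BEGIN block runs at compile time,
before the assignment to $blastdatadir on the line above it has
executed, so the variable is declared but still undefined there. The
small stand-alone demonstration below shows the ordering; the path is a
hypothetical stand-in for $ARGV[9], and the two "our" variables exist
only to record what each phase saw.

```perl
use strict;
use warnings;

our ($at_compile_time, $at_run_time);

# Hypothetical path; the assignment itself only happens at run time.
my $blastdatadir = '/data/blastdb';

BEGIN {
    # Compile time: $blastdatadir is declared but not yet assigned.
    $at_compile_time = defined $blastdatadir ? $blastdatadir : '(empty)';
}

# Run time: the assignment above has now executed.
$at_run_time = $blastdatadir;
$ENV{BLASTDATADIR} = $blastdatadir;

print "in BEGIN:    $at_compile_time\n";
print "at run time: $at_run_time\n";
```

Setting $ENV{BLASTDATADIR} in ordinary run-time code, after the
variable has been assigned and before the BLAST factory is created,
avoids the problem.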


From mheusel at gmail.com  Tue Aug  7 04:45:33 2007
From: mheusel at gmail.com (Martin Heusel)
Date: Tue, 7 Aug 2007 10:45:33 +0200
Subject: [Bioperl-l] error using standaloneblast through webserver,
	part II
In-Reply-To: <1387114447amadoz@uv.es>
References: <1387114447amadoz@uv.es>
Message-ID: <6127fc200708070145keb750acycce8a43edd0f724d@mail.gmail.com>

> MSG: blastall call crashed: 256 /usr/local/blast-2.2.16/bin/blastall -d
>  "/outfile_allseq"  -e  10  -i

I'm not familiar with all this, but it seems your script tries to
write in the systems root directory /

-d "/outfile_allseq"

that is normally not writable for normal users

is this the problem?

cu

Martin

-- 
+ openid: http://mhe.myopenid.com/
+ gpg   : http://user.cs.tu-berlin.de/~mhe/pub/martin.gpg
+ gpg fp: 4844 71B5 B4E4 3892 69CA  6EA5 6598 61BE 0021 94A2

From Alicia.Amadoz at uv.es  Tue Aug  7 07:08:12 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Tue, 7 Aug 2007 13:08:12 +0200 (CEST)
Subject: [Bioperl-l] error using standaloneblast through webserver,
	part II
In-Reply-To: <1387114447amadoz@uv.es>
References: <1387114447amadoz@uv.es>
Message-ID: <5825345446amadoz@uv.es>

Hi, I thought that it was enough to set $ENV{BLASTDATADIR} and
standaloneblast would find the database. I have changed it, setting the
-database option of params to path_to_database+name_of_database, and it
works ok.

Thanks for your help. Regards,
Alicia
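
For the archive, a sketch of that workaround: build the full database
path and hand it to -database instead of relying on the environment
variable. The directory and the helper names database_path and
make_blast_factory are placeholders; -program and -database are the
StandAloneBlast parameters as I understand them, so check them against
your installed version.

```perl
use strict;
use warnings;
use File::Spec;

# Join the database directory and name into one path for -database.
sub database_path {
    my ($dir, $name) = @_;
    return File::Spec->catfile($dir, $name);
}

# Create the factory with the full path (needs BioPerl installed).
sub make_blast_factory {
    my ($dir, $name) = @_;
    require Bio::Tools::Run::StandAloneBlast;
    return Bio::Tools::Run::StandAloneBlast->new(
        -program  => 'blastn',
        -database => database_path($dir, $name),
    );
}
```

With a data directory of /data/blastdb, blastall is then called with
-d "/data/blastdb/outfile_allseq" rather than -d "/outfile_allseq".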


From jason at bioperl.org  Wed Aug  8 15:16:07 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 8 Aug 2007 14:16:07 -0500
Subject: [Bioperl-l] Fwd: Question regarding Bio::GenBank module
References: <7a93dad10708081148w74dfede3sd05799a651ebcb80@mail.gmail.com>
Message-ID: <24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>

Young -
I'm forwarding to the list for more help.

Begin forwarded message:

> From: "Young Song" <youngcsong at gmail.com>
> Date: August 8, 2007 1:48:29 PM CDT
> To: jason at bioperl.org
> Subject: Question regarding Bio::GenBank module
>
> Hello,
>
>    I am currently located in Vancouver, Canada, and I actually have  
> some
> question based on the Bio::GenBank module for bioperl.  I read in the
> online document for the module (
> http://search.cpan.org/dist/bioperl/Bio/DB/GenBank.pm), that we are  
> not
> supposed to spam the NCBI with multiple requests, which lead me to  
> think
> about the script that I wrote.  I am trying to extract some  
> information
> based on the fasta protein files located in the NCBI's database.  
> The
> script reads each '.faa' (Fasta Protein) file and takes in the  
> 'gi' ID
> for each sequence, and extracts several pieces of information, which
> look like the following output (please note that there are a lot more
> gi's than I am showing
> you right now):
>
> 10954456
> accesstion number: NP_047185.1
> dbsource: GenBank: NC_001911.1
> NP_047185.1
> starting pos. at genomic seq: 1488
> ending pos. at genomic seq: 1991
> strand: +
> description: putative membrane-associated protein
> organism: Buchnera aphidicola
> MERIIEKAIYASRWLMFPVYVGLSFGFILLTLKFFQQIVFIIPDILAMSESGLVLVVLSLIDIALVGGLL 
> VMVMFLGYENFISKMDIQDNEKRLGWMGTMDVNSIKNKVASSIVAISSVHLLRLFMEAEKILDDKIMLCV 
> IIHLTFVLSAFGMAYIDKMSKKKHVLH
> ************************************************
> 10954457
> accesstion number: NP_047186.1
> dbsource: GenBank: NC_001911.1
> NP_047186.1
> starting pos. at genomic seq: 2158
> ending pos. at genomic seq: 2913
> strand: +
> description: putative replication-associated protein
> organism: Buchnera aphidicola
> MPRKNYIYNPKPVFNPPKNKRKISTFICYAMKKASEIDVARSNLNYTLLLIDPKTGNILPRFRRLNEHRA 
> CAMRAIVLAMLYYFDIHSNLVEASIEKLADECGLSTFSDSGNKSITRVSRLINDFLEPMGFVRCKKIKRK 
> FVSNYIPKKIFLTPMFFMLFNISQSKINRYLFKSKKMSQNLKITEKKIFISFSDIKVMSRLDEKSIRKKI 
> LNALINYYTASELTKIGPKGLKKRIDIEYNNLCKLFKKIKK
>
>
>
>   Because I am dealing with a lot of sequences here, I am a little bit
> worried that I may be causing harm to the NCBI server.  I just need to
> know if this is the right approach to take, or if there is another
> solution (I am a little bit confused about what you mean by "multiple
> requests" in the documentation).
> Your reply would be very much appreciated.  Thank you in advance.
>
>   Sincerely,
>
>      Young C. Song

--
Jason Stajich
jason at bioperl.org


From cjfields at uiuc.edu  Wed Aug  8 15:41:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 8 Aug 2007 14:41:34 -0500
Subject: [Bioperl-l] Fwd: Question regarding Bio::GenBank module
In-Reply-To: <24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>
References: <7a93dad10708081148w74dfede3sd05799a651ebcb80@mail.gmail.com>
	<24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>
Message-ID: <FD7D1694-604A-4C8B-AC47-B31F306EA5B0@uiuc.edu>

NCBI eUtils (which Bio::DB::GenBank uses to get sequence data) has a
list of user requirements:

http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements

The most important one is the 3-second wait between requests, but
the module already implements that policy, so there isn't a real issue
unless you deliberately mess with that setting.  NCBI has been known
to block IPs which don't follow that particular rule.  Also, if you
are planning to make hundreds of requests, you should consider running
the script during low-traffic times, as indicated in the above link.
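
For anyone writing their own request loop rather than relying on the module's built-in delay, here is a minimal sketch of the same idea in plain Perl. The helper name fetch_throttled and the stand-in $fake_fetch callback are hypothetical and not part of BioPerl; the 3-second gap is the figure quoted above, and no NCBI call is actually made.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper: call $fetch->($id) for each ID, waiting at least
# $min_gap seconds between the start of consecutive calls.
sub fetch_throttled {
    my ($ids, $fetch, $min_gap) = @_;
    my @results;
    my $last_start;
    for my $id (@$ids) {
        if (defined $last_start) {
            my $wait = $min_gap - (time() - $last_start);
            sleep($wait) if $wait > 0;
        }
        $last_start = time();
        push @results, $fetch->($id);
    }
    return \@results;
}

# Stand-in for a real lookup; replace it with whatever does the request.
my $fake_fetch = sub { my ($gi) = @_; return "record for gi $gi" };

my $records = fetch_throttled([10954456, 10954457], $fake_fetch, 3);
print "$_\n" for @$records;
```

The gap is measured from the start of the previous call, so a slow request does not add extra waiting on top of the time it already took.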

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From gyang at plantbio.uga.edu  Thu Aug  9 15:03:21 2007
From: gyang at plantbio.uga.edu (Guojun Yang)
Date: Thu, 09 Aug 2007 15:03:21 -0400
Subject: [Bioperl-l] standalone blastall call crashed, please help
In-Reply-To: 1FE846F1-CB20-41FD-929D-8D14E5695B59@uiuc.edu
Message-ID: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu>

Hi, Chris,  
Thanks a lot for your efforts. With your help, I am gaining more confidence to fix the CGI code. The remoteblast problem is fixed now, but I am stuck on a local BLAST problem (see the error message and subroutine below). The line starting with * is line 593 in the error message. I set the permissions on all the BLAST folders and files, but it did not help much. The same sequence and database work fine if I run blastall from the command line. I passed the seq object ref $query as the query, and the error message shows "-i /tmp/..."; does this look like an input problem? The subroutine was working before early 2006 (on a different machine), so I am wondering whether this is due to changes in StandAloneBlast.pm.  Best, Guojun  
   
I set the blast env variables:  
   
BEGIN {$ENV{BLASTDIR} = '/usr/blast-2.2.10/bin'; }
BEGIN {$ENV{BLASTDB}='/usr/blast-2.2.10/data';}
BEGIN {$ENV{BLASTMAT}='/usr/blast-2.2.10/data';}
$PROGRAMDIR = $ENV{'BLASTDIR'} || '';
......  
   
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: blastall call crashed: -1 /usr/blast-2.2.10/bin/blastall -d  "/usr/blast-2.2.10/data/swissprot"  -e  0.001  -i  /tmp/3cjvQyodxg  -o  /tmp/4qSSO16EZP  -p  blastx   
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.3/Bio/Root/Root.pm:359
STACK: Bio::Tools::Run::StandAloneBlast::_runblast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:813
STACK: Bio::Tools::Run::StandAloneBlast::_generic_local_blast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:760
STACK: Bio::Tools::Run::StandAloneBlast::blastall /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:570
STACK: main::ancestor makcgi07.txt:593
STACK: makcgi07.txt:208
  

sub ancestor {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  

my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"test");
print $query->seq();
my $len=$query->length();
my $long_name=$_[1];
my $long_start=$_[2];
my $long_end=$_[3];
@db=('swissprot');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
                                                        -database => "$db",
                                                        -e => 1e-3,
                                                        );
*    my $blast_report = $factory->blastall($query);
    while (my $result = $blast_report->next_result) {
            while( my $hit = $result->next_hit()) {
                $hit_name=$hit->name;
                $hit_name =~ /\S+[|](\S+)[.]\d+[|].*/;
                $name=$1;
                $desc = $hit->description();
                if ($desc =~ /.*{|\btransposon\b|\btransposase\b|}.*/i){
                     $AN=0;
                     $replica=0;
                     while ($ancestor_name[$AN]) {
                        $replica=1 if (($ancestor_name[$AN] eq $long_name) && ($hitname[$AN] eq $name));
                         $AN+=1;
                     }
                        if ($replica==0) {
                        push @ancestor_name, $long_name;
                        push @ancestor_start, $long_start;
                        push @ancestor_end, $long_end;
                        push @desc, $desc;
                        push @hitname,$name;
                        }
                }
               }
              }}
return @ancestor_name,@ancestor_start,@ancestor_end,@desc;
}
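
A note on reading the "-1" in the exception above, assuming it is the value of $? left by Perl's system(): per the Perl documentation, -1 means the program could not be started (or the wait for it failed), and $! then holds the reason. Any other value packs the exit code in the high byte and the signal number in the low bits. The helper name describe_status below is hypothetical; the sketch uses the running perl ($^X) as the child, so it does not depend on blastall being installed.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the values of $? and $! captured right after
# system() into a readable explanation.
sub describe_status {
    my ($status, $os_error) = @_;
    if ($status == -1) {
        return "could not start or wait for the program: $os_error";
    }
    if ($status & 127) {
        return sprintf "killed by signal %d", $status & 127;
    }
    return sprintf "exited with code %d", $status >> 8;
}

# Demo: run a child perl that exits with code 3, then decode $?.
system($^X, '-e', 'exit 3');
print describe_status($?, "$!"), "\n";
```

Printing both values separately like this makes it easier to tell a program that ran and failed from one that never started.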

From harijay at gmail.com  Thu Aug  9 17:47:50 2007
From: harijay at gmail.com (hari jayaram)
Date: Thu, 9 Aug 2007 17:47:50 -0400
Subject: [Bioperl-l] newbie wants install help
Message-ID: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>

Hi, I am trying to install BioPerl as a non-root user since I don't have root
access on the machine.

I was following the instructions given on the wiki at
http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix
I started from scratch using perl version v5.8.5 and used cpan to install
the BioPerl prerequisites bundle, Bundle::BioPerl, since I thought it
was needed. Everything worked just fine.
I could use cpan as a non-root user by following the instructions given at
http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html

But when I try to install BioPerl using the instructions for non-root users, I get
an error when it builds Module::Build because I am not root.
I get the same Module::Build error when I try to install without CPAN, using
the command-line script perl Build.PL with the --install_base option as given on the
wiki.

Is there a way out?

Thanks for your help in advance
harijay
Brandeis University


Installing /usr/share/man/man3/Module::Build::Platform::VMS.3pm
Installing /usr/share/man/man3/Module::Build::Base.3pm
Installing /usr/share/man/man3/Module::Build::Authoring.3pm
Installing /usr/share/man/man3/Module::Build::Compat.3pm
mkdir /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi/auto/Module:
Permission denied at /usr/lib/perl5/5.8.5/ExtUtils/Install.pm line 207
Installing /usr/bin/config_data
make: *** [install] Error 255
  /usr/bin/make install  -- NOT OK
    You may have to su to root to install the package
Couldn't install Module::Build, giving up.
make: *** No targets specified and no makefile found.  Stop.
  /usr/bin/make  -- NOT OK
Running make test
  Can't test without successful make
Running make install
  make had returned bad status, install seems impossible

From bix at sendu.me.uk  Thu Aug  9 18:23:24 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 09 Aug 2007 23:23:24 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
Message-ID: <46BB93DC.9010608@sendu.me.uk>

hari jayaram wrote:
> Hi I am trying to install bioperl as a non root user since I dont have root
> access on the machine.
> 
> I was following the instructions as given on the wiki at
> http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix
> I started from scratch using perl version v5.8.5 and used cpan to install
> the bioperl module prerequisites bundle Bundle::BioPerl since I thought it
> was needed. Everything worked just fine
> I could use cpan as a non root user following instructions given at
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html
> 
> But when I try to install bioperl using the instructions for non-root I get
> an error when I build Module::Build because I am not root.
> Iget the same Module::Build error when I try to install without CPAN using
> command line script perl Build.PL --install_base option as given on the
> wiki.

Follow the cpan instructions you found to install as non-root:

Bundle::CPAN

Failing that, you require at least:
Module::Build

Failing that:
http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#INSTALLING_BIOPERL_MODULES_THE_HARD_WAY
(it's actually the easiest way, go figure)

From bix at sendu.me.uk  Fri Aug 10 03:41:29 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 10 Aug 2007 08:41:29 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>	
	<46BB93DC.9010608@sendu.me.uk>
	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
Message-ID: <46BC16A9.7090709@sendu.me.uk>

hari jayaram wrote:
> Hi Sendu ,

Hi, please post back to the list as well, so others can benefit.


> Well, after going through a few attempts at installing Bundle::CPAN I 
> gave up.
> It always had weird timeout issues, and kept re-installing everything 
> on restarting the CPAN shell.
> After a while I thought it did complete, since it returned me to the shell.
> 
> I tried the CPAN install of bioperl at that point.
> 
> And bingo, I got booted out at the exact same point, when the Bioperl 
> install tried to re-install(?) Module::Build, which failed as non-root.

Did you follow steps 7 and 8 of 
http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html ?

If you managed to install Bundle::CPAN, when you now run 'cpan' it 
should start up and tell you its version number, which should be v1.9102 
or higher. If it's lower, you didn't manage to install the latest CPAN, 
or you haven't managed to tell Perl where your newly installed modules are.
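
The same check can be done without starting the cpan shell. This is only a sketch: the helper name version_at_least is hypothetical, and it relies on CPAN.pm setting $CPAN::VERSION when loaded. Printing the %INC entry shows which copy of CPAN.pm Perl actually picked up, which helps tell the two failure cases above apart.

```perl
use strict;
use warnings;

# Hypothetical helper: compare an installed version string against a
# minimum, ignoring the underscore used in developer releases.
sub version_at_least {
    my ($have, $min) = @_;
    return 0 unless defined $have;
    (my $num = $have) =~ tr/_//d;
    return $num >= $min ? 1 : 0;
}

# Load CPAN.pm from wherever Perl finds it first in @INC.
my $have = eval { require CPAN; $CPAN::VERSION };
if (defined $have) {
    print "CPAN.pm $have loaded from $INC{'CPAN.pm'}\n";
    print version_at_least($have, 1.9102)
        ? "Recent enough.\n"
        : "Older than 1.9102: check PERL5LIB or the upgrade.\n";
} else {
    print "CPAN.pm could not be loaded: $@";
}
```

If the path printed is under the system perl directory rather than your home directory, Perl is not looking in your private library location first.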


> I guess for all future modules I will adopt option 3 that you detailed, 
> i.e. just have the modules sitting somewhere and use them from there.
> 
> But I am still interested in getting it done right via CPAN.

From n.haigh at sheffield.ac.uk  Fri Aug 10 06:14:06 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 10 Aug 2007 11:14:06 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <46BC16A9.7090709@sendu.me.uk>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>		<46BB93DC.9010608@sendu.me.uk>	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
	<46BC16A9.7090709@sendu.me.uk>
Message-ID: <46BC3A6E.80302@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

It will probably also help, if you post the commands you have run and
any output (truncated if it's really long), then we can follow what you
have tried and make some better suggestions.

Cheers
Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGvDpuczuW2jkwy2gRAjFjAJ0eG90cMfHrrIh7LbKWx1JN94kbXgCdGSbi
tMjQrZ/8EPc0wLiNAhYTr4Y=
=kXZ2
-----END PGP SIGNATURE-----

From mbasu at mail.nih.gov  Fri Aug 10 11:25:35 2007
From: mbasu at mail.nih.gov (Malay)
Date: Fri, 10 Aug 2007 11:25:35 -0400
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
Message-ID: <46BC836F.7010906@mail.nih.gov>

hari jayaram wrote:
> Hi I am trying to install bioperl as a non root user since I dont have root
> access on the machine.
> 
> I was following the instructions as given on the wiki at
> http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix
> I started from scratch using perl version v5.8.5 and used cpan to install
> the bioperl module prerequisites bundle Bundle::BioPerl since I thought it
> was needed. Everything worked just fine
> I could use cpan as a non root user following instructions given at
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html
> 
> But when I try to install bioperl using the instructions for non-root I get
> an error when I build Module::Build because I am not root.
> Iget the same Module::Build error when I try to install without CPAN using
> command line script perl Build.PL --install_base option as given on the
> wiki.
> 
> Is there a way out
> 
> Thanks for your help in advance
> harijay
> Brandeis University
> 

This is related to your situation and broadly applicable to all Perl users 
without root access. In my own experience, the best way 
to handle it is to use your own Perl, if you are a dedicated 
Perl developer. Just compile and install your own perl in 
any directory of your choice, put its "bin" directory at the front of your 
PATH, and off you go. The advantages are severalfold. First, you get a 
very optimized, fast perl; the sysadmin might have just installed a 
run-of-the-mill binary perl. Second, you get the freedom to 
install the very latest updates of all the modules; the sysadmins may 
be too busy to update perl frequently. Third, a very common problem 
with production machines is that they strictly follow the perl 
installation instructions and avoid threaded perl, which clips your wings, 
particularly when almost all machines contain multiple processors.

The drawback is that scripts with "/usr/bin/perl" in the shebang 
line will not find your perl. If you follow the perl way of installing 
any script, it will take care of it. When you develop, use the more 
portable form:

#!/usr/bin/env perl
BEGIN { $^W = 1 } # Switch on compile-time warnings (like -w)

All the best,

Malay


-- 
Malay K Basu
www.malaybasu.net

From gyang at plantbio.uga.edu  Fri Aug 10 11:23:36 2007
From: gyang at plantbio.uga.edu (Guojun Yang)
Date: Fri, 10 Aug 2007 11:23:36 -0400
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
 from StandAloneBlast
In-Reply-To: 20070809190321.191d0d4a@dogwood.plantbio.uga.edu
Message-ID: <20070810152336.898c3979@dogwood.plantbio.uga.edu>

Hi, Chris,  
Interestingly, I found the message in bioperl-l from Matthew Laird 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast DOES run.  If one comments out this line in StandAloneBlast.pm, the execution succeeds perfectly fine". It seemed to be mysterious when I uncommented the " $self->throw("$executable call crashed: $? $! $commandstring\n") unless ($status==0) ;" line, the blastall runs. The only difference from what Matthew saw is that, when I did not uncomment the line, blastall DID NOT run.
Thanks,  
Guojun  

From cjfields at uiuc.edu  Fri Aug 10 12:17:38 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 10 Aug 2007 11:17:38 -0500
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
	from StandAloneBlast
In-Reply-To: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
References: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
Message-ID: <56186844-3CB9-4968-B16F-FD5EE72865A2@uiuc.edu>

This should be filed as a bug if possible; could you do that?

http://www.bioperl.org/wiki/Bugs

Suggestions have been made many times previously that
StandAloneBlast, RemoteBlast, etc. be combined to use a common API,
incorporate other BLAST implementations (e.g. WU-BLAST, NCBI's
netblast, etc.), and maybe utilize other cross-platform compatible
means of running programs and passing off reports to parsers.  In
fact, Jason, Roger Hall, Torsten, and I discussed tentative plans for
pluggable BLAST wrappers:

http://www.bioperl.org/wiki/Module:Bio::Tools::Run::RemoteBlast

They have never been acted upon, though.  If I get time towards the
end of fall and manage to finish up some other projects I may try
taking this on, maybe using the wiki to track progress.

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From harijay at gmail.com  Fri Aug 10 13:09:32 2007
From: harijay at gmail.com (hari jayaram)
Date: Fri, 10 Aug 2007 13:09:32 -0400
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <46BC16A9.7090709@sendu.me.uk>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
	<46BB93DC.9010608@sendu.me.uk>
	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
	<46BC16A9.7090709@sendu.me.uk>
Message-ID: <aad3caa30708101009k4734fe45i1dcd29a5e20af834@mail.gmail.com>

Hey all,
Thanks for your help. It's working really well now.

It turns out I had not set my PERL5LIB environment variable correctly, so Perl
was not finding the installed modules (thanks, Sendu).

So the steps I followed were:
1) Install CPAN as myself, as detailed at
http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html
The important part is the line that tells CPAN what prefix to use for all module
installs:
PREFIX=~/perl5lib/ LIB=~/perl5lib/lib INSTALLMAN1DIR=~/perl5lib/man1
INSTALLMAN3DIR=~/perl5lib/man3

2) Set PERL5LIB to the lib subdirectory (~/perl5lib/lib, and not just
~/perl5lib) in the shell. I use csh, so I edited .cshrc:
setenv PERL5LIB /home/hari/perl5lib/lib
setenv MANPATH ${MANPATH}:/home/hari/perl5lib

3) Updated the system CPAN to the latest version. This worked very well once
the perl5lib was set up, only it took a while and sometimes stalled with
messages like "done 31/34". But a CTRL-C got it going again.

4) Made sure I was using the new CPAN v1.9102

5) Installed BioPerl with the command
install S/SE/SENDU/bioperl-1.5.2_102.tar.gz

And I was good to go.
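
A quick way to confirm that step 2 took effect is to ask Perl where it would load a module from. This is a minimal sketch; the helper name locate_module is hypothetical, and the two module names in the loop are just examples from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the file Perl would load for $module when
# searching the given directories in order, or nothing if it is absent.
sub locate_module {
    my ($module, @search_path) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    for my $dir (@search_path) {
        next if ref $dir;    # skip code-ref hooks that can appear in @INC
        return "$dir/$file" if -f "$dir/$file";
    }
    return;
}

print "PERL5LIB = ",
    (defined $ENV{PERL5LIB} ? $ENV{PERL5LIB} : '(not set)'), "\n";

for my $module (qw(Module::Build Bio::Seq)) {
    my $path = locate_module($module, @INC);
    printf "%-15s %s\n", $module, defined $path ? $path : 'NOT FOUND';
}
```

Directories from PERL5LIB are placed ahead of the system directories in @INC, so a module installed under the private lib directory should be reported from there.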

I am thinking I will screencast this process for everyone's benefit and put
it up on bioscreencast.com, if that will be useful for others.
Thanks to everyone on the group. Now the journey begins.

Hari Jayaram


From torsten.seemann at infotech.monash.edu.au  Fri Aug 10 17:48:56 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sat, 11 Aug 2007 07:48:56 +1000
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
	from StandAloneBlast
In-Reply-To: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
References: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu>
	<20070810152336.898c3979@dogwood.plantbio.uga.edu>
Message-ID: <a79f6a4b0708101448x421736c1m6f3f5ff6d851a68c@mail.gmail.com>

> Interestingly, I found the message in bioperl-l from Matthew Laird 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast DOES run.  If one comments out this line in StandAloneBlast.pm, the execution succeeds perfectly fine". It seemed to be mysterious when I uncommented the " $self->throw("$executable call crashed: $? $! $commandstring\n") unless ($status==0) ;" line, the blastall runs. The only difference from what Matthew saw is that, when I did not uncomment the line, blastall DID NOT run.

Yes, Matthew is one of the authors of PSORTB and I spent a bit of time
last year trying to fix this problem (unsuccessfully). The PSORTB docs
http://www.psort.org/downloads/index.html
explain how to get around this problem just as Guojun describes. I use
a custom BioPerl installation just for PSORTB!

 I was under the impression it was already filed as a bug, but my
searching indicates this is not so.
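
For anyone filing the bug: the quoted throw prints $? as a raw number, and it
is easier to read once split into exit code and signal as described in
perlvar. The sketch below is not the StandAloneBlast code; decode_status is a
made-up helper, and a one-line perl command stands in for the blastall call.

```perl
use strict;
use warnings;

# Hypothetical helper: split a wait status (the value of $? after system())
# into exit code, signal number and core-dump flag.
sub decode_status {
    my ($status) = @_;
    return { failed_to_start => 1 } if $status == -1;
    return {
        exit_code => $status >> 8,
        signal    => $status & 127,
        core_dump => ($status & 128) ? 1 : 0,
    };
}

# Stand-in for the blastall call: run this same perl and make it exit with 3.
my $status = system($^X, '-e', 'exit 3');
my $info   = decode_status($status);
printf "exit code %d, signal %d\n", $info->{exit_code}, $info->{signal};
```

A non-zero exit code with signal 0 would point at blastall itself returning
an error, whereas a non-zero signal would suggest the process was killed.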

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University

From cjfields at uiuc.edu  Fri Aug 10 18:04:20 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 10 Aug 2007 17:04:20 -0500
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
	from StandAloneBlast
In-Reply-To: <a79f6a4b0708101448x421736c1m6f3f5ff6d851a68c@mail.gmail.com>
References: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu>
	<20070810152336.898c3979@dogwood.plantbio.uga.edu>
	<a79f6a4b0708101448x421736c1m6f3f5ff6d851a68c@mail.gmail.com>
Message-ID: <41A08079-6EEC-4B62-8104-C41E70C03083@uiuc.edu>


On Aug 10, 2007, at 4:48 PM, Torsten Seemann wrote:

>> Interestingly, I found the message in bioperl-l from Matthew Laird  
>> 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast  
>> DOES run.  If one comments out this line in StandAloneBlast.pm,  
>> the execution succeeds perfectly fine". It seemed to be mysterious  
>> when I uncommented the " $self->throw("$executable call crashed:  
>> $? $! $commandstring\n") unless ($status==0) ;" line, the blastall  
>> runs. The only difference from what Matthew saw is that, when I  
>> did not uncomment the line, blastall DID NOT run.
>
> Yes, Matthew is one of the authors of PSORTB and I spent a bit of time
> last year trying to fix this problem (unsuccessfully). The PSORTB docs
> http://www.psort.org/downloads/index.html
> explain how to get around this problem just as Guojun describes. I use
> a custom BioPerl installation just for PSORTB!
>
>  I was under the impression it was already filed as a bug, but my
> searching indicates this is not so.
>
> -- 
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Might be wise to go ahead and add it to bugzilla so we can track it,  
along with the workaround.

chris

From neetisomaiya at gmail.com  Mon Aug 13 06:29:39 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 13 Aug 2007 15:59:39 +0530
Subject: [Bioperl-l] Homologene parser?
Message-ID: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>

Hi,

Does anyone know of any Homologene parser, if available?
Please let me know.

Thanks and Regards,
Neeti.


-- 
-Neeti
Even my blood says, B positive

From shameer at ncbs.res.in  Mon Aug 13 07:07:45 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 16:37:45 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and add
 direction to SeqFeature
In-Reply-To: <6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
Message-ID: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>

Dear All,

I am generating images based on transcription factor binding site data
using the Bio::Graphics module.
I created my images using the program version-2
[http://stein.cshl.org/genome_informatics/BioGraphics/] (courtesy: L.
Stein). I am attaching one of the images to this mail.

I need to make 3 changes to this image:

1. to color the 'scale'
Color the scale in two different colors, i.e. from start 1.0k - color blue,
from 101 - till end of the scale green (I thoroughly checked the
Bio::Graphics documentation; I couldn't find an option to do this)

2. to sort the Transcription factors based on the z_score

3. to give forward/reverse [> or < ]direction for the black boxes

I would appreciate it if anyone could give me some clues/links to accomplish
this :).
Thanks in advance,
Shameer

-- 
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in
-------------- next part --------------
A non-text attachment was scrubbed...
Name: TF_top3.png
Type: image/png
Size: 2188 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070813/6a4423bd/attachment.png 

From bix at sendu.me.uk  Mon Aug 13 09:11:50 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 14:11:50 +0100
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>	<1178028249.2644.13.camel@localhost.localdomain>	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
Message-ID: <46C05896.1010002@sendu.me.uk>

Shameer Khadar wrote:
> Dear All,
> 
> I am generating images based on transcription factor binding site data
> using the Bio::Graphics module.
> I created my images using the program version-2
> [http://stein.cshl.org/genome_informatics/BioGraphics/] (courtesy: L.
> Stein). I am attaching one of the images to this mail.
> 
> I need to make 3 changes to this image:
> 
> 1. to color the 'scale'
> Color the scale in two different colors, i.e. from start 1.0k - color blue,
> from 101 - till end of the scale green (I thoroughly checked the
> Bio::Graphics documentation; I couldn't find an option to do this)

The scale is just a scale and shouldn't need colouring. You can do what 
you want by having a blue 'upstream' feature and a green 'gene' feature 
in the first row.


> 2. to sort the Transcription factors based on the z_score

I don't know Bio::Graphics well enough, but am interested in the answer...


> 3. to give forward/reverse [> or < ]direction for the black boxes

Presumably you just change the glyph type of your binding sites to 
something that shows direction, like 'processed_transcript'. Someone 
else may have a more appropriate suggestion.

However, do your binding sites really have a direction? That is, do you 
really know which strand your transcription factor bound to?


From cjfields at uiuc.edu  Mon Aug 13 10:39:11 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 09:39:11 -0500
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
	add direction to SeqFeature
In-Reply-To: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
Message-ID: <871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>


On Aug 13, 2007, at 6:07 AM, Shameer Khadar wrote:

> Dear All,
>
> I am generating images based on transcription factor binding site data
> using the Bio::Graphics module.
> I created my images using the program version-2
> [http://stein.cshl.org/genome_informatics/BioGraphics/] (courtesy: L.
> Stein). I am attaching one of the images to this mail.
>
> I need to make 3 changes to this image:
>
> 1. to color the 'scale'
> Color the scale in two different colors, i.e. from start 1.0k - color blue,
> from 101 - till end of the scale green (I thoroughly checked the
> Bio::Graphics documentation; I couldn't find an option to do this)

Much of the documentation you need is available via 'perldoc  
Bio::Graphics::Panel' and the various Bio::Graphics::Glyph classes.   
The above may be possible using two seqfeatures instead of one or  
maybe a split location with a callback (not sure, haven't tried  
either, mileage may vary, batteries not included, warranty void if  
packaging is opened, etc).  Might be worth checking out the POD for  
the arrow glyph to see what's possible.

> 2. to sort the Transcription factors based on the z_score

In Bio::Graphics::Panel POD under 'Glyph Options', there is  
documentation for 'sort_order' which accepts callbacks.  According to  
the docs you would basically do something like the following (the  
prototype is required; note the score):

   -sort_order => sub ($$) {
     my ($glyph1,$glyph2) = @_;
     my $a = $glyph1->feature;
     my $b = $glyph2->feature;
     ( $b->score/log($b->length)
           <=>
       $a->score/log($a->length) )
           ||
     ( $a->start <=> $b->start )
   }

Again, haven't tried.
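
The comparator itself can be tried without Bio::Graphics. The following is a
sketch under two assumptions: that the z_score has been stored as the
feature's score, and that the glyph hands back the feature via ->feature as
in the POD snippet above. Demo::Feature and Demo::Glyph are made-up stand-ins,
and the TF_A/TF_B/TF_C values are invented for the demonstration.

```perl
use strict;
use warnings;

# Minimal stand-ins so the comparator can run without Bio::Graphics.
{
    package Demo::Feature;
    sub new   { my ($class, %arg) = @_; return bless {%arg}, $class }
    sub name  { $_[0]{name} }
    sub score { $_[0]{score} }
    sub start { $_[0]{start} }
}
{
    package Demo::Glyph;
    sub new     { my ($class, $feature) = @_; return bless { feature => $feature }, $class }
    sub feature { $_[0]{feature} }
}

# Same shape as the POD example: highest score first, ties broken by start.
my $by_z_score = sub ($$) {
    my ($glyph1, $glyph2) = @_;
    my ($f1, $f2) = ($glyph1->feature, $glyph2->feature);
    return ( $f2->score <=> $f1->score ) || ( $f1->start <=> $f2->start );
};

my @glyphs = map { Demo::Glyph->new(Demo::Feature->new(%$_)) } (
    { name => 'TF_A', score => 2.1, start => 40 },
    { name => 'TF_B', score => 5.7, start => 10 },
    { name => 'TF_C', score => 5.7, start =>  5 },
);

my @sorted = sort { $by_z_score->($a, $b) } @glyphs;
print join(' ', map { $_->feature->name } @sorted), "\n";   # TF_C TF_B TF_A
```

In a real script the anonymous sub would be passed as the value of
-sort_order when adding the track; whether the z_score ends up in ->score
depends on how the features were built.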

> 3. to give forward/reverse [> or < ]direction for the black boxes

I think you first need to ensure the glyph will accept strandedness,  
though I think most do.  Then you would set either the 'strand_arrow'  
or 'stranded' option to 1 (they are synonyms).  Again, see  
Bio::Graphics::Panel POD under Glyph Options, specifically the  
parameter 'stranded' or 'strand_arrow'.

> I would appreciate it if anyone could give me some clues/links to
> accomplish
> this :).
> thanks in advance ,
> Shameer

No problem!

chris

> -- 
> Shameer Khadar
> Lab (# 25) The Computational Biology Group
> National Centre for Biological Sciences (TIFR)
> GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
> T - 91-080-23666001 EXT - 6251
> W - http://www.ncbs.res.in
> <TF_top3.png>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From shameer at ncbs.res.in  Mon Aug 13 10:47:35 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 20:17:35 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <46C05896.1010002@sendu.me.uk>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
Message-ID: <59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>

Dear Sendu,

Thanks for your reply.

>> I need to make 3 changes to this image
>>
>> 1. to color the 'scale'
>> Color the scale in two different colors, i.e. from start 1.0k - color blue,
>> from 101 - till end of the scale green (I thoroughly checked the
>> Bio::Graphics documentation; I couldn't find an option to do this)
>
> The scale is just a scale and shouldn't need colouring. You can do what
> you want by having a blue 'upstream' feature and a green 'gene' feature
> in the first row.
Thanks for the point: 'The scale is just a scale...'.
But my idea is to divide the scale into three parts to differentiate
between the 100bp upstream region, the UTR and the gene start site. From the
starting point of the scale till 0k is the 100bp upstream region. From 0k till
the end of the current_scale is the UTR, and from the end of the scale the gene
starts. Since this is a bit tough to distinguish, we thought of this coloring
option. Adding an extra track is an alternative (I tried to convince our
experimental team by adding an extra track, but they want it this way :(..)

>
>> 2. to sort the Transcription factors based on the z_score
> I don't know Bio::Graphics well enough, but am interested in the answer...
>
It should be possible, since a sort_order option is available. I tried it a
couple of times but it is not working.

>
>> 3. to give forward/reverse [> or < ]direction for the black boxes
>
> Presumably you just change the glyph type of your binding sites to
> something that shows direction, like 'processed_transcript'. Someone
> else may have a more appropriate suggestion.
Thanks, I will look in to it.

>
> However, do your binding sites really have a direction? That is, do you
> really know which strand your transcription factor bound to?
Yes, we collated this information from various experimental datasets.

-- 
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From bix at sendu.me.uk  Mon Aug 13 11:01:43 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 16:01:43 +0100
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
Message-ID: <46C07257.1000308@sendu.me.uk>

Shameer Khadar wrote:
>> However, do your binding sites really have a direction? That is, do you
>> really know which strand your transcription factor bound to?
 >
> Yes, these info we collated from various experimental datasets.

Well, those datasets I'd like to see... What I was getting at is the 
strand probably isn't known at the experimental level, but to describe 
the site a strand has to be arbitrarily picked so you can write the 
sequence of the site down as a single string. It's probably the case that 
the strand information you have is just the way it happened to be 
reported in the literature and has no biological meaning.
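
To illustrate the "single string" point: the same double-stranded site can be
written from either strand, and the two spellings are reverse complements of
each other. A minimal sketch follows; revcomp is a helper written for this
example and the two site strings are invented, not taken from any dataset.

```perl
use strict;
use warnings;

# Reverse-complement a DNA string (plain A/C/G/T only, case preserved).
sub revcomp {
    my ($seq) = @_;
    my $rc = scalar reverse $seq;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

for my $site (qw(TGACGTCA GGGACTTTCC)) {
    my $other = revcomp($site);
    printf "%-10s  other strand: %-10s  %s\n", $site, $other,
        $site eq $other ? 'palindromic, strand is arbitrary' : 'spelling depends on strand';
}
```

For a palindromic site the reported strand carries no information at all; for
the others it only records which spelling the source happened to use, unless
the experiment itself was strand-specific.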


From shameer at ncbs.res.in  Mon Aug 13 11:16:33 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 20:46:33 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>
Message-ID: <42833.192.168.1.1.1187018193.squirrel@mail.ncbs.res.in>

Chris,

Thanks for your detailed reply.
I will read up on the docs and try different options using your code snippets
as a starting point. I will get back to the list with my results.

Thanks
-- 
Shameer

>
> On Aug 13, 2007, at 6:07 AM, Shameer Khadar wrote:
>
>> Dear All,
>>
>> I am generating images based on transcription factor binding site data
>> using the Bio::Graphics module.
>> I created my images using the program version-2
>> [http://stein.cshl.org/genome_informatics/BioGraphics/] (courtesy: L.
>> Stein). I am attaching one of the images to this mail.
>>
>> I need to make 3 changes to this image:
>>
>> 1. to color the 'scale'
>> Color the scale in two different colors, i.e. from start 1.0k - color blue,
>> from 101 - till end of the scale green (I thoroughly checked the
>> Bio::Graphics documentation; I couldn't find an option to do this)
>
> Much of the documentation you need is available via 'perldoc
> Bio::Graphics::Panel' and the various Bio::Graphics::Glyph classes.
> The above may be possible using two seqfeatures instead of one or
> maybe a split location with a callback (not sure, haven't tried
> either, mileage may vary, batteries not included, warranty void if
> packaging is opened, etc).  Might be worth checking out the POD for
> the arrow glyph to see what's possible.
>
>> 2. to sort the Transcription factors based on the z_score
>
> In Bio::Graphics::Panel POD under 'Glyph Options', there is
> documentation for 'sort_order' which accepts callbacks.  According to
> the docs you would basically do something like the following (the
> prototype is required; note the score):
>
>    -sort_order => sub ($$) {
>      my ($glyph1,$glyph2) = @_;
>      my $a = $glyph1->feature;
>      my $b = $glyph2->feature;
>      ( $b->score/log($b->length)
>            <=>
>        $a->score/log($a->length) )
>            ||
>      ( $a->start <=> $b->start )
>    }
>
> Again, haven't tried.
>
>> 3. to give forward/reverse [> or < ]direction for the black boxes
>
> I think you first need to ensure the glyph will accept strandedness,
> though I think most do.  Then you would set either the 'strand_arrow'
> or 'stranded' option to 1 (they are synonyms).  Again, see
> Bio::Graphics::Panel POD under Glyph Options, specifically the
> parameter 'stranded' or 'strand_arrow'.
>


-- 
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From bix at sendu.me.uk  Mon Aug 13 11:47:10 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 16:47:10 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708101009k4734fe45i1dcd29a5e20af834@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>	
	<46BB93DC.9010608@sendu.me.uk>	
	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>	
	<46BC16A9.7090709@sendu.me.uk>
	<aad3caa30708101009k4734fe45i1dcd29a5e20af834@mail.gmail.com>
Message-ID: <46C07CFE.7020105@sendu.me.uk>

hari jayaram wrote:
> Hey all ,
> Thanks for your help. Its working real well now.
[snip]
> I am thinking I will screencast this process for everyone's benefit and 
> put it up on bioscreencast.com <http://bioscreencast.com> . If that will 
> be useful for others.

I'm certain it will. That's a very interesting website. Thanks for 
taking the time, and I hope you find Bioperl useful.

From cjfields at uiuc.edu  Mon Aug 13 12:24:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 11:24:15 -0500
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
	add direction to SeqFeature
In-Reply-To: <46C07257.1000308@sendu.me.uk>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
Message-ID: <A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>


On Aug 13, 2007, at 10:01 AM, Sendu Bala wrote:

> Shameer Khadar wrote:
>>> However, do your binding sites really have a direction? That is,  
>>> do you
>>> really know which strand your transcription factor bound to?
>>
>> Yes, these info we collated from various experimental datasets.
>
> Well, those datasets I'd like to see... What I was getting at is the
> strand probably isn't known at the experimental level, but to describe
> the site a strand has to be arbitrarily picked so you can write the
> sequence of the site down as a single string. It's probably the case  
> that
> the strand information you have is just the way it happened to be
> reported in the literature and has no biological meaning.

It's subjective.  I can think of several cases where strandedness  
does matter and has meaning.  If the motif is related to how the gene  
is transcribed or post-transcriptionally regulated, for instance;  
elements which indicate start of transcription (-10/-35 or any sigma- 
factor-related promoter element in prokaryotes), end of transcription  
(poly-A signal, transcription terminators), modulation of translation  
(SECIS, IRES), or conserved DNA motifs which are transcribed prior to  
regulation (RNA-binding proteins like IRE).

chris

From amacgregor at ccg.murdoch.edu.au  Mon Aug 13 20:52:10 2007
From: amacgregor at ccg.murdoch.edu.au (Andrew Macgregor)
Date: Tue, 14 Aug 2007 08:52:10 +0800
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
Message-ID: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>

On 13/08/2007, at 6:29 PM, neeti somaiya wrote:

> Hi,
>
> Does anyone know of any Homologene parser, if available?
> Please let me know.
>
> Thanks and Regards,
> Neeti.

Hi Neeti,

Quite a long time ago now I wrote a Homologene parser and posted it  
to the mailing list:

<http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>

I don't know if this still works but you could use it as a starting  
point. There may also be something newer out there too, I don't know.  
If you search the mailing list archives you'll get a few messages  
around the topic.

Cheers, Andrew.


Andrew Macgregor
Centre for Comparative Genomics, Murdoch University
Email: amacgregor at ccg.murdoch.edu.au
Tel: (08) 9360 2961


From cjfields at uiuc.edu  Mon Aug 13 23:21:54 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 22:21:54 -0500
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
Message-ID: <4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>

It looks like Heikki responded and thought a good place for it would  
be Bio::SeqIO, but it didn't go anywhere I suppose.  I see that a few  
other posts suggest it could be placed in Bio::Cluster as well which  
I'm not familiar with.  We could add it in if you were still  
interested, just need to find a good place for it; might be nice to  
have a Parse::RecDescent-based parser.

chris

On Aug 13, 2007, at 7:52 PM, Andrew Macgregor wrote:

> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>
>> Hi,
>>
>> Does anyone know of any Homologene parser, if available?
>> Please let me know.
>>
>> Thanks and Regards,
>> Neeti.
>
> Hi Neeti,
>
> Quite a long time ago now I wrote a Homologene parser and posted it
> to the mailing list:
>
> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
>
> I don't know if this still works but you could use it as a starting
> point. There may also be something newer out there too, I don't know.
> If you search the mailing list archives you'll get a few messages
> around the topic.
>
> Cheers, Andrew.
>
>
> Andrew Macgregor
> Centre for Comparative Genomics, Murdoch University
> Email: amacgregor at ccg.murdoch.edu.au
> Tel: (08) 9360 2961
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Tue Aug 14 03:46:19 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 14 Aug 2007 08:46:19 +0100
Subject: [Bioperl-l] Warnings/errors generated by Eclipse
Message-ID: <46C15DCB.80603@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I've just been setting up Eclipse with the EPIC plugin, and it's
generating some errors and warnings about bioperl-live that I'd like to
pass by you.

I think most of the errors are along the lines of:
"Can't find 'build_params' in _build in
/usr/local/share/perl/5.8.8/Module/Build/Base.pm line 1011"

This occurs with files like:
t/Biblio_biofetch.t
t/seqread_fail.t

I think it's to do with the parameters passed to test_begin() or it
could be my setup of Eclipse?

Other highlighted problems are some of the scripts in the examples dir.
Some require modules that reside in the bioperl-run package. Would it be
wise to move these to the bioperl-run examples dir?

There may also be some problems with XML files in t/data e.g.
t/data/interpro_ebi.xml
There appears to be a typo on line 2. However, I'm not sure this is
up-to-date? I can comment on the others later if required.

Cheers
Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGwV3KczuW2jkwy2gRApM/AJ9abWl02CAJqDK2sEXEUEg8nGRC4ACdHcAb
nZmh+1dmtc1W9mThkUVKitw=
=5eXZ
-----END PGP SIGNATURE-----

From amacgregor at ccg.murdoch.edu.au  Tue Aug 14 01:14:58 2007
From: amacgregor at ccg.murdoch.edu.au (Andrew Macgregor)
Date: Tue, 14 Aug 2007 13:14:58 +0800
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
Message-ID: <C762C291-D3D2-4CBC-B5EC-6B6E4935A004@ccg.murdoch.edu.au>

On 14/08/2007, at 11:21 AM, Chris Fields wrote:

> It looks like Heikki responded and thought a good place for it  
> would be Bio::SeqIO, but it didn't go anywhere I suppose.  I see  
> that a few other posts suggest it could be placed in Bio::Cluster  
> as well which I'm not familiar with.  We could add it in if you  
> were still interested, just need to find a good place for it; might  
> be nice to have a Parse::RecDescent-based parser.
>
> chris
>

Hi Chris,

I was also doing some parsing of UniGene at the time but found  
RecDescent was too slow and went back to regexes. That code found  
its way into Bio::Cluster. Occasionally I see a message with someone  
looking for a Homologene parser but not very often, so I'm not sure  
it is worth the effort of moving the code into bioperl.

Cheers, Andrew.

From neetisomaiya at gmail.com  Tue Aug 14 09:24:07 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Tue, 14 Aug 2007 18:54:07 +0530
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
Message-ID: <764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>

Hi Andrew,

I think the Homologene data files on the FTP site have changed from what you
had used.
They are now homologene.data and homologene.xml.
I tried using your parser, but because it was written for the file
hmlg.trip.ftp, it doesn't work anymore.

I came across a parser at
http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
I am looking at it to see if it works for me. Not sure if it will.
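
If neither parser fits, a few lines of plain Perl may be enough for
homologene.data. The sketch below assumes the file is tab-delimited with six
columns (group id, taxonomy id, gene id, gene symbol, protein gi, protein
accession); that layout should be checked against the README on the FTP site
before relying on it. parse_homologene and the column names are made up for
this example, and the sample rows are invented placeholders, not real records.

```perl
use strict;
use warnings;

# Assumed column layout; verify against the README shipped with the data.
my @COLUMNS = qw(hid taxon_id gene_id symbol protein_gi protein_acc);

# Read tab-delimited rows from a filehandle and group them by the first column.
sub parse_homologene {
    my ($fh) = @_;
    my %groups;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;
        my @fields = split /\t/, $line, -1;
        next unless @fields == @COLUMNS;    # skip lines that do not fit the assumed layout
        my %row;
        @row{@COLUMNS} = @fields;
        push @{ $groups{ $row{hid} } }, \%row;
    }
    return \%groups;
}

# Invented placeholder rows, read through an in-memory filehandle.
my $sample = join '', map { join("\t", @$_) . "\n" } (
    [ 1, 9606,  111, 'GENEA', 1001, 'XP_000001.1' ],
    [ 1, 10090, 222, 'Genea', 1002, 'XP_000002.1' ],
    [ 2, 9606,  333, 'GENEB', 1003, 'XP_000003.1' ],
);
open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my $groups = parse_homologene($fh);
close $fh;

for my $hid (sort { $a <=> $b } keys %$groups) {
    printf "%s: %s\n", $hid,
        join(', ', map { "$_->{symbol} (taxon $_->{taxon_id})" } @{ $groups->{$hid} });
}
```

For the real file, the in-memory handle would simply be replaced by one
opened on homologene.data.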

~Neeti.

On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
>
> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>
> > Hi,
> >
> > Does anyone know of any Homologene parser, if available?
> > Please let me know.
> >
> > Thanks and Regards,
> > Neeti.
>
> Hi Neeti,
>
> Quite a long time ago now I wrote a Homologene parser and posted it
> to the mailing list:
>
> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
>
> I don't know if this still works but you could use it as a starting
> point. There may also be something newer out there too, I don't know.
> If you search the mailing list archives you'll get a few messages
> around the topic.
>
> Cheers, Andrew.
>
>
> Andrew Macgregor
> Centre for Comparative Genomics, Murdoch University
> Email: amacgregor at ccg.murdoch.edu.au
> Tel: (08) 9360 2961
>
>
>
>


-- 
-Neeti
Even my blood says, B positive

From bix at sendu.me.uk  Tue Aug 14 10:57:29 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 14 Aug 2007 15:57:29 +0100
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
	columns?
Message-ID: <46C1C2D9.6050409@sendu.me.uk>

I'm looking at what looks like a pretty major bug in Bio::SimpleAlign, 
but before I commit the fix I wanted to check my sanity/understanding.

My understanding is that an alignment may be built from just sub-parts 
of a number of sequences. So you give each sequence in the alignment a 
start and stop so you can later map back the aligned region to the 
original sequence. So, for example, the following should all pass:

diff -r1.56 SimpleAlign.t
459a460,540
 >
 >
 > # is _remove_col really working correctly?
 > my $a = Bio::LocatableSeq->new(-id => 'a', -seq => 'atcgatcgatcgatcg', -start => 5, -end => 20);
 > my $b = Bio::LocatableSeq->new(-id => 'b', -seq => '-tcgatc-atcgatcg', -start => 30, -end => 43);
 > my $c = Bio::LocatableSeq->new(-id => 'c', -seq => 'atcgatcgatc-atc-', -start => 50, -end => 63);
 > my $d = Bio::LocatableSeq->new(-id => 'd', -seq => '--cgatcgatcgat--', -start => 80, -end => 91);
 > my $e = Bio::LocatableSeq->new(-id => 'e', -seq => '-t-gatcgatcga-c-', -start => 100, -end => 111);
 > $aln = Bio::SimpleAlign->new();
 > $aln->add_seq($a);
 > $aln->add_seq($b);
 > $aln->add_seq($c);
 >
 > my $gapless = $aln->remove_gaps();
 > foreach my $seq ($gapless->each_seq) {
 >       if ($seq->id eq 'a') {
 >               is $seq->start, 6;
 >               is $seq->end, 19;
 >               is $seq->seq, 'tcgatcatcatc';
 >       }
 >       elsif ($seq->id eq 'b') {
 >               is $seq->start, 30;
 >               is $seq->end, 42;
 >               is $seq->seq, 'tcgatcatcatc';
 >       }
 >       elsif ($seq->id eq 'c') {
 >               is $seq->start, 51;
 >               is $seq->end, 63;
 >               is $seq->seq, 'tcgatcatcatc';
 >       }
 > }
 >
 > $aln->add_seq($d);
 > $aln->add_seq($e);
 > $gapless = $aln->remove_gaps();
 > foreach my $seq ($gapless->each_seq) {
 >       if ($seq->id eq 'a') {
 >               is $seq->start, 8;
 >               is $seq->end, 17;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'b') {
 >               is $seq->start, 32;
 >               is $seq->end, 40;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'c') {
 >               is $seq->start, 53;
 >               is $seq->end, 61;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'd') {
 >               is $seq->start, 81;
 >               is $seq->end, 90;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'e') {
 >               is $seq->start, 101;
 >               is $seq->end, 110;
 >               is $seq->seq, 'gatcatca';
 >       }
 > }
 >
 > my $f = Bio::LocatableSeq->new(-id => 'f', -seq => 'a-cgatcgatcgat-g', -start => 30, -end => 43);
 > $aln = Bio::SimpleAlign->new();
 > $aln->add_seq($a);
 > $aln->add_seq($f);
 >
 > $gapless = $aln->remove_gaps();
 > foreach my $seq ($gapless->each_seq) {
 >       if ($seq->id eq 'a') {
 >               is $seq->start, 5;
 >               is $seq->end, 20;
 >               is $seq->seq, 'acgatcgatcgatg';
 >       }
 >       elsif ($seq->id eq 'f') {
 >               is $seq->start, 30;
 >               is $seq->end, 43;
 >               is $seq->seq, 'acgatcgatcgatg';
 >       }
 > }


But they don't. Once you remove certain columns the start and stop of 
the sequences in the alignment are no longer correct coordinates for the 
sub-sequence in the original sequence.

I propose the following patch to resolve this issue:

diff -r1.136 SimpleAlign.pm
1116c1116,1118
<
---
 >
 >     my $gap = $self->gap_char;
 >
1129,1137c1131,1147
<             my $spliced;
<             $spliced .= $start > 0 ? substr($sequence,0,$start) : '';
<             $spliced .= substr($sequence,$end+1,$seq->length-$end+1);
<             $sequence = $spliced;
<             if ($start == 1) {
<               $new_seq->start($end);
<             }
<             else {
<               $new_seq->start( $seq->start);
---
 >             my $orig = $sequence;
 >             my $head =  $start > 0 ? substr($sequence, 0, $start) : '';
 >             my $tail = ($end + 1) >= length($sequence) ? '' : substr($sequence, $end + 1);
 >             $sequence = $head.$tail;
 >             # start
 >             unless (defined $new_seq->start) {
 >                 if ($start == 0) {
 >                     my $start_adjust = () = substr($orig, 0, $end + 1) =~ /$gap/g;
 >                     $new_seq->start($seq->start + $end + 1 - $start_adjust);
 >                 }
 >                 else {
 >                     my $start_adjust = $orig =~ /$gap+/;
 >                     if ($start_adjust) {
 >                         $start_adjust = $+[0] - 1 < $start;
 >                     }
 >                     $new_seq->start($seq->start + $start_adjust);
 >                 }
1140,1141c1150,1152
<             if($end >= $seq->end){
<              $new_seq->end( $start);
---
 >             if (($end + 1) >= length($orig)) {
 >                 my $end_adjust = () = substr($orig, $start) =~ /$gap/g;
 >                 $new_seq->end($seq->end - (length($orig) - $start) + $end_adjust);
1144c1155
<              $new_seq->end($seq->end);
---
 >                 $new_seq->end($seq->end);
1148c1159
<                 push @new, $new_seq;
---
 >               push @new, $new_seq;
1207,1209c1218,1234
<       # sort the positions to remove columns at the end 1st
<       @$positions = sort { $b->[0] <=> $a->[0] } @$positions;
<       $aln = $self->_remove_col($aln,$positions);
---
 >       # sort the positions
 >       @$positions = sort { $a->[0] <=> $b->[0] } @$positions;
 >
 >     my @remove;
 >     my $length = 0;
 >     foreach my $pos (@{$positions}) {
 >         my ($start, $end) = @{$pos};
 >
 >         #have to offset the start and end for subsequent removes
 >         $start-=$length;
 >         $end  -=$length;
 >         $length += ($end-$start+1);
 >         push @remove, [$start,$end];
 >     }
 >
 >     #remove the segments
 >     $aln = $#remove >= 0 ? $self->_remove_col($aln,\@remove) : $self;


This breaks 2 tests in SimpleAlign.t, but as far as I can tell, those 
tests expect the wrong answer. Once they are changed to expect the 
correct answer, SimpleAlign.t and all other tests in the test suite pass.

diff -r1.56 SimpleAlign.t
214,215c214,215
<       "P84139/1-33              NEGEHQIKLDELFEKLLRARLIFKNKDVLRRC\n".
<       "P814153/1-33             NEGMHQIKLDVLFEKLLRARLIFKNKDVLRRC\n".
---
 >       "P84139/2-33              NEGEHQIKLDELFEKLLRARLIFKNKDVLRRC\n".
 >       "P814153/2-33             NEGMHQIKLDVLFEKLLRARLIFKNKDVLRRC\n".
229c229
<       "gb|443893|124775/1-32    -RFRIKVPPAVEGARPALLIFKSRPELGC\n",
---
 >       "gb|443893|124775/2-32    -RFRIKVPPAVEGARPALLIFKSRPELGC\n",


Can someone triple-check my thinking and report back please?

Cheers,
Sendu.

From basu at pharm.sunysb.edu  Tue Aug 14 11:02:06 2007
From: basu at pharm.sunysb.edu (Siddhartha Basu)
Date: Tue, 14 Aug 2007 11:02:06 -0400
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
Message-ID: <46C1C3EE.4030006@pharm.sunysb.edu>

neeti somaiya wrote:
> Hi Andrew,
> 
> I think the homologene data files have changed now on the ftp, from what you
> had used.
> It is now homologene.data and homologene.xml.
> I tried using your parser, but because it was written on the file
> hmlg.trip.ftp, it doesnt work anymore.
> 
> I came across a parser
> http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
> .
> I am looking at it to see if it works for me. NOt sure if it will.
> 
> ~Neeti.

Hi Neeti,
I have recently written a parser for HomoloGene XML data, specific to 
my own needs. I am not sure whether it will suit your purpose, but it 
could be extended into a general-purpose parser, so I am putting it 
forward. Here is how it works:

* It parses only a single HomoloGene entry, <HG-Entry>.....</HG-Entry>.
* It does SAX-based parsing (currently using XML::SAX::ExpatXS).
* It returns a graph object (using the Perl Graph module) in which each 
node is a homologue entry with its corresponding Entrez Gene ID. Each 
node also carries the following attributes:
	* RefSeq protein ID
	* Protein ID (pid)
	* NCBI taxon ID
* The edge attribute records whether the two nodes are orthologs 
(true/false).
* The remaining tags are not currently extracted, but parsing them 
should not be very difficult.

Generally I get the HomoloGene XML stream from an 'efetch' through 
Bio::DB::EUtilities, feed it to the parser, get back a 'Graph' object 
and then work on it.

So, to make it more generic and able to work on a local file, we need to:

* Add another class that reads each chunk between 
<HG-Entry>.....</HG-Entry> and sends it to the parser.
* Add support for most of the tags.
* Massage the data into a BioPerl-compatible object.
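
To give an idea of the first item, here is a rough sketch of a chunk reader in plain Perl. It is only an illustration, not the parser described above: hg_entry_iterator is a made-up name, and it assumes the opening and closing HG-Entry tags never share a line with the tags of a neighbouring entry (as in pretty-printed XML). Each returned chunk could then be handed to the SAX parser.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that yields one
# <HG-Entry>...</HG-Entry> chunk per call, or undef when the input is exhausted.
sub hg_entry_iterator {
    my ($fh) = @_;
    return sub {
        my $chunk;
        while (my $line = <$fh>) {
            if (!defined $chunk) {
                # Skip everything until an opening <HG-Entry> tag is seen.
                # The [\s>] guard avoids matching <HG-EntrySet> or <HG-Entry_...>.
                next unless $line =~ /<HG-Entry[\s>]/;
                $chunk = '';
            }
            $chunk .= $line;
            return $chunk if $line =~ m{</HG-Entry>};
        }
        return;    # no further complete entry
    };
}

# Usage with an in-memory file handle; a real file handle works the same way.
my $xml = <<'XML';
<HG-EntrySet>
  <HG-Entry>
    <HG-Entry_hg-id>1</HG-Entry_hg-id>
  </HG-Entry>
  <HG-Entry>
    <HG-Entry_hg-id>2</HG-Entry_hg-id>
  </HG-Entry>
</HG-EntrySet>
XML
open my $fh, '<', \$xml or die "cannot open in-memory handle: $!";
my $next_entry = hg_entry_iterator($fh);
my $count = 0;
while (defined(my $entry = $next_entry->())) {
    $count++;    # hand $entry to the SAX parser here
}
print "$count entries read\n";
```

Reading one entry at a time keeps memory use flat even for the full homologene.xml file, which is the main reason to chunk before parsing.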

I could work out the first two myself; for the last one I still have to 
figure out which BioPerl object would be suitable (like Bio::Cluster or 
Bio::Network::Node/Edge).

Let me know if it sounds interesting and I will send you the code.

-siddhartha


> 
> On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
>> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>>
>>> Hi,
>>>
>>> Does anyone know of any Homologene parser, if available?
>>> Please let me know.
>>>
>>> Thanks and Regards,
>>> Neeti.
>> Hi Neeti,
>>
>> Quite a long time ago now I wrote an Homologene parser and posted it
>> to the mailing list:
>>
>> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
>>
>> I don't know if this still works but you could use it as a starting
>> point. There may also be something newer out there too, I don't know.
>> If you search the mailing list archives you'll get a few messages
>> around the topic.
>>
>> Cheers, Andrew.
>>
>>
>> Andrew Macgregor
>> Centre for Comparative Genomics, Murdoch University
>> Email: amacgregor at ccg.murdoch.edu.au
>> Tel: (08) 9360 2961
>>
>>
>>
>>
> 
> 


From cjfields at uiuc.edu  Tue Aug 14 12:33:31 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 14 Aug 2007 11:33:31 -0500
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
	columns?
In-Reply-To: <46C1C2D9.6050409@sendu.me.uk>
References: <46C1C2D9.6050409@sendu.me.uk>
Message-ID: <B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>

Could you attach the scripts and patches to a bug report for tracking  
so anyone interested can double-check?  Having them in an email is  
problematic as the text in some clients wraps.

 From what I'm seeing I think we're in general agreement, though I'll  
reason through it to see if I'm following correctly.  The data in the  
SimpleAlign example you give is this:

a/5-20            atcgatcgatcgatcg
b/30-43           -tcgatc-atcgatcg
c/50-63           atcgatcgatc-atc-
                    ****** *** ***

Removing the gaps gives:

a/5-20            tcgatcatcatc
b/30-43           tcgatcatcatc
c/50-63           tcgatcatcatc
                   ************

The start/end is wrong, as you state.  Adjusting to map simple
start/ends to the original sequence won't work as we're removing gaps
and residues in the LocatableSeqs along with it (ends and internal
residues).  I guess if we want to map back to the original sequence
accurately we would have to use split locations (not currently
implemented with LocatableSeq) or maybe a cigar-like syntax against
consensus (ugh), otherwise we wouldn't know where to map the relevant
internal gaps (now missing from the alignment) w/o running a local
alignment against the original sequence:

a/6-11;13-15;17-19   tcgatcatcatc
b/30-38;40-42        tcgatcatcatc
c/51-56;58-63        tcgatcatcatc
                     ************
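
For what it's worth, split locations like these can be worked out mechanically. The following is only a sketch in plain Perl, not anything LocatableSeq offers: split_ranges is a hypothetical helper that takes one aligned row, its start in the original sequence, the gap character, and the 1-based alignment columns that were removed. A removed residue breaks the current range; a removed gap does not, since it never consumed a coordinate.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::LocatableSeq).
sub split_ranges {
    my ($aln, $start, $gap, @removed_cols) = @_;
    my %removed = map { $_ => 1 } @removed_cols;

    my $pos  = $start - 1;   # original coordinate of the last residue seen
    my $open = 0;            # true while the last range can still be extended
    my @ranges;

    for my $col (1 .. length $aln) {
        next if substr($aln, $col - 1, 1) eq $gap;   # gaps consume no residue
        $pos++;
        if ($removed{$col}) {
            $open = 0;       # a dropped residue ends the current range
            next;
        }
        if ($open) { $ranges[-1][1] = $pos }
        else       { push @ranges, [$pos, $pos]; $open = 1 }
    }
    return join ';', map { "$_->[0]-$_->[1]" } @ranges;
}

my @removed = (1, 8, 12, 16);    # columns dropped in the three-sequence example
print 'a/', split_ranges('atcgatcgatcgatcg',  5, '-', @removed), "\n";
print 'b/', split_ranges('-tcgatc-atcgatcg', 30, '-', @removed), "\n";
print 'c/', split_ranges('atcgatcgatc-atc-', 50, '-', @removed), "\n";
```

For sequence a this gives three pieces (6-11, 13-15, 17-19), because residues 12 and 16 both sit in removed columns; b and c each come out as two pieces.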

That could get really hairy for long alignments.  We could also  
return multiple SimpleAligns which map correctly (ugh), but what we  
really want (and the API specifies) is a new single SimpleAlign.

It may come down to simply stating that it 'voids the warranty' (so to
speak) when modifications are made to alignments which remove/insert
residues from LocatableSeqs via remove_gaps/remove_columns or
similar, and either leave as is with relevant warnings or readjust
start/end appropriately when LocatableSeq residues change.

gapless_a/1-12    tcgatcatcatc
gapless_b/1-12    tcgatcatcatc
gapless_c/1-12    tcgatcatcatc
                   ************

Not sure which is the best approach but anything would be better than  
giving an unexpectedly incorrect answer.

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Aug 14 13:13:30 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 14 Aug 2007 18:13:30 +0100
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
 columns?
In-Reply-To: <B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>
References: <46C1C2D9.6050409@sendu.me.uk>
	<B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>
Message-ID: <46C1E2BA.8060606@sendu.me.uk>

Chris Fields wrote:
> Could you attach the scripts and patches to a bug report for tracking
> so anyone interested can double-check?  Having them in an email is 
> problematic as the text in some clients wraps.

http://bugzilla.open-bio.org/show_bug.cgi?id=2344


> From what I'm seeing I think we're in general agreement, though I'll
>  reason through it to see if I'm following correctly.  The data in
> the SimpleAlign example you give is this:
> 
> a/5-20            atcgatcgatcgatcg
> b/30-43           -tcgatc-atcgatcg
> c/50-63           atcgatcgatc-atc-
>                    ****** *** ***
> 
> Removing the gaps gives:
> 
> a/5-20            tcgatcatcatc
> b/30-43           tcgatcatcatc
> c/50-63           tcgatcatcatc
>                   ************
> 
> The start/end is wrong, as you state.

Yes. For extra clarity, my thinking is that the correct answer is:

a/6-19            tcgatcatcatc
b/30-42           tcgatcatcatc
c/51-63           tcgatcatcatc
                   ************
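
To show how those numbers come about, here is a small stand-alone sketch in plain Perl. It is not the patch itself: gapless_coords is a hypothetical helper working on raw strings instead of Bio::LocatableSeq objects, and it assumes 1-based inclusive start/end values. A column is kept only if no row has a gap there; the new start and end are the original coordinates of the residues in the first and last kept columns.

```perl
use strict;
use warnings;

# Hypothetical helper (not BioPerl API). Each row is [id, aligned string, start, end],
# with start/end 1-based and inclusive in the original, ungapped sequence.
sub gapless_coords {
    my ($gap, @rows) = @_;

    # 0-based alignment columns in which no row has a gap.
    my @keep = grep {
        my $col = $_;
        !grep { substr($_->[1], $col, 1) eq $gap } @rows;
    } 0 .. length($rows[0][1]) - 1;
    return unless @keep;

    my @out;
    for my $row (@rows) {
        my ($id, $aln, $start) = @$row;

        # Number of residues from the start of the row up to column $_[0].
        my $residues_upto = sub {
            my $n = () = substr($aln, 0, $_[0] + 1) =~ /[^\Q$gap\E]/g;
            return $n;
        };

        # Result rows are [id, gapless string, new start, new end].
        push @out, [
            $id,
            join('', map { substr($aln, $_, 1) } @keep),
            $start + $residues_upto->($keep[0])  - 1,
            $start + $residues_upto->($keep[-1]) - 1,
        ];
    }
    return @out;
}

my @rows = (
    ['a', 'atcgatcgatcgatcg',  5, 20],
    ['b', '-tcgatc-atcgatcg', 30, 43],
    ['c', 'atcgatcgatc-atc-', 50, 63],
);
printf "%s/%d-%d\t%s\n", @{$_}[0, 2, 3, 1] for gapless_coords('-', @rows);
```

Run on the three sequences above it reports a/6-19, b/30-42 and c/51-63, each with the sequence tcgatcatcatc.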


> Adjusting to map simple start/ends to the original sequence won't
> work as we're removing gaps and residues in the LocatableSeqs along
> with it (ends and internal residues).  I guess if we want to map back
> to the original sequence accurately [snip]

What you say in the rest of your discussion is valid and deserves some 
thought/discussion, but for now just getting the start and end correct, 
ignoring any issues with internal residues, seems like a no-brainer.

For my own purposes that is all I need; having removed gaps I only need 
the start and end so I can take that region from each sequence and do a 
new alignment (for example).


BTW. Either my patch isn't quite perfect or there's another related bug 
I'm still tracking down. I'll commit when I've solved that, unless 
someone points out any mistakes in my thinking.

From cjfields at uiuc.edu  Tue Aug 14 13:19:59 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 14 Aug 2007 12:19:59 -0500
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
	columns?
In-Reply-To: <46C1E2BA.8060606@sendu.me.uk>
References: <46C1C2D9.6050409@sendu.me.uk>
	<B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>
	<46C1E2BA.8060606@sendu.me.uk>
Message-ID: <EE515FDC-2223-4D03-B819-3EA909539A61@uiuc.edu>


On Aug 14, 2007, at 12:13 PM, Sendu Bala wrote:
...

>
> Yes. For extra clarity, my thinking is that the correct answer is:
>
> a/6-19            tcgatcatcatc
> b/30-42           tcgatcatcatc
> c/51-63           tcgatcatcatc
>  ...
> What you say in the rest of your discussion is valid and deserves  
> some thought/discussion, but for now just getting the start and end  
> correct, ignoring any issues with internal residues, seems like a  
> no-brainer.
>
> For my own purposes that is all I need; having removed gaps I only  
> need the start and end so I can take that region from each sequence  
> and do a new alignment (for example).

It might be worth addressing the split location issue in the bug  
report before it gets lost in the ether.  Or maybe start a new one as  
an enhancement request.

> BTW. Either my patch isn't quite perfect or there's another related  
> bug I'm still tracking down. I'll commit when I've solved that,  
> unless someone points out any mistakes in my thinking.

Sounds fine by me.

chris


From gyang at plantbio.uga.edu  Tue Aug 14 15:01:07 2007
From: gyang at plantbio.uga.edu (Guojun Yang)
Date: Tue, 14 Aug 2007 15:01:07 -0400
Subject: [Bioperl-l] the most weird thing  I've seen, help please
In-Reply-To: 41A08079-6EEC-4B62-8104-C41E70C03083@uiuc.edu
Message-ID: <20070814190107.4834b14b@dogwood.plantbio.uga.edu>

Hi all,

I have two subroutines in my code. One used remote BLAST and the other
uses local BLAST, and that worked well. Since I changed the remote BLAST
subroutine to local BLAST, I always get the error described below. I
downloaded the preformatted nt database from NCBI, and the searches for
both subroutines work fine when I run blastall -p blastn ... on the
command line. I tried changing the database name to 'nt' and to 'nt.00',
but the same error message was returned. The error says "program name
was not given an argument", although I clearly pass it. Can anybody help
me? The code for the two subroutines is very similar:
   
sub search {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  
my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"query");
my $len=$query->length();
@db=('nt.nal');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new( -program =>"blastn",
                                                         -database =>"$db",
                                                         -e =>"$_[1]");
    my $rc = $factory->blastall($query);  
......  
   
   
sub ancestor {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  
my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"test");
my $len=$query->length();
my $long_name=$_[1];
my $long_start=$_[2];
my $long_end=$_[3];
@db=('TNDB');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
                                                        -database => "$db",
                                                        -e => 1e-3,
                                                        );
    my $blast_report = $factory->blastall($query);

  
Thanks a lot!  
Guojun Yang  
Department of Plant Biology  
University of Georgia

From zhaodj at ioz.ac.cn  Wed Aug 15 04:05:36 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Wed, 15 Aug 2007 16:05:36 +0800 (CST)
Subject: [Bioperl-l] the most weird thing  I've seen, help please
In-Reply-To: <20070814190107.4834b14b@dogwood.plantbio.uga.edu>
References: <20070814190107.4834b14b@dogwood.plantbio.uga.edu>
Message-ID: <52820.159.226.67.49.1187165136.squirrel@mail.ioz.ac.cn>

Hi Guojun Yang,

I tested your code after modifying part of it, but I did not
encounter the error. The modified code follows (see below and the
attachment). It runs without any error on my Windows XP machine and
generates a file named lclblastResult.txt.

In the code I use the NCBI ecoli.nt database instead, and I changed
some parameters in ways that do not affect what the code does.

I think the error may come from another part of your code, so more
details are needed.

-------code starts-------
#sub search {
use Bio::Tools::Run::StandAloneBlast;
use Bio::SearchIO::blast;

#my $query = Bio::Seq -> new ( -seq=>"$_[0]",
#                              -id=>"query");
my $query=Bio::Seq->new(-seq=>"ctgtattctgggatgca");
my $len=$query->length();

#@db=('nt.nal');
#foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new( -program  =>"blastn",
                                                         -database =>'D:/blast/bin/ecoli.nt',
                                                         -e        =>1,
                                                         -o        =>'lclblastResult.txt');
my $rc = $factory->blastall($query);
-----code ends--------


On Wed, Aug 15, 2007 03:01, Guojun Yang wrote:
> Hi, all,
> I have two subroutines in my code. One is remoteblast and the other
> local blast. It works well.
> When I decided to change the remoteblast to local blast, I always
> get the following error. I downloaded nt database from NCBI as
> preformatted, but it works ok for both subroutines when I use
> command line blastall -p blastn.... I changed the db name to 'nt',
> 'nt.00', the same error message was returned. The error says:
> "program name was not given an argument", but I apparently gave it
> there.  Can anybody help me? The code for the two subrountines are
> very similar:
>
> sub search {
>     use Bio::Tools::Run::StandAloneBlast;
>     use Bio::SearchIO::blast;
> my $query = Bio::Seq -> new ( -seq=>"$_[0]",
>                               -id=>"query");
> my $len=$query->length();
> @db=('nt.nal');
> foreach my $db (@db) {
>     my $factory = Bio::Tools::Run::StandAloneBlast->new( -program =>"blastn",
>                                                          -database =>"$db",
>                                                          -e =>"$_[1]");
>     my $rc = $factory->blastall($query);
> ......
>
>
> sub ancestor {
>     use Bio::Tools::Run::StandAloneBlast;
>     use Bio::SearchIO::blast;
> my $query = Bio::Seq -> new ( -seq=>"$_[0]",
>                               -id=>"test");
> my $len=$query->length();
> my $long_name=$_[1];
> my $long_start=$_[2];
> my $long_end=$_[3];
> @db=('TNDB');
> foreach my $db (@db) {
>     my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
>                                                         -database => "$db",
>                                                         -e => 1e-3,
>                                                         );
>     my $blast_report = $factory->blastall($query);
>
>
> Thanks a lot!
> Guojun Yang
> Department of Plant Biology
> University of Georgia
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
De-Jian Zhao
Institute of Zoology,Chinese Academy of Sciences
+86-10-64807217
zhaodj at ioz.ac.cn


-------------- next part --------------
A non-text attachment was scrubbed...
Name: lclblast.pl
Type: application/octet-stream
Size: 644 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070815/f40b2950/attachment.obj 

From tania.oh at brasenose.oxford.ac.uk  Wed Aug 15 12:05:15 2007
From: tania.oh at brasenose.oxford.ac.uk (Tania Oh)
Date: Wed, 15 Aug 2007 17:05:15 +0100
Subject: [Bioperl-l] exonerate parser in bioperl-live fails when protein2dna
	comparison is performed
Message-ID: <AA5E6FAF-A635-4F6C-99CF-82F6589C677B@bnc.ox.ac.uk>

Dear All,

I was trying to use the Bio::SearchIO::Alignment::Exonerate module to
run and parse my exonerate output. I've noticed that the parser, which
is actually Bio::SearchIO::exonerate, works if the model used in
exonerate is --model est2genome. When I ran exonerate with
--model protein2dna, however, the parser was unable to parse the HSPs.


Below is the simple piece of code I used for testing the output from exonerate:

use strict;
use Bio::SearchIO;

my $searchio = Bio::SearchIO->new(-file   => 'test_data/exonerate.output.dontwork',
                                  -format => 'exonerate');

   while( my $r = $searchio->next_result ) {
           while( my $hit = $r->next_hit ) {
                   while( my $hsp = $hit->next_hsp ) {
                           print $hsp->start . "\t" . $hsp->end . "\n";
                   }
           }

     print $r->query_name, "\n";
   }

-------------- next part --------------
A non-text attachment was scrubbed...
Name: exonerate.output.works
Type: application/octet-stream
Size: 6056 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070815/e4e43d75/attachment-0002.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: exonerate.output.dontwork
Type: application/octet-stream
Size: 3283 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070815/e4e43d75/attachment-0003.obj 


There are 2 files attached to show the examples of using either the
est2genome or protein2dna model:
1. exonerate.output.works - produced from the command line:
exonerate -q exonerate_cdna.fa -t exonerate_genomic.fa --model est2genome --bestn 1 > exonerate.output.works

2. exonerate.output.dontwork - produced from the command line:
exonerate -q test_aa.fa -t test_cds.fa --model protein2dna > exonerate.output.dontwork


Line 239 in Bio::SearchIO::exonerate (cut and pasted below)

elsif(  s/^vulgar:\s+(\S+)\s+         # query sequence id
                  (\d+)\s+(\d+)\s+([\-\+])\s+   # query start-end-strand
                  (\S+)\s+                      # target sequence id
                  (\d+)\s+(\d+)\s+([\-\+])\s+   # target start-end-strand
                  (\d+)\s+                      # score
                  //ox ) {

parses the vulgar line of --model est2genome exonerate output
well. An example of the (complex) vulgar line, which I've truncated
for readability, is:
vulgar: MUSSPSYN 3 1279 + 4.143962167-143965267 28 3074 + 6137 M 8 8 G 0 1 M 231 231 5 0 2 I 0 253 3 0

whereas the vulgar line I've obtained from --model protein2dna
exonerate output is much simpler, and the parser fails to pick it up:
vulgar: SJCHGC00851 0 204 . SJCHGC00851 2 614 + 1059 M 204 612

Has anyone encountered this situation before? I've not changed the
parser, as exonerate is widely used for its est2genome model, and
thought I'd run it past the list to see if there is a workaround.
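
One visible difference between the two vulgar lines above is the query
strand field: it is '+' in the est2genome line but '.' in the
protein2dna line, and the character class [\-\+] in the quoted pattern
cannot match '.'. The following is a minimal sketch of that idea, not a
patch to Bio::SearchIO::exonerate. The name parse_vulgar_header and its
hash keys are hypothetical, and it is only an assumption that accepting
'.' in the strand fields is sufficient; the rest of the parser may well
expect a +/- strand.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: parse the fixed leading fields of
# an exonerate "vulgar:" line. Assumption: the only change needed relative to
# the quoted pattern is letting the strand classes accept '.' as well.
sub parse_vulgar_header {
    my ($line) = @_;
    return unless $line =~ m/^vulgar:\s+(\S+)\s+   # query sequence id
        (\d+)\s+(\d+)\s+([\-\+\.])\s+              # query start-end-strand
        (\S+)\s+                                   # target sequence id
        (\d+)\s+(\d+)\s+([\-\+\.])\s+              # target start-end-strand
        (\d+)\s+                                   # score
        /x;
    return {
        q_id  => $1, q_start => $2, q_end => $3, q_strand => $4,
        t_id  => $5, t_start => $6, t_end => $7, t_strand => $8,
        score => $9,
    };
}

# The two vulgar lines quoted in the message.
my @lines = (
    'vulgar: MUSSPSYN 3 1279 + 4.143962167-143965267 28 3074 + 6137 M 8 8 G 0 1 M 231 231 5 0 2 I 0 253 3 0',
    'vulgar: SJCHGC00851 0 204 . SJCHGC00851 2 614 + 1059 M 204 612',
);

for my $line (@lines) {
    my $h = parse_vulgar_header($line);
    unless ($h) {
        print "no match: $line\n";
        next;
    }
    # Print the parsed fields tab-separated, in the order they appear.
    print join( "\t",
        @{$h}{qw(q_id q_start q_end q_strand t_id t_start t_end t_strand score)} ),
        "\n";
}
```

With the relaxed strand class both lines match; with the original
[\-\+] class the second line would fall through to "no match".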

many thanks in advance,
tania


From johnsonmar at mail.nih.gov  Wed Aug 15 12:47:10 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 12:47:10 -0400
Subject: [Bioperl-l] Need assistance with make error
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>

I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
Enterprise Linux 4, and the other running RHEL3.  I'm getting the
following 'make Error 255' when running make test.  I'm not sure what
this error indicates, and whether I should continue with a force
install?  Could you please advise.

 
Failed Test        Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/BioFetch_DB.t                  27    1   3.70%  8
t/EMBL_DB.t                      15    3  20.00%  6 13-14
t/Ontology.t          9  2304    50  100 200.00%  1-50
t/TreeIO.t                       41    1   2.44%  42
t/Variation_IO.t                 25    3  12.00%  15 20 25
t/simpleGOparser.t    9  2304    98  196 200.00%  1-98
120 subtests skipped.
Failed 6/179 test scripts, 96.65% okay. 154/8268 subtests failed, 98.14% okay.
make: *** [test_dynamic] Error 255

 
Thanks,

 
Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/ <http://www.terpsys.com/> 

 
From arareko at campus.iztacala.unam.mx  Wed Aug 15 13:45:39 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 15 Aug 2007 12:45:39 -0500
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
Message-ID: <46C33BC3.9000409@campus.iztacala.unam.mx>

Which version of BioPerl are you trying to install?
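
If an earlier copy is already on the system, a quick way to see what
Perl can find is to ask the modules themselves. This is only a sketch:
installed_version is a hypothetical helper, and it assumes the release
ships Bio::Root::Version or Bio::Perl with a $VERSION set.

```perl
use strict;
use warnings;

# Hypothetical helper: return the version of an installed module, or undef
# if the module cannot be loaded (or does not define $VERSION).
sub installed_version {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;    # Bio::Perl -> Bio/Perl.pm
    return eval { require $file; $module->VERSION };
}

# Assumption: one of these two modules is present in an installed BioPerl.
for my $module (qw(Bio::Root::Version Bio::Perl)) {
    my $version = installed_version($module);
    printf "%-20s %s\n", $module,
        defined $version ? $version : 'not found (or no $VERSION)';
}
```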

Johnson, Mary (NIH/NCI) [C] wrote:
> I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
> Enterprise Linux 4, and the other running RHEL3.  I'm getting the
> following 'make Error 255' when running make test.  I'm not sure what
> this error indicates, and whether I should continue with a force
> install?  Could you please advise.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From mbasu at mail.nih.gov  Wed Aug 15 13:55:50 2007
From: mbasu at mail.nih.gov (Malay)
Date: Wed, 15 Aug 2007 13:55:50 -0400
Subject: [Bioperl-l] Developer docs
Message-ID: <46C33E26.2050004@mail.nih.gov>

Hello All:

I apologize for not searching thoroughly. But I'd appreciate it if someone
could point me to a location where I can find any BioPerl coding conventions
that I need to follow for any code contribution to BioPerl.

-Malay

-- 
Malay K Basu
www.malaybasu.net

From arareko at campus.iztacala.unam.mx  Wed Aug 15 14:39:29 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 15 Aug 2007 13:39:29 -0500
Subject: [Bioperl-l] Developer docs
In-Reply-To: <46C33E26.2050004@mail.nih.gov>
References: <46C33E26.2050004@mail.nih.gov>
Message-ID: <46C34861.8090400@campus.iztacala.unam.mx>

You may want to bookmark this one:

http://bioperl.org/wiki/Developer_Information#BioPerl_Code

Mauricio.

Malay wrote:
> Hello All:
> 
> I apologize for not searching thoroughly. But I'd appreciate it if someone
> could point me to a location where I can find any BioPerl coding conventions
> that I need to follow for any code contribution to BioPerl.
> 
> -Malay
> 

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From johnsonmar at mail.nih.gov  Wed Aug 15 15:01:23 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 15:01:23 -0400
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <46C33BC3.9000409@campus.iztacala.unam.mx>
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805713@NIHCESMLBX11.nih.gov>

This is version 1.4.

Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 
-----Original Message-----
From: Mauricio Herrera Cuadra [mailto:arareko at campus.iztacala.unam.mx] 
Sent: Wednesday, August 15, 2007 1:46 PM
To: Johnson, Mary (NIH/NCI) [C]
Cc: bioperl-l at bioperl.org
Subject: Re: [Bioperl-l] Need assistance with make error

Which version of bioperl you're trying to install?



From cjfields at uiuc.edu  Wed Aug 15 16:25:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 15 Aug 2007 15:25:30 -0500
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805713@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805713@NIHCESMLBX11.nih.gov>
Message-ID: <DA0EFC65-4A35-48FA-9280-447654BAFF7F@uiuc.edu>

You'll definitely want to update to the latest (v 1.5.2).  We hope to  
get a new stable release out sometime soon and possibly move to a  
more regular release cycle.

chris

On Aug 15, 2007, at 2:01 PM, Johnson, Mary (NIH/NCI) [C] wrote:

> This is version 1.4.
>
> Mary Johnson
>
> Sr. Network Engineer
>
> National Cancer Institute Center for Bioinformatics
> Contractor, TerpSys
> http://www.terpsys.com/

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnsonmar at mail.nih.gov  Wed Aug 15 16:32:43 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 16:32:43 -0400
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <DA0EFC65-4A35-48FA-9280-447654BAFF7F@uiuc.edu>
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>

I saw the 1.5.2 version, but it stated that this was a developer release and that 1.4 was the latest stable version, so I went with 1.4.  I'll give 1.5.2 a try.

Thanks,


Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 
-----Original Message-----
From: Chris Fields [mailto:cjfields at uiuc.edu] 
Sent: Wednesday, August 15, 2007 4:26 PM
To: Johnson, Mary (NIH/NCI) [C]
Cc: Mauricio Herrera Cuadra; bioperl-l at bioperl.org
Subject: Re: [Bioperl-l] Need assistance with make error

You'll definitely want to update to the latest (v 1.5.2).  We hope to  
get a new stable release out sometime soon and possibly move to a  
more regular release cycle.

chris



From cjfields at uiuc.edu  Wed Aug 15 16:40:32 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 15 Aug 2007 15:40:32 -0500
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
Message-ID: <E16950D3-9F60-4862-9325-57CA26107649@uiuc.edu>

The term 'stable' is relative in this case; tons of bug fixes were
incorporated in the 1.5.2 release.  There are a few dev-specific
issues we'll need to resolve prior to a new release; once those are  
out of the way we'll try to get a new 'stable' out.

chris

On Aug 15, 2007, at 3:32 PM, Johnson, Mary (NIH/NCI) [C] wrote:

> I saw the 1.5.2 version, but it stated that this was a developer  
> release and that 1.4 was the latest stable version, so I went with  
> 1.4.  I'll give 1.5.2 a try.
>
> Thanks,
>
>
> Mary Johnson
>
> Sr. Network Engineer
>
> National Cancer Institute Center for Bioinformatics
> Contractor, TerpSys
> http://www.terpsys.com/

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Kevin.M.Brown at asu.edu  Wed Aug 15 16:54:04 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 15 Aug 2007 13:54:04 -0700
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
References: <DA0EFC65-4A35-48FA-9280-447654BAFF7F@uiuc.edu>
	<EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
Message-ID: <1A4207F8295607498283FE9E93B775B40386D612@EX02.asurite.ad.asu.edu>

It technically is a developer release, but given the age of the 1.4 release it is the better choice: it includes fixes for things like doing webblasts, plus other improvements. I've also found the results from the various objects I've had to use in my current projects to be reliable.

> I saw the 1.5.2 version, but it stated that this was a 
> developer release and that 1.4 was the latest stable version, 
> so I went with 1.4.  I'll give 1.5.2 a try.
> 
> Thanks,
> 
> 
> Mary Johnson
> 
> Sr. Network Engineer
> 
> National Cancer Institute Center for Bioinformatics 
> Contractor, TerpSys http://www.terpsys.com/


From bix at sendu.me.uk  Wed Aug 15 16:50:02 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 15 Aug 2007 21:50:02 +0100
Subject: [Bioperl-l] Developer docs
In-Reply-To: <46C34861.8090400@campus.iztacala.unam.mx>
References: <46C33E26.2050004@mail.nih.gov>
	<46C34861.8090400@campus.iztacala.unam.mx>
Message-ID: <46C366FA.40609@sendu.me.uk>

Mauricio Herrera Cuadra wrote:
> You may want to bookmark this one:
> 
> http://bioperl.org/wiki/Developer_Information#BioPerl_Code

Yup. The important one is
http://bioperl.org/wiki/Bioperl_Best_Practices
which I've just updated with the latest info on writing test scripts.

From bix at sendu.me.uk  Wed Aug 15 16:54:45 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 15 Aug 2007 21:54:45 +0100
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
Message-ID: <46C36815.5010908@sendu.me.uk>

Johnson, Mary (NIH/NCI) [C] wrote:
> I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
> Enterprise Linux 4, and the other running RHEL3.  I'm getting the
> following 'make Error 255' when running make test.  I'm not sure what
> this error indicates, and whether I should continue with a force
> install?  Could you please advise.

Unless you know you really must install Bioperl 1.4, install 1.5.2 instead.

http://www.bioperl.org/wiki/Release_1.5.2

If you use the Build.PL installation, at the very least you certainly 
won't get a make error.

http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#PRELIMINARY_PREPARATION
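
On reading the failure summary itself: the Wstat column is the raw wait
status of the test script, and Stat is its exit code, so 2304 and 9 are
the same fact written two ways (2304 = 9 * 256). A small sketch of the
decoding follows; decode_wstat is a hypothetical helper, not part of
Test::Harness.

```perl
use strict;
use warnings;

# Hypothetical helper: split a raw wait status (the "Wstat" column) into
# the pieces Perl documents for $?.
sub decode_wstat {
    my ($wstat) = @_;
    return {
        exit_code => $wstat >> 8,                # high byte: exit status ("Stat")
        signal    => $wstat & 127,               # low 7 bits: terminating signal
        core_dump => ( $wstat & 128 ) ? 1 : 0,   # bit 7: core dump flag
    };
}

# Value taken from the t/Ontology.t and t/simpleGOparser.t rows.
my $status = decode_wstat(2304);
printf "exit=%d signal=%d core=%d\n",
    @{$status}{qw(exit_code signal core_dump)};
```

A non-zero exit code with no signal suggests the script died on its own
rather than being killed.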


From cjfields at uiuc.edu  Wed Aug 15 17:16:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 15 Aug 2007 16:16:27 -0500
Subject: [Bioperl-l] exonerate parser in bioperl-live fails when
	protein2dna comparison is performed
In-Reply-To: <AA5E6FAF-A635-4F6C-99CF-82F6589C677B@bnc.ox.ac.uk>
References: <AA5E6FAF-A635-4F6C-99CF-82F6589C677B@bnc.ox.ac.uk>
Message-ID: <F853DDF2-3165-4F88-A087-744D60682104@uiuc.edu>

I can confirm this with bioperl-live.  The Bio::SearchIO::exonerate docs
indicate that protein2genome and est2genome model output is supported, but
don't specifically indicate that it can parse any other output.
You can add an enhancement request to Bugzilla noting this
deficiency or, if you are so inclined, add the functionality yourself
and donate the code.

chris

On Aug 15, 2007, at 11:05 AM, Tania Oh wrote:

> Dear All,
>
> I was trying to use the Bio::SearchIO::Alignment::Exonerate module  
> to run and parse my exonerate output. But I've noticed that the  
> parser which is actually Bio::SearchIO::Exonerate works if the  
> model used in Exonerate is --model est2genome. I used exonerate  
> with the model --model protein2dna and the parser was unable to  
> parse the hsps.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnsonmar at mail.nih.gov  Wed Aug 15 17:45:36 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 17:45:36 -0400
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <E16950D3-9F60-4862-9325-57CA26107649@uiuc.edu>
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805716@NIHCESMLBX11.nih.gov>

Version 1.5.2 worked fine!  Thanks to all of you for your quick response.  I wish all of our vendors were that quick in getting back to me:)


Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 
-----Original Message-----
From: Chris Fields [mailto:cjfields at uiuc.edu] 
Sent: Wednesday, August 15, 2007 4:41 PM
To: Johnson, Mary (NIH/NCI) [C]
Cc: Mauricio Herrera Cuadra; bioperl-l at bioperl.org
Subject: Re: [Bioperl-l] Need assistance with make error

The term 'stable' is relative in this case; tons of bug fixes were
incorporated in the 1.5.2 release.  There are a few dev-specific
issues we'll need to resolve prior to a new release; once those are  
out of the way we'll try to get a new 'stable' out.

chris

On Aug 15, 2007, at 3:32 PM, Johnson, Mary (NIH/NCI) [C] wrote:

> I saw the 1.5.2 version, but it stated that this was a developer  
> release and that 1.4 was the latest stable version, so I went with  
> 1.4.  I'll give 1.5.2 a try.
>
> Thanks,
>
>
> Mary Johnson
>
> Sr. Network Engineer
>
> National Cancer Institute Center for Bioinformatics
> Contractor, TerpSys
> http://www.terpsys.com/
>
>
>
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu]
> Sent: Wednesday, August 15, 2007 4:26 PM
> To: Johnson, Mary (NIH/NCI) [C]
> Cc: Mauricio Herrera Cuadra; bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] Need assistance with make error
>
> You'll definitely want to update to the latest (v 1.5.2).  We hope to
> get a new stable release out sometime soon and possibly move to a
> more regular release cycle.
>
> chris
>
> On Aug 15, 2007, at 2:01 PM, Johnson, Mary (NIH/NCI) [C] wrote:
>
>> This is version 1.4.
>>
>> Mary Johnson
>>
>> Sr. Network Engineer
>>
>> National Cancer Institute Center for Bioinformatics
>> Contractor, TerpSys
>> http://www.terpsys.com/
>>
>>
>>
>> -----Original Message-----
>> From: Mauricio Herrera Cuadra  
>> [mailto:arareko at campus.iztacala.unam.mx]
>> Sent: Wednesday, August 15, 2007 1:46 PM
>> To: Johnson, Mary (NIH/NCI) [C]
>> Cc: bioperl-l at bioperl.org
>> Subject: Re: [Bioperl-l] Need assistance with make error
>>
>> Which version of bioperl you're trying to install?
>>
>> Johnson, Mary (NIH/NCI) [C] wrote:
>>> I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
>>> Enterprise Linux 4, and the other running RHEL3.  I'm getting the
>>> following 'make Error 255' when running make test.  I'm not sure  
>>> what
>>> this error indicates, and whether I should continue with a force
>>> install?  Could you please advise.
>>>
>>>
>>>
>>>
>>>
>>> Failed Test        Stat Wstat Total Fail  Failed  List of Failed
>>> ------------------------------------------------------------------
>>> t/BioFetch_DB.t                  27    1   3.70%  8
>>> t/EMBL_DB.t                      15    3  20.00%  6 13-14
>>> t/Ontology.t          9  2304    50  100 200.00%  1-50
>>> t/TreeIO.t                       41    1   2.44%  42
>>> t/Variation_IO.t                 25    3  12.00%  15 20 25
>>> t/simpleGOparser.t    9  2304    98  196 200.00%  1-98
>>> 120 subtests skipped.
>>> Failed 6/179 test scripts, 96.65% okay. 154/8268 subtests failed,
>>> 98.14% okay.
>>> make: *** [test_dynamic] Error 255
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Mary Johnson
>>>
>>> Sr. Network Engineer
>>>
>>> National Cancer Institute Center for Bioinformatics
>>> Contractor, TerpSys
>>> http://www.terpsys.com/ <http://www.terpsys.com/>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>> -- 
>> MAURICIO HERRERA CUADRA
>> arareko at campus.iztacala.unam.mx
>> Laboratorio de Genética
>> Unidad de Morfofisiología y Función
>> Facultad de Estudios Superiores Iztacala, UNAM
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From neetisomaiya at gmail.com  Thu Aug 16 00:22:18 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 16 Aug 2007 09:52:18 +0530
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <46C1D557.7090101@pharm.stonybrook.edu>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
	<46C1D557.7090101@pharm.stonybrook.edu>
Message-ID: <764978cf0708152122oba56e13qef83544cdde7e795@mail.gmail.com>

Hi Siddhartha,

Thanks a lot for your mail.
It would be great if you could send me your parser, I will see how I can
modify it for my purpose.

Thanks and Regards,
Neeti.

On 8/14/07, Siddhartha Basu <basu at pharm.stonybrook.edu> wrote:
>
> neeti somaiya wrote:
> > Hi Andrew,
> >
> > I think the homologene data files have changed now on the ftp, from what
> you
> > had used.
> > It is now homologene.data and homologene.xml.
> > I tried using your parser, but because it was written on the file
> > hmlg.trip.ftp, it doesnt work anymore.
> >
> > I came across a parser
> >
> http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
> > .
> > I am looking at it to see if it works for me. NOt sure if it will.
> >
> > ~Neeti.
>
> Hi Neeti,
> I have recently written a parser for 'homologene' xml data specific for
> my purpose. I am not sure whether it will suit your purpose but it could
> be extended for general purpose parsing, so i am putting it forward.
> Here is how it works .......
>
> * It only parses a single homologene entry <HG-Entry>.....</HG-Entry>.
> * It does SAX based parsing (currently uses XML::SAX::ExpatXS)
> * Returns a graph(uses Graph module of perl) object where each node is a
> homologue entry with its corresponding entrez gene id. Each node also
> contain the following attributes ...
>         * Refseq protein id.
>         * Protein id (pid)
>         * ncbi taxon id.
> * The edge attribute contain information about the ortholog(true/false)
> relationship between two nodes.
> * The rest of tags currently are not being extracted. However, parsing
> the rest of the tags should not be very difficult.
>
> Generally i get homologene xml stream from an 'efetch' through
> Bio::DB::EUtilities, feed it to the parser, gets back 'Graph' object and
> then works on it.
>
> So, to make it more generic and work on local file
>
> * We need another class that reads the chunk between
> <HG-Entry>.....</HG-Entry> and sends it to the parser.
> * Add supports for most of the tags.
> * Massage the data to a bioperl compatible object.
>
> The first two i could work it out and for the last one i have to figure
> out the bioperl object that could be suitable (like  Bio::Cluster or
> Bio::NetWork::Node/Edge).
>
> Let me know if it sounds interesting and i will send you the code.
>
> -siddhartha
>
>
> >
> > On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
> >> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
> >>
> >>> Hi,
> >>>
> >>> Does anyone know of any Homologene parser, if available?
> >>> Please let me know.
> >>>
> >>> Thanks and Regards,
> >>> Neeti.
> >> Hi Neeti,
> >>
> >> Quite a long time ago now I wrote an Homologene parser and posted it
> >> to the mailing list:
> >>
> >> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
> >>
> >> I don't know if this still works but you could use it as a starting
> >> point. There may also be something newer out there too, I don't know.
> >> If you search the mailing list archives you'll get a few messages
> >> around the topic.
> >>
> >> Cheers, Andrew.
> >>
> >>
> >> Andrew Macgregor
> >> Centre for Comparative Genomics, Murdoch University
> >> Email: amacgregor at ccg.murdoch.edu.au
> >> Tel: (08) 9360 2961
> >>
> >>
> >>
> >>
> >
> >
>
>
>
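
The chunking step Siddhartha mentions above (a reader that hands one
<HG-Entry>.....</HG-Entry> block at a time to the parser) might look roughly
like this. It is only a sketch and not his code: entry_reader is a made-up
name, the <id> element in the demo is a stand-in rather than the real
HomoloGene schema, and it assumes the opening tag carries no attributes.

```perl
use strict;
use warnings;

# Return a closure that yields one complete <HG-Entry>...</HG-Entry>
# chunk per call, and undef once the input is exhausted.
# NOTE: entry_reader is a hypothetical helper, not part of BioPerl.
sub entry_reader {
    my ($fh) = @_;
    return sub {
        local $/ = '</HG-Entry>';    # read up to each closing tag
        while ( defined( my $chunk = <$fh> ) ) {
            next unless $chunk =~ m{</HG-Entry>\z};    # trailing text, no entry
            my $start = index $chunk, '<HG-Entry>';
            next if $start < 0;                        # closing tag without opener
            return substr $chunk, $start;              # drop anything before the opener
        }
        return;
    };
}

# Demo on an in-memory file; the <id> element is illustrative only.
my $xml = "<set>\n<HG-Entry><id>1</id></HG-Entry>\n"
        . "<HG-Entry><id>2</id></HG-Entry>\n</set>\n";
open my $fh, '<', \$xml or die "cannot open in-memory file: $!";
my $next_entry = entry_reader($fh);
while ( defined( my $entry = $next_entry->() ) ) {
    print length($entry), " bytes: $entry\n";
    # this is where the chunk would be handed to the SAX parser
}
```

Because only one entry is held in memory at a time, the same loop should
cope with a large local homologene.xml as well as with an efetch stream.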


-- 
-Neeti
Even my blood says, B positive

From neetisomaiya at gmail.com  Thu Aug 16 01:56:21 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 16 Aug 2007 11:26:21 +0530
Subject: [Bioperl-l] PDB Parser
Message-ID: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>

Hi,

After a lot of searching I found this link, from which PDB files can be
downloaded:
ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/
Is there any other link where one can download all pdb data from?

I tried using Bio::Structure::IO::pdb with some code like :-
use Bio::Structure::IO;

    $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
                                   -format => 'pdb');

    while ( my $struc = $in->next_structure() ) {
       print "Structure ", $struc->id,"\n";
    }

It works well. But I am not able to find documentation of other methods
which will give me various specific details available in a pdb file, right
from title, keywords, references to structure details, atoms, coordinates
etc. There must be different methods to fetch and parse each of this data
from a pdb file, right? Where can I find the details? Any example code of
the same would also be of great use.

Thanks and Regards,
Neeti Somaiya.

-- 
-Neeti
Even my blood says, B positive

From hrh at sanger.ac.uk  Thu Aug 16 04:48:16 2007
From: hrh at sanger.ac.uk (Hans Rudolf Hotz)
Date: Thu, 16 Aug 2007 09:48:16 +0100 (BST)
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
Message-ID: <Pine.LNX.4.64.0708160942310.14241@deskpro50.dynamic.sanger.ac.uk>


On Thu, 16 Aug 2007, neeti somaiya wrote:

> Hi,
>
> After a lot of search I could find this link from where PDB files can be
> downloaded :
> ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/
> Is there any other link where one can download all pdb data from?

try: ftp://pdb.protein.osaka-u.ac.jp/v3/pub/pdb/   or
      ftp://ftp.ebi.ac.uk/pub/databases/rcsb/pdb-remediated/

It is not BioPerl, but James Tisdall's O'Reilly book "Beginning Perl for 
Bioinformatics" has a nice introduction to parsing PDB files.
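
To give a flavour of what hand parsing involves: ATOM and HETATM records
are fixed-width, so the fields come out by column position rather than by
splitting on whitespace. A minimal sketch follows; parse_atom_record and
the key names are made up here and are not taken from the book or from
BioPerl.

```perl
use strict;
use warnings;

# Parse one ATOM/HETATM record using the fixed column layout of the PDB
# format (columns below are 1-based, as in the format description).
# NOTE: parse_atom_record and the key names are hypothetical.
sub parse_atom_record {
    my ($line) = @_;
    return unless $line =~ /^(?:ATOM  |HETATM)/;
    $line = sprintf '%-54s', $line;    # pad short lines so substr stays in range
    my %atom = (
        serial   => substr( $line, 6,  5 ),    # columns  7-11
        name     => substr( $line, 12, 4 ),    # columns 13-16
        res_name => substr( $line, 17, 3 ),    # columns 18-20
        chain    => substr( $line, 21, 1 ),    # column  22
        res_seq  => substr( $line, 22, 4 ),    # columns 23-26
        x        => substr( $line, 30, 8 ),    # columns 31-38
        y        => substr( $line, 38, 8 ),    # columns 39-46
        z        => substr( $line, 46, 8 ),    # columns 47-54
    );
    s/^\s+|\s+$//g for values %atom;    # strip the padding around each field
    return \%atom;
}

# Typical use: read a PDB file named on the command line and print coordinates.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "cannot open $ARGV[0]: $!";
    while ( my $line = <$fh> ) {
        my $atom = parse_atom_record($line) or next;
        print join( "\t", @{$atom}{qw(chain res_name res_seq name x y z)} ), "\n";
    }
}
```

Splitting on whitespace looks simpler but breaks on records where adjacent
fields run together, which is why the column offsets are used.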


Regards, Hans


>
> I tried using Bio::Structure::IO::pdb with some code like :-
> use Bio::Structure::IO;
>
>    $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
>                                   -format => 'pdb');
>
>    while ( my $struc = $in->next_structure() ) {
>       print "Structure ", $struc->id,"\n";
>    }
>
> It works well. But I am not able to find documentation of other methods
> which will give me various specific details available in a pdb file, right
> from title, keywords, references to structure details, atoms, coordinates
> etc. There must be different methods to fetch and parse each of this data
> from a pdb file, right? Where can I find the details? Any example code of
> the same would also be of great use.
>
> Thanks and Regards,
> Neeti Somaiya.
>
> -- 
> -Neeti
> Even my blood says, B positive
>
>
>


-- 
The Wellcome Trust Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE.

From neetisomaiya at gmail.com  Thu Aug 16 05:30:42 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 16 Aug 2007 15:00:42 +0530
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <C762C291-D3D2-4CBC-B5EC-6B6E4935A004@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
	<C762C291-D3D2-4CBC-B5EC-6B6E4935A004@ccg.murdoch.edu.au>
Message-ID: <764978cf0708160230o4ade944er8c8529199f3a0262@mail.gmail.com>

Hi,

For now I am using the homologene parser available here :-
http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
,
for parsing the homologene.data file. But the README at the ftp site says
HOMOLOGENE.XML has much more data; I have yet to work out how to parse that one.

~Neeti.


On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
>
> On 14/08/2007, at 11:21 AM, Chris Fields wrote:
>
> > It looks like Heikki responded and thought a good place for it
> > would be Bio::SeqIO, but it didn't go anywhere I suppose.  I see
> > that a few other posts suggest it could be placed in Bio::Cluster
> > as well which I'm not familiar with.  We could add it in if you
> > were still interested, just need to find a good place for it; might
> > be nice to have a Parse::RecDescent-based parser.
> >
> > chris
> >
>
> Hi Chris,
>
> I was also doing some parsing of UniGene at the time but found
> RecDescent was too slow and went back to regexes. That code found
> it's way into Bio::Cluster. Occasionally I see a message with someone
> looking for a Homologene parser but not very often, so I'm not sure
> it is worth the effort of moving the code into bioperl.
>
> Cheers, Andrew.
>


-- 
-Neeti
Even my blood says, B positive

From bix at sendu.me.uk  Thu Aug 16 05:59:08 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 16 Aug 2007 10:59:08 +0100
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
Message-ID: <46C41FEC.2000206@sendu.me.uk>

neeti somaiya wrote:
> I tried using Bio::Structure::IO::pdb with some code like :-
> use Bio::Structure::IO;
> 
>     $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
>                                    -format => 'pdb');
> 
>     while ( my $struc = $in->next_structure() ) {
>        print "Structure ", $struc->id,"\n";
>     }
> 
> It works well. But I am not able to find documentation of other methods
> which will give me various specific details available in a pdb file, right
> from title, keywords, references to structure details, atoms, coordinates
> etc. There must be different methods to fetch and parse each of this data
> from a pdb file, right? Where can I find the details?

$struc is a Bio::Structure::Entry, so look at the docs for that:
http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html

You'll probably want to look at the docs for the other Structure modules 
as well:
http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html


I agree, the documentation in this area could be improved. 
Bio::Structure::StructureI could actually contain something, and 
Bio::Structure should actually exist or not be referenced in the docs.
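
In the meantime, here is a rough sketch of walking an entry down to its
atoms, based on the get_chains / get_residues / get_atoms methods described
in the Entry docs. atom_lines is a made-up helper name, BioPerl must be
installed for the part that reads a file, and the details should be checked
against the docs for your version.

```perl
use strict;
use warnings;

# Walk a structure entry and return one line per atom:
# chain id, residue id, atom id, then the x, y and z coordinates.
# NOTE: atom_lines is a hypothetical helper, not a BioPerl method.
sub atom_lines {
    my ($struc) = @_;
    my @lines;
    for my $chain ( $struc->get_chains ) {
        for my $res ( $struc->get_residues($chain) ) {
            for my $atom ( $struc->get_atoms($res) ) {
                push @lines,
                    join ' ', $chain->id, $res->id, $atom->id, $atom->xyz;
            }
        }
    }
    return @lines;
}

# Only load BioPerl when a PDB file is named on the command line.
if (@ARGV) {
    require Bio::Structure::IO;
    my $in = Bio::Structure::IO->new( -file => $ARGV[0], -format => 'pdb' );
    while ( my $struc = $in->next_structure ) {
        print "Structure ", $struc->id, "\n";
        print "$_\n" for atom_lines($struc);
    }
}
```

Run it as, for example, "perl walk.pl pdb100d.ent" (the script name is
arbitrary). Title, keywords and references are held as annotations on the
entry rather than on the atoms, so they need a separate look at the docs.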

From ewijaya at gmail.com  Thu Aug 16 00:18:57 2007
From: ewijaya at gmail.com (Edward Wijaya)
Date: Thu, 16 Aug 2007 12:18:57 +0800
Subject: [Bioperl-l] How to create contrasting colors in every singe track -
	Bio::Graphics
Message-ID: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>

Dear experts,

I am trying to draw a figures that shows binding sites hits for various
program (see attached) for example.

Now, I have a problem in creating contrasting colour for each of
the Programs (MEME, AlignACE, etc).  I want to avoid "graded segments",
so that I can have more contrasting color, e.g: red, blue, yellow, etc.

Can anybody suggest how can we achieve that?

My full source code can be found here: http://dpaste.com/16985/
The portion of the script is this:

__BEGIN__
    my %prog_color = (
        "Actual"   => 800000,
        "ALIGNACE" => 230000,
        "BP"       => 80000,
        "MDSCAN"   => 5000,
        "MITRA"    => 10000,
        "MTSAMP"   => 200000,
        "SPACE"    => 40000,
        "NONE"     => 0,
    );

    foreach my $seqid ( sort {$a <=> $b }keys %nlist ) {
        my $track = $panel->add_track(
            -glyph     => 'graded_segments',
            -key       => "SEQ " . $seqid,
            -connector => "dashed",
            -label     => 1,
            -fontcolor => 'red',
            -bgcolor   => 'blue',
            -bump      => +1,
            -height    => 8,
            -min_score => 0,
            -max_score => 500000
        );
# rest of the script
__END__

Regards,
Edward
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hits.png
Type: image/png
Size: 2509 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070816/31057225/attachment.png 

From pratchusha.kamireddy at aamu.edu  Wed Aug 15 23:45:22 2007
From: pratchusha.kamireddy at aamu.edu (pratchusha kamireddy)
Date: Wed, 15 Aug 2007 22:45:22 -0500 (CDT)
Subject: [Bioperl-l] Request for Activeperl software
Message-ID: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>

Hello
  I am Pratchusha Kamireddy, a master's student at Alabama A&M University, working under Dr. Kantety in the Plant and Soil Science Department. I am a beginner learning Perl programming and need the ActivePerl software to run Perl programs. Can you help me with this: where can I download the software, how do I install it, and how do I use it? I am eagerly waiting for your reply. Please help me in this regard.
   Thanking you
   Pratchusha Kamireddy

From spiros at lokku.com  Thu Aug 16 09:32:05 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Thu, 16 Aug 2007 14:32:05 +0100
Subject: [Bioperl-l] Request for Activeperl software
In-Reply-To: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
References: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
Message-ID: <bba689ec0708160632w315b00d5na3bf55d97ac03728@mail.gmail.com>

Hi,

You can download ActivePerl from ActiveState's website at

http://www.activestate.com/Products/ActivePerl/

Get a book: http://www.oreilly.com/catalog/lperl3/

Visit:

http://perl-begin.org/
http://learn.perl.org/

Usenet:

http://www.nntp.perl.org/group/perl.beginners/

Spiros

On 8/16/07, pratchusha kamireddy <pratchusha.kamireddy at aamu.edu> wrote:
> Hello
>   I am Pratchusha Kamireddy doing masters in Alabama A&M University. I am working under Dr.Kantety in Plant and Soil Science Department.I am the beginner to learn perl programming. I need Activeperl software to run the perl programs. Can you help me in this regard like: where can I dowmload this software, how can i Install this and how can i use this. I am eagerlu waiting for your reply.Please help me in this regard.
>    Thanking you
>    Pratchusha Kamireddy
>

From razi.khaja at gmail.com  Thu Aug 16 09:37:09 2007
From: razi.khaja at gmail.com (Razi Khaja)
Date: Thu, 16 Aug 2007 09:37:09 -0400
Subject: [Bioperl-l] How to create contrasting colors in every singe
	track - Bio::Graphics
In-Reply-To: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
References: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
Message-ID: <62e9dabc0708160637o36380ecbv69fe479d0a26989d@mail.gmail.com>

You would probably want to consider a "Graph-Coloring" algorithm in
order to optimally pick contrasting colors for the features being
displayed.  This might be overkill for what you're trying to accomplish
and may not be possible (depending on how many features you have in
your dataset, i.e. how big your graph is).

In any case, some resources are:
http://en.wikipedia.org/wiki/Graph_coloring
http://web.cs.ualberta.ca/~joe/Coloring/

If your problem is simpler, see the modifications to your program I've
made below:

Razi Khaja

On 8/16/07, Edward Wijaya <ewijaya at gmail.com> wrote:
> Dear experts,
>
> I am trying to draw a figures that shows binding sites hits for various
> program (see attached) for example.
>
> Now, I have a problem in creating contrasting colour for each of
> the Programs (MEME, AlignACE, etc).  I want to avoid "graded segments",
> so that I can have more contrasting color, e.g: red, blue, yellow, etc.
>
> Can anybody suggest how can we achieve that?
>
> My full source code can be found here: http://dpaste.com/16985/
> The portion of the script is this:
>
> __BEGIN__
>     my %prog_color = (
>         "Actual"   => 800000,
>         "ALIGNACE" => 230000,
>         "BP"       => 80000,
>         "MDSCAN"   => 5000,
>         "MITRA"    => 10000,
>         "MTSAMP"   => 200000,
>         "SPACE"    => 40000,
>         "NONE"     => 0,
>     );
>
       my %color = ( 'MEME' => 'red', 'ALIGNACE' => 'blue' );

>     foreach my $seqid ( sort {$a <=> $b }keys %nlist ) {
           my( @field ) = split( /\s+/, $nlist{$seqid} );
           my $prog_name = $field[3];

>         my $track = $panel->add_track(
>             -glyph     => 'graded_segments',
>             -key       => "SEQ " . $seqid,
>             -connector => "dashed",
>             -label     => 1,
>             -fontcolor => 'red',
               -bgcolor   => $color{ $prog_name },
>             -bump      => +1,
>             -height    => 8,
>             -min_score => 0,
>             -max_score => 500000
>         );
> # rest of the script
> __END__
>
> Regards,
> Edward
>
>
>

From bix at sendu.me.uk  Thu Aug 16 09:49:52 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 16 Aug 2007 14:49:52 +0100
Subject: [Bioperl-l] Request for Activeperl software
In-Reply-To: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
References: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
Message-ID: <46C45600.4040906@sendu.me.uk>

pratchusha kamireddy wrote:
> I am Pratchusha Kamireddy doing masters in Alabama A&M University. I
> am working under Dr.Kantety in Plant and Soil Science Department.I am
> the beginner to learn perl programming. I need Activeperl software to
> run the perl programs. Can you help me in this regard like: where can
> I dowmload this software, how can i Install this and how can i use
> this. I am eagerlu waiting for your reply.Please help me in this
> regard.

Firstly, Google is your friend:
http://www.google.co.uk/search?q=activeperl

The first hit is the correct one:

http://www.activestate.com/Products/activeperl/


I suppose your next question will be how to install Bioperl (if not, 
you're in the wrong place):

http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows
(which also tells you where to get ActivePerl from)

From cjfields at uiuc.edu  Thu Aug 16 10:11:22 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 09:11:22 -0500
Subject: [Bioperl-l] How to create contrasting colors in every singe
	track - Bio::Graphics
In-Reply-To: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
References: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
Message-ID: <F3E88224-4AA2-451B-97FE-5DED15015FA2@uiuc.edu>


On Aug 15, 2007, at 11:18 PM, Edward Wijaya wrote:

> Dear experts,
>
> I am trying to draw a figures that shows binding sites hits for  
> various
> program (see attached) for example.
>
> Now, I have a problem in creating contrasting colour for each of
> the Programs (MEME, AlignACE, etc).  I want to avoid "graded  
> segments",
> so that I can have more contrasting color, e.g: red, blue, yellow,  
> etc.
>
> Can anybody suggest how can we achieve that?
>
> My full source code can be found here: http://dpaste.com/16985/
> The portion of the script is this:
>
> __BEGIN__
>     my %prog_color = (
>         "Actual"   => 800000,
>         "ALIGNACE" => 230000,
>         "BP"       => 80000,
>         "MDSCAN"   => 5000,
>         "MITRA"    => 10000,
>         "MTSAMP"   => 200000,
>         "SPACE"    => 40000,
>         "NONE"     => 0,
>     );
>
>     foreach my $seqid ( sort {$a <=> $b }keys %nlist ) {
>         my $track = $panel->add_track(
>             -glyph     => 'graded_segments',
>             -key       => "SEQ " . $seqid,
>             -connector => "dashed",
>             -label     => 1,
>             -fontcolor => 'red',
>             -bgcolor   => 'blue',
>             -bump      => +1,
>             -height    => 8,
>             -min_score => 0,
>             -max_score => 500000
>         );
> # rest of the script
> __END__
>
> Regards,
> Edward

I think you have two options:

1) Split the seqfeatures into different tracks based on the source  
(AlignACE, MP, etc), then give each its own graded segment color.  I  
like this personally as it doesn't glob various results together onto  
one track and (at least to me) is easier to maintain.  It also allows  
one more flexibility in using varying scoring schemes.
2) Use a callback for bgcolor which changes the color explicitly  
based on the source/score.

The GenBank/EMBL section of the Bio::Graphics HOWTO reveals how to  
add different tracks, and there are several scattered examples on how  
to use callbacks.

http://www.bioperl.org/wiki/HOWTO:Graphics
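
A rough sketch of option 2, assuming each feature was created with its
program name as the source tag (that is a guess about how the script builds
its features). Bio::Graphics passes the feature as the first argument to
the callback. The colour table and the name bgcolor_for_feature are made up
for illustration.

```perl
use strict;
use warnings;

# Program name => colour name. Both the programs listed and the colours
# chosen are illustrative; extend the table to match your own data.
my %color_for = (
    MEME     => 'red',
    ALIGNACE => 'blue',
    MDSCAN   => 'green',
    MITRA    => 'orange',
);

# Callback for -bgcolor: receives the feature being drawn as its first
# argument and returns a colour name, falling back to gray when the
# source is missing or not in the table.
sub bgcolor_for_feature {
    my ($feature) = @_;
    my $source = uc( $feature->source_tag || '' );
    return exists $color_for{$source} ? $color_for{$source} : 'gray';
}

# Inside the existing script the callback would be passed by reference:
#
#   my $track = $panel->add_track(
#       -glyph   => 'generic',
#       -bgcolor => \&bgcolor_for_feature,
#       ...
#   );
```

Switching the glyph away from graded_segments matters here, since that
glyph derives the fill from the score rather than using a flat colour.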

chris

From cjfields at uiuc.edu  Thu Aug 16 10:12:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 09:12:30 -0500
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C41FEC.2000206@sendu.me.uk>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
Message-ID: <5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>


On Aug 16, 2007, at 4:59 AM, Sendu Bala wrote:

> neeti somaiya wrote:
>> I tried using Bio::Structure::IO::pdb with some code like :-
>> use Bio::Structure::IO;
>>
>>     $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
>>                                    -format => 'pdb');
>>
>>     while ( my $struc = $in->next_structure() ) {
>>        print "Structure ", $struc->id,"\n";
>>     }
>>
>> It works well. But I am not able to find documentation of other  
>> methods
>> which will give me various specific details available in a pdb  
>> file, right
>> from title, keywords, references to structure details, atoms,  
>> coordinates
>> etc. There must be different methods to fetch and parse each of  
>> this data
>> from a pdb file, right? Where can I find the details?
>
> $struct is a Bio::Structure::Entry, so look at the docs for that:
> http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html
>
> You'll probably want to look at the docs for the other Structure  
> modules
> as well:
> http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
>
>
> I agree, the documentation in this area could be improved.
> Bio::Structure::StructureI could actually contain something, and
> Bio::Structure should actually exist or not be referenced in the docs.

There was a discussion a while back on refactoring the code within  
Bio::Structure to better deal with HETATM and other stuff.  As far as  
I'm concerned it's open to anyone who wants to tinker with it.

chris

From cjfields at uiuc.edu  Thu Aug 16 10:37:31 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 09:37:31 -0500
Subject: [Bioperl-l] Announcement: infernal/erpin/rnamotif parsers
Message-ID: <7CE60504-FA1A-4AFF-A02E-036B8E37C3F9@uiuc.edu>

To anyone using the aforementioned parsers:

I don't plan on continuing development of the Bio::Tools-related  
Infernal, RNAMotif, and ERPIN parsers at this time unless there is  
substantial interest in doing so.  Instead, I plan on focusing my  
efforts on the Bio::SearchIO-based parsers as I feel they are much  
better at representing the data present in the output.  In my opinion  
having two sets of parsers that accomplish essentially the same task  
is redundant and non-productive.  Again, if there is considerable  
interest in keeping them I suggest responding to this message,  
otherwise I would consider them deprecated and removed completely by  
rel 1.7 (maybe sooner).

Infernal: It's very likely that a new stable version (v. 1.0) of  
Infernal will be released in the near future.  I may upgrade the  
Bio::SearchIO-based parser in the meantime to parse the latest  
Infernal output (v 0.81), but I don't plan on supporting pre-1.0  
releases once the final version is out.  Infernal has been in  
developer release for some time now and the program output has  
changed dramatically over time; however, the format is expected to  
solidify once a stable release is made, which makes supporting the  
parser much easier over time.

Questions?  Gripes?

chris


From awitney at sgul.ac.uk  Thu Aug 16 10:07:02 2007
From: awitney at sgul.ac.uk (Adam Witney)
Date: Thu, 16 Aug 2007 15:07:02 +0100
Subject: [Bioperl-l] Request for Activeperl software
In-Reply-To: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
Message-ID: <C2EA1896.17575%awitney@sgul.ac.uk>


This would be the best place to start

http://www.activeperl.org/

Or more specifically for the language:

http://www.activeperl.org/store/activeperl/download/

(Which will require you to register with them)

adam


On 16/8/07 04:45, "pratchusha kamireddy" <pratchusha.kamireddy at aamu.edu>
wrote:

> Hello
>   I am Pratchusha Kamireddy doing masters in Alabama A&M University. I am
> working under Dr.Kantety in Plant and Soil Science Department.I am the
> beginner to learn perl programming. I need Activeperl software to run the perl
> programs. Can you help me in this regard like: where can I dowmload this
> software, how can i Install this and how can i use this. I am eagerlu waiting
> for your reply.Please help me in this regard.
>    Thanking you
>    Pratchusha Kamireddy


From muratem at eng.uah.edu  Thu Aug 16 15:10:34 2007
From: muratem at eng.uah.edu (muratem at eng.uah.edu)
Date: Thu, 16 Aug 2007 14:10:34 -0500 (CDT)
Subject: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
Message-ID: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>

Hello

This might not be the correct list for this particular problem, but
hopefully someone can help. I am trying to install ...staden::read on a
Mac OS X 10.4. I tried installing cpan but it wouldn't work so I went to
the manual methods. Perl is on the system and appears to be installed
correctly for a Mac. Bioperl 1.5.2 was installed via fink and appears to
be OK also. I'm trying to install the Bio::SeqIO::staden::read module. I
downloaded the bioperl-ext-1.5.1 tarball from bioperl.org, did the usual
perl Makefile.PL and make and get:

newyork:/usr/local/bioperl-ext-1.5.1 root# make
Makefile:1148: *** multiple target patterns.  Stop.

A snippet from the Makefile...

   1148 pm_to_blib: $(TO_INST_PM)
   1149         $(NOECHO) $(PERLRUN) -MExtUtils::Install -e 'pm_to_blib({@ARGV}, '\''$(INST_LIB)/auto'\'', '\''$(PM_FILTER)'\'')'\
   1150           Bio/Ext/Align/libs/hscore.h $(INST_LIB)/Bio/Ext/Align/libs/hscore.h \
   1151           Bio/Ext/Align/libs/probability.c $(INST_LIB)/Bio/Ext/Align/libs/probability.c \
   1152           Bio/Ext/Align/libs/linesubs.h $(INST_LIB)/Bio/Ext/Align/libs/linesubs.h \
   1153           Bio/Ext/Align/test.pl $(INST_LIB)/Bio/Ext/Align/test.pl \
   1154           Bio/Ext/Align/libs/wiseoverlay.h $(INST_LIB)/Bio/Ext/Align/libs/wiseoverlay.h \
   1155           Bio/Ext/Align/libs/proteinsw.h $(INST_LIB)/Bio/Ext/Align/libs/proteinsw.h \
   1156           Bio/Ext/Align/libs/wisebase.h $(INST_LIB)/Bio/Ext/Align/libs/wisebase.h \
   1157           Bio/Ext/Align/libs/seqaligndisplay.h $(INST_LIB)/Bio/Ext/Align/libs/seqaligndisplay.h \
   1158           Bio/Ext/Align/libs/dyna.h $(INST_LIB)/Bio/Ext/Align/libs/dyna.h \

The README says you don't have to build the whole package, so I descended
to the staden directory and ran make, which reported no problems. But when
I ran make test I got:

newyork:/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden root# make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
"test_harness(0, '../blib/lib', '../blib/arch')" test.pl
test....Had problems bootstrapping Inline module 'Bio::SeqIO::staden::read'

Can't load
'/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/SeqIO/staden/read/read.bundle'
for module Bio::SeqIO::staden::read:
dlopen(/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/SeqIO/staden/read/read.bundle,
2): Symbol not found: _curl_easy_init
  Referenced from:
/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/SeqIO/staden/read/read.bundle
  Expected in: dynamic lookup
 at /Library/Perl/5.8.6/Inline.pm line 500


 at test.pl line 0
INIT failed--call queue aborted, <DATA> line 1.
test....dubious
        Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 1-94
        Failed 94/94 tests, 0.00% okay
Failed Test Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
test.pl      255 65280    94  188 200.00%  1-94
Failed 1/1 test scripts, 0.00% okay. 94/94 subtests failed, 0.00% okay.
make: *** [test_dynamic] Error 2

The missing symbol is apparently from libcurl. I have both libcurl.2.dylib
and libcurl.3.dylib with copies in multiple locations including /usr/lib,
/usr/local/lib and the usual Mac directories. I used the Mac otool to look
at the externals in read.bundle and it references libz.1.dylib and
libSystem.B.dylib. Could this be a case where there should have been a
link to libcurl and wasn't?

I've searched the list and see only the Inline versioning problem (which I
had and fixed). Has anybody seen this problem before or built the module
on a Mac? How did you do it? Is this a question for the Staden list on
sourceforge?

Thanks

Mike


From cjfields at uiuc.edu  Thu Aug 16 15:55:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 14:55:05 -0500
Subject: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
In-Reply-To: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>
References: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>
Message-ID: <9BBC30AD-9AFE-4D52-88E4-656D9EB8924E@uiuc.edu>


On Aug 16, 2007, at 2:10 PM, muratem at eng.uah.edu wrote:

> Hello
>
> This might not be the correct list for this particular problem, but
> hopefully someone can help. I am trying to install ...staden::read  
> on a
> Mac OS X 10.4. I tried installing cpan but it wouldn't work so I  
> went to
> the manual methods. Perl is on the system and appears to be installed
> correctly for a Mac. Bioperl 1.5.2 was installed via fink and  
> appears to
> be OK also. I'm trying to install the Bio::SeqIO::staden::read  
> module. I
> downloaded the bioperl-ext-1.5.1 tarball from bioperl.org, did the  
> usual
> perl Makefile.PL and make and get:
>
> newyork:/usr/local/bioperl-ext-1.5.1 root# make
> Makefile:1148: *** multiple target patterns.  Stop.
>
> A snippet from the Makefile...
>
>    1148 pm_to_blib: $(TO_INST_PM)
>    1149         $(NOECHO) $(PERLRUN) -MExtUtils::Install -e
> 'pm_to_blib({@ARGV}, '\''$(INST_LIB)/auto'\'', '\''$(PM_FILTER)'\'')'\
>    1150           Bio/Ext/Align/libs/hscore.h
> $(INST_LIB)/Bio/Ext/Align/libs/hscore.h \
>    1151           Bio/Ext/Align/libs/probability.c
> $(INST_LIB)/Bio/Ext/Align/libs/probability.c \
>    1152           Bio/Ext/Align/libs/linesubs.h
> $(INST_LIB)/Bio/Ext/Align/libs/linesubs.h \
>    1153           Bio/Ext/Align/test.pl $(INST_LIB)/Bio/Ext/Align/ 
> test.pl \
>    1154           Bio/Ext/Align/libs/wiseoverlay.h
> $(INST_LIB)/Bio/Ext/Align/libs/wiseoverlay.h \
>    1155           Bio/Ext/Align/libs/proteinsw.h
> $(INST_LIB)/Bio/Ext/Align/libs/proteinsw.h \
>    1156           Bio/Ext/Align/libs/wisebase.h
> $(INST_LIB)/Bio/Ext/Align/libs/wisebase.h \
>    1157           Bio/Ext/Align/libs/seqaligndisplay.h
> $(INST_LIB)/Bio/Ext/Align/libs/seqaligndisplay.h \
>    1158           Bio/Ext/Align/libs/dyna.h
> $(INST_LIB)/Bio/Ext/Align/libs/dyna.h \
>
> The README says you don't have to build the whole package, so I  
> descended
> to the staden directory and did a Make and didn't get any problems
> reported. But when I did a make test I get:
>
> newyork:/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden root# make test
> PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
> "test_harness(0, '../blib/lib', '../blib/arch')" test.pl
> test....Had problems bootstrapping Inline module  
> 'Bio::SeqIO::staden::read'
>
> Can't load
> '/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/ 
> Bio/SeqIO/staden/read/read.bundle'
> for module Bio::SeqIO::staden::read:
> dlopen(/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/ 
> auto/Bio/SeqIO/staden/read/read.bundle,
> 2): Symbol not found: _curl_easy_init
>   Referenced from:
> /usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/ 
> SeqIO/staden/read/read.bundle
>   Expected in: dynamic lookup
>  at /Library/Perl/5.8.6/Inline.pm line 500
>
>
>  at test.pl line 0
> INIT failed--call queue aborted, <DATA> line 1.
> test....dubious
>         Test returned status 255 (wstat 65280, 0xff00)
> DIED. FAILED tests 1-94
>         Failed 94/94 tests, 0.00% okay
> Failed Test Stat Wstat Total Fail  Failed  List of Failed
> ---------------------------------------------------------------------- 
> ---------
> test.pl      255 65280    94  188 200.00%  1-94
> Failed 1/1 test scripts, 0.00% okay. 94/94 subtests failed, 0.00%  
> okay.
> make: *** [test_dynamic] Error 2
>
> The missing symbol is apparently from libcurl. I have both libcurl. 
> 2.dylib
> and libcurl.3.dylib with copies in multiple locations including / 
> usr/lib,
> /usr/local/lib and the usual Mac directories. I used the Mac otool  
> to look
> at the externals in read.bundle and it references libz.1.dylib and
> libSystem.B.dylib. Could this be a case where there should have been a
> link to libcurl and wasn't?
>
> I've searched the list and see only the Inline versioning problem  
> (which I
> had and fixed). Has anybody seen this problem before or built the  
> module
> on a Mac? How did you do it? Is this a question for the Staden list on
> sourceforge?
>
> Thanks
>
> Mike

Haven't seen the problem you list.  I have installed it on Mac OS X  
(intel) w/o problems so I know it works; at least all tests passed  
though I remember Inline complaining for some reason.

You should try using bioperl-ext from CVS (it is really 1.5.1 but  
with updated docs and maybe a change or two).  The process is a  
little tricky but is documented in the README in the package.  You'll  
need the old io_lib (1.8.12 or earlier) from Staden if memory serves.

chris

From zhaodj at ioz.ac.cn  Thu Aug 16 22:13:16 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Fri, 17 Aug 2007 10:13:16 +0800 (CST)
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
Message-ID: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>

Dear list members,

I have a question about the methods of bioperl objects: how and where
can we find the complete list of methods of a bioperl object?

Take Bio::Tools::Run::RemoteBlast as an example. The synopsis of this
module gives some sample code. The following five statements are
excerpted from it.
(1) my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
(2) while ( my @rids = $factory->each_rid ) {
(3) $factory->remove_rid($rid);
(4) my $rc = $factory->retrieve_blast($rid);
(5) my $r = $factory->submit_blast($input);

These five statements use five methods of the RemoteBlast object, i.e.
(1) new, (2) each_rid, (3) remove_rid, (4) retrieve_blast, and
(5) submit_blast. However, I find that only some of them (4 and 5) are
listed in the appendix, while the others (1, 2 and 3) are absent. Are
there more methods that are not explicitly documented? I don't know.
This leads to only a partial understanding and use of the module, so I
am asking here how to get the full list of methods of a bioperl object.

Thanks!
-- 
De-Jian Zhao
Institute of Zoology,Chinese Academy of Sciences
+86-10-64807217
zhaodj at ioz.ac.cn


From neetisomaiya at gmail.com  Fri Aug 17 02:23:08 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 17 Aug 2007 11:53:08 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
Message-ID: <764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>

Hi,

My main concern is just the pdb id and title. PDB id I am able to fetch
easily, but is there a method which can give me the title of the PDB
structure?

Like for example from the following :-

HEADER    DNA/RNA                                 05-DEC-94   100D
TITLE     CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
TITLE    2 R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO
TITLE    3 PHOSPHATE ONLY AND MINOR GROOVE TERTIARY BASE-PAIRING
COMPND    MOL_ID: 1;
COMPND   2 MOLECULE: DNA/RNA (5'-R(*CP*)-D(*CP*GP*GP*CP*GP*CP*CP*GP*)-
COMPND   3 R(*G)-3');
COMPND   4 CHAIN: A, B;
.
.
.
.

I just want "CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO PHOSPHATE ONLY AND
MINOR GROOVE TERTIARY BASE-PAIRING".

Thanks,
Neeti.

On 8/16/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
>
> On Aug 16, 2007, at 4:59 AM, Sendu Bala wrote:
>
> > neeti somaiya wrote:
> >> I tried using Bio::Structure::IO::pdb with some code like :-
> >> use Bio::Structure::IO;
> >>
> >>     $in  = Bio::Structure::IO->new(-file => " pdb100d.ent",
> >>                                    -format => 'pdb');
> >>
> >>     while ( my $struc = $in->next_structure() ) {
> >>        print "Structure ", $struc->id,"\n";
> >>     }
> >>
> >> It works well. But I am not able to find documentation of other
> >> methods
> >> which will give me various specific details available in a pdb
> >> file, right
> >> from title, keywords, references to structure details, atoms,
> >> coordinates
> >> etc. There must be different methods to fetch and parse each of
> >> this data
> >> from a pdb file, right? Where can I find the details?
> >
> > $struc is a Bio::Structure::Entry, so look at the docs for that:
> > http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html
> >
> > You'll probably want to look at the docs for the other Structure
> > modules
> > as well:
> > http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
> >
> >
> > I agree, the documentation in this area could be improved.
> > Bio::Structure::StructureI could actually contain something, and
> > Bio::Structure should actually exist or not be referenced in the docs.
>
> There was a discussion a while back on refactoring the code within
> Bio::Structure to better deal with HETATM and other stuff.  As far as
> I'm concerned it's open for anyone who wants to tinker with it.
>
> chris
>


-- 
-Neeti
Even my blood says, B positive

From alexl at users.sourceforge.net  Fri Aug 17 03:22:16 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Fri, 17 Aug 2007 00:22:16 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
Message-ID: <cg3ayi39sn.fsf@allele2.localdomain>

Hi all,

I'd like to clarify the license of bioperl.  Currently the LICENSE
only includes the text of the Artistic license.  But the wiki
http://www.bioperl.org/wiki/FAQ#What_are_the_license_terms_for_BioPerl.3F
says:

 BioPerl is licensed under the same terms as Perl itself which is the
 Perl Artistic License (see
 http://www.perl.com/pub/a/language/misc/Artistic.html or
 http://www.opensource.org/licenses/artistic-license.html

and most of the modules in the source say:

 "You may distribute this module under the same terms as perl itself"

But the current distribution of Perl is actually dually-licensed under
the GPL or Artistic licenses (so the wiki is technically out of sync
with the "same terms as Perl itself"), see:

 http://dev.perl.org/licenses/

I assume that the intent of the bioperl authors is to license with the
same terms as Perl's *current* license (which would mean bioperl is
really effectively dually-licensed under the GPL or Artistic license).
If so, it would be good if the LICENSE text and the wiki were updated
to reflect this.

Also some of the source modules say "under the same terms as perl
itself", but then only mention the Artistic license.

This has important ramifications for distribution: I maintain the
Fedora package for bioperl and I have currently listed the license of
bioperl as "GPL or Artistic".  But if bioperl were distributed under
the Artistic license only then I would have to pull the package from
the distribution, because the Artistic 1.0 (original)-only license is
deprecated (but "GPL or Artistic" is OK):

http://fedoraproject.org/wiki/Licensing#head-d8cc605dd386091c8b6be97b8a43fb6a5d624ae1

Thanks!

Alex


From alexl at users.sourceforge.net  Fri Aug 17 03:42:07 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Fri, 17 Aug 2007 00:42:07 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <cg3ayi39sn.fsf@allele2.localdomain> (Alex Lancaster's message of
	"Fri\, 17 Aug 2007 00\:22\:16 -0700")
References: <cg3ayi39sn.fsf@allele2.localdomain>
Message-ID: <nrsl6i1ub4.fsf@allele2.localdomain>

>>>>> "AL" == Alex Lancaster  writes:

[...]

AL> I assume that the intent of the bioperl authors is to license with
AL> the same terms as Perl's *current* license (which would mean
AL> bioperl is really effectively dually-licensed under the GPL or
AL> Artistic license).  If so, it would be good if the LICENSE text
AL> and the wiki were updated to reflect this.

Also note that since Perl's license is a dual-license "GPL or
Artistic" then people aren't required to submit their modifications
back to the bioperl distribution because they can choose to follow the
Artistic (rather than the GPL) license which doesn't require
modifications to be submitted back.  This means the point:

 "If you fix bugs, please let us know about them. This is not the GPL
 license so you are not required to submit the code fixes, but in the
 spirit of making a better product we hope you'll contribute back to
 the community any insight or code improvements."

listed here:

 http://www.bioperl.org/wiki/Licensing_BioPerl

would still stand, because you can choose the Artistic license, but
you could modify the clause to say:

 "If you fix bugs, please let us know about them. Because Bioperl is
 dual-licensed under the GPL or Artistic licenses, you can choose the
 Artistic license, which means that you are not required to submit the
 code fixes, but in the spirit of making a better product we hope
 you'll contribute back to the community any insight or code
 improvements."


From n.haigh at sheffield.ac.uk  Fri Aug 17 06:27:43 2007
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 17 Aug 2007 11:27:43 +0100
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
In-Reply-To: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
References: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
Message-ID: <46C5781F.60301@sheffield.ac.uk>

De-Jian,ZHAO wrote:
> Dear list members,
>
> I have a question about the methods of bioperl objects: how and where
> can we find the complete list of methods of a bioperl object?
>
> Take Bio::Tools::Run::RemoteBlast as an example. The synopsis of this
> module gives some sample code. The following five statements are
> excerpted from it.
> (1) my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> (2) while ( my @rids = $factory->each_rid ) {
> (3) $factory->remove_rid($rid);
> (4) my $rc = $factory->retrieve_blast($rid);
> (5) my $r = $factory->submit_blast($input);
>
> These five statements use five methods of the RemoteBlast object, i.e.
> (1) new, (2) each_rid, (3) remove_rid, (4) retrieve_blast, and
> (5) submit_blast. However, I find that only some of them (4 and 5) are
> listed in the appendix, while the others (1, 2 and 3) are absent. Are
> there more methods that are not explicitly documented? I don't know.
> This leads to only a partial understanding and use of the module, so I
> am asking here how to get the full list of methods of a bioperl object.
>
> Thanks!
>   


You should check out the Deobfuscator at:
http://bioperl.org/cgi-bin/deob_interface.cgi

Search and choose the object of choice. e.g. Bio::Tools::Run::RemoteBlast

You will be provided a list of methods available to that object,
including all the methods up the inheritance hierarchy. Unfortunately,
some bioperl modules are documented more thoroughly than others.
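
For a local check without the web interface, the same idea (collect
every sub defined in a class and in each of its parent classes) can be
sketched in plain Perl. This is only an illustration, not how the
Deobfuscator itself is implemented: list_methods, My::Base and My::Blast
are hypothetical names made up for the example, and it assumes Perl 5.10
or later for the core mro module. It will also list helper functions a
module has imported, and it cannot see methods that only exist through
AUTOLOAD.

```perl
use strict;
use warnings;
use mro;    # core since Perl 5.10; provides mro::get_linear_isa

# Map each method name callable on $thing (an object or a class name)
# to the package that defines it, following the method resolution order.
sub list_methods {
    my ($thing) = @_;
    my $class = ref($thing) || $thing;
    my %defined_in;
    for my $pkg ( @{ mro::get_linear_isa($class) } ) {
        no strict 'refs';    # symbol-table lookups need symbolic refs
        for my $name ( keys %{"${pkg}::"} ) {
            next if $name =~ /::$/;                      # nested package
            next unless defined &{"${pkg}::${name}"};    # not a sub
            $defined_in{$name} //= $pkg;    # first hit wins, as in dispatch
        }
    }
    return \%defined_in;
}

# Hypothetical toy classes standing in for a real BioPerl hierarchy.
{ package My::Base;  sub new { return bless {}, shift } sub each_rid { return } }
{ package My::Blast; our @ISA = ('My::Base'); sub submit_blast { return } }

my $methods = list_methods( My::Blast->new );
printf "%-14s %s\n", $_, $methods->{$_} for sort keys %$methods;
```

With BioPerl installed, the same function should work on a real class
after loading it, e.g. require Bio::Tools::Run::RemoteBlast; followed by
list_methods('Bio::Tools::Run::RemoteBlast').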

Nath

From neetisomaiya at gmail.com  Fri Aug 17 06:42:09 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 17 Aug 2007 16:12:09 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
	<764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
Message-ID: <764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>

Hi,

I have done it currently as follows :

while ( my $struc = $in->next_structure() )
{
    my $title;

    my $pdb_id = $struc->id;
    print "Structure ", $pdb_id,"\n";

    my $ac = $struc->annotation();

    foreach my $key ( $ac->get_all_annotation_keys() )
    {
        if($key eq "title")
        {
            my @values = $ac->get_Annotations($key);
            foreach my $value (@values)
            {
                $title = $value->as_text;
                chomp($title);
                if($title =~ /Value\: (.*)/)
                {
                    $title = $1;
                }
                $title =~ s/\s+/ /g;

                print "Title ",$title,"\n";
                last;
            }
            last;
        }
    }
}

Is this ok?

On 8/17/07, neeti somaiya <neetisomaiya at gmail.com> wrote:
>
> Hi,
>
> My main concern is just the pdb id and title. PDB id I am able to fetch
> easily, but is there a method which can give me the title of the PDB
> structure?
>
> Like for example from the following :-
>
> HEADER    DNA/RNA                                 05-DEC-94   100D
> TITLE     CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
> TITLE    2 R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO
> TITLE    3 PHOSPHATE ONLY AND MINOR GROOVE TERTIARY BASE-PAIRING
> COMPND    MOL_ID: 1;
> COMPND   2 MOLECULE: DNA/RNA (5'-R(*CP*)-D(*CP*GP*GP*CP*GP*CP*CP*GP*)-
> COMPND   3 R(*G)-3');
> COMPND   4 CHAIN: A, B;
> .
> .
> .
> .
>
> I just want "CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
> R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO PHOSPHATE ONLY AND
> MINOR GROOVE TERTIARY BASE-PAIRING".
>
> Thanks,
> Neeti.
>
> On 8/16/07, Chris Fields <cjfields at uiuc.edu> wrote:
> >
> >
> > On Aug 16, 2007, at 4:59 AM, Sendu Bala wrote:
> >
> > > neeti somaiya wrote:
> > >> I tried using Bio::Structure::IO::pdb with some code like :-
> > >> use Bio::Structure::IO;
> > >>
> > >>     $in  = Bio::Structure::IO->new(-file => " pdb100d.ent",
> > >>                                    -format => 'pdb');
> > >>
> > >>     while ( my $struc = $in->next_structure() ) {
> > >>        print "Structure ", $struc->id,"\n";
> > >>     }
> > >>
> > >> It works well. But I am not able to find documentation of other
> > >> methods
> > >> which will give me various specific details available in a pdb
> > >> file, right
> > >> from title, keywords, references to structure details, atoms,
> > >> coordinates
> > >> etc. There must be different methods to fetch and parse each of
> > >> this data
> > >> from a pdb file, right? Where can I find the details?
> > >
> > > $struc is a Bio::Structure::Entry, so look at the docs for that:
> > > http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html
> > >
> > > You'll probably want to look at the docs for the other Structure
> > > modules
> > > as well:
> > > http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
> > >
> > >
> > > I agree, the documentation in this area could be improved.
> > > Bio::Structure::StructureI could actually contain something, and
> > > Bio::Structure should actually exist or not be referenced in the docs.
> >
> >
> > There was a discussion a while back on refactoring the code within
> > Bio::Structure to better deal with HETATM and other stuff.  As far as
> > I'm concerned it's open for anyone who wants to tinker with it.
> >
> > chris
> >
>
>
>
> --
> -Neeti
> Even my blood says, B positive
>


-- 
-Neeti
Even my blood says, B positive

From bix at sendu.me.uk  Fri Aug 17 09:35:01 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 17 Aug 2007 14:35:01 +0100
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	
	<46C41FEC.2000206@sendu.me.uk>	
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>	
	<764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
	<764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>
Message-ID: <46C5A405.2070005@sendu.me.uk>

neeti somaiya wrote:
> Hi,
> 
> I have done it currently as follows :
[snip]
> Is this ok?

If it works, of course. There seems to be some redundant code there, 
however. I'm guessing this would be better (assuming your code worked in 
the first place):

while (my $struc = $in->next_structure()) {
     my $pdb_id = $struc->id;
     print "Structure ", $pdb_id,"\n";

     my $ac = $struc->annotation();
     my ($title) = $ac->get_Annotations('title');
     $title = $title->as_text;
     chomp($title);
     if ($title =~ /Value\: (.*)/) {
         $title = $1;
     }
     $title =~ s/\s+/ /g;

     print "Title ",$title,"\n";
}
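
If only the ID and title are needed, another option is to skip the
annotation objects and read the TITLE records straight from the file
text. The following is a minimal sketch of that different technique,
not part of Bio::Structure: pdb_title is a hypothetical helper name, and
it assumes the standard PDB layout in which the title text starts in
column 11 and continuation lines carry a number in columns 9-10. The
sample lines are the ones quoted earlier in this thread.

```perl
use strict;
use warnings;

# Join the TITLE records of a PDB entry into a single string.
# Takes the raw lines of the file; the title text starts at column 11.
sub pdb_title {
    my (@lines) = @_;
    my @parts;
    for my $line (@lines) {
        next unless $line =~ /^TITLE /;
        my $text = length($line) > 10 ? substr( $line, 10 ) : '';
        $text =~ s/^\s+|\s+$//g;    # trim padding and the line ending
        push @parts, $text if length $text;
    }
    return join ' ', @parts;
}

my @record = (
    'HEADER    DNA/RNA                                 05-DEC-94   100D',
    'TITLE     CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER',
    'TITLE    2 R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO',
    'TITLE    3 PHOSPHATE ONLY AND MINOR GROOVE TERTIARY BASE-PAIRING',
    'COMPND    MOL_ID: 1;',
);
print "Title ", pdb_title(@record), "\n";
```

Joining the pieces with a single space works for this entry; a title
that breaks a word across two TITLE lines would need extra care.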

From muratem at eng.uah.edu  Fri Aug 17 10:03:22 2007
From: muratem at eng.uah.edu (Mike Muratet)
Date: Fri, 17 Aug 2007 09:03:22 -0500 (CDT)
Subject: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
In-Reply-To: <9BBC30AD-9AFE-4D52-88E4-656D9EB8924E@uiuc.edu>
References: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>
	<9BBC30AD-9AFE-4D52-88E4-656D9EB8924E@uiuc.edu>
Message-ID: <Pine.GSO.4.60.0708170902570.23859@eng.uah.edu>


On Thu, 16 Aug 2007, Chris Fields wrote:

> Date: Thu, 16 Aug 2007 14:55:05 -0500
> From: Chris Fields <cjfields at uiuc.edu>
> To: muratem at eng.uah.edu
> Cc: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
> 
>
> On Aug 16, 2007, at 2:10 PM, muratem at eng.uah.edu wrote:
>
>> Hello
>> 
>> This might not be the correct list for this particular problem, but
>> hopefully someone can help. I am trying to install ...staden::read on a
>> Mac OS X 10.4. I tried installing cpan but it wouldn't work so I went to
>> the manual methods. Perl is on the system and appears to be installed
>> correctly for a Mac. Bioperl 1.5.2 was installed via fink and appears to
>> be OK also. I'm trying to install the Bio::SeqIO::staden::read module. I
>> downloaded the bioperl-ext-1.5.1 tarball from bioperl.org, did the usual
>> perl Makefile.PL and make and get:
>> 
>> newyork:/usr/local/bioperl-ext-1.5.1 root# make
>> Makefile:1148: *** multiple target patterns.  Stop.
>> 
>> A snippet from the Makefile...
>> 
>>    1148 pm_to_blib: $(TO_INST_PM)
>>    1149         $(NOECHO) $(PERLRUN) -MExtUtils::Install -e
>> 'pm_to_blib({@ARGV}, '\''$(INST_LIB)/auto'\'', '\''$(PM_FILTER)'\'')'\
>>    1150           Bio/Ext/Align/libs/hscore.h
>> $(INST_LIB)/Bio/Ext/Align/libs/hscore.h \
>>    1151           Bio/Ext/Align/libs/probability.c
>> $(INST_LIB)/Bio/Ext/Align/libs/probability.c \
>>    1152           Bio/Ext/Align/libs/linesubs.h
>> $(INST_LIB)/Bio/Ext/Align/libs/linesubs.h \
>>    1153           Bio/Ext/Align/test.pl $(INST_LIB)/Bio/Ext/Align/test.pl 
>> \
>>    1154           Bio/Ext/Align/libs/wiseoverlay.h
>> $(INST_LIB)/Bio/Ext/Align/libs/wiseoverlay.h \
>>    1155           Bio/Ext/Align/libs/proteinsw.h
>> $(INST_LIB)/Bio/Ext/Align/libs/proteinsw.h \
>>    1156           Bio/Ext/Align/libs/wisebase.h
>> $(INST_LIB)/Bio/Ext/Align/libs/wisebase.h \
>>    1157           Bio/Ext/Align/libs/seqaligndisplay.h
>> $(INST_LIB)/Bio/Ext/Align/libs/seqaligndisplay.h \
>>    1158           Bio/Ext/Align/libs/dyna.h
>> $(INST_LIB)/Bio/Ext/Align/libs/dyna.h \
>> 
>> The README says you don't have to build the whole package, so I descended
>> to the staden directory and did a Make and didn't get any problems
>> reported. But when I did a make test I get:
>> 
>> newyork:/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden root# make test
>> PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
>> "test_harness(0, '../blib/lib', '../blib/arch')" test.pl
>> test....Had problems bootstrapping Inline module 
>> 'Bio::SeqIO::staden::read'
>> 
>> Can't load
>> '/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/ 
>> Bio/SeqIO/staden/read/read.bundle'
>> for module Bio::SeqIO::staden::read:
>> dlopen(/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/ 
>> auto/Bio/SeqIO/staden/read/read.bundle,
>> 2): Symbol not found: _curl_easy_init
>>   Referenced from:
>> /usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/ 
>> SeqIO/staden/read/read.bundle
>>   Expected in: dynamic lookup
>>  at /Library/Perl/5.8.6/Inline.pm line 500
>> 
>> 
>>  at test.pl line 0
>> INIT failed--call queue aborted, <DATA> line 1.
>> test....dubious
>>         Test returned status 255 (wstat 65280, 0xff00)
>> DIED. FAILED tests 1-94
>>         Failed 94/94 tests, 0.00% okay
>> Failed Test Stat Wstat Total Fail  Failed  List of Failed
>> ---------------------------------------------------------------------- 
>> ---------
>> test.pl      255 65280    94  188 200.00%  1-94
>> Failed 1/1 test scripts, 0.00% okay. 94/94 subtests failed, 0.00% okay.
>> make: *** [test_dynamic] Error 2
>> 
>> The missing symbol is apparently from libcurl. I have both libcurl.2.dylib
>> and libcurl.3.dylib with copies in multiple locations including /usr/lib,
>> /usr/local/lib and the usual Mac directories. I used the Mac otool to look
>> at the externals in read.bundle and it references libz.1.dylib and
>> libSystem.B.dylib. Could this be a case where there should have been a
>> link to libcurl and wasn't?
>> 
>> I've searched the list and see only the Inline versioning problem (which I
>> had and fixed). Has anybody seen this problem before or built the module
>> on a Mac? How did you do it? Is this a question for the Staden list on
>> sourceforge?
>> 
>> Thanks
>> 
>> Mike
>
> Haven't seen the problem you list.  I have installed it on Mac OS X (intel) 
> w/o problems so I know it works; at least all tests passed though I remember 
> Inline complaining for some reason.
>
> You should try using bioperl-ext from CVS (it is really 1.5.1 but with 
> updated docs and maybe a change or two).  The process is a little tricky but 
> is documented in the README in the package.  You'll need the old io_lib 
> (1.8.12 or earlier) from Staden if memory serves.
>
> chris
>

Thanks, I'll give that a try.

Mike

From alexl at users.sourceforge.net  Fri Aug 17 11:23:33 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Fri, 17 Aug 2007 08:23:33 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	(Kevin Brown's message of "Fri\, 17 Aug 2007 08\:11\:40 -0700")
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
Message-ID: <n9ir7e18y2.fsf@allele2.localdomain>

>>>>> "KB" == Kevin Brown  writes:

[...]

>> Also note that since Perl's license is a dual-license "GPL or
>> Artistic" then people aren't required to submit their modifications
>> back to the bioperl distribution because they can choose to follow
>> the Artistic (rather than the GPL) license which doesn't require
>> modifications to be submitted back.  This means the point:

KB> You aren't required to submit patches even under the GPL.  If I
KB> make changes and don't distribute them then I have no requirement
KB> to reveal my changes to the bioperl source code.  Also the GPL
KB> does not require that the code be made freely available to all,
KB> just that users of GPL'd software can request the source from the
KB> vendor/distributor and should not find lots of little hoops to
KB> jump through to get it.  You can even charge to get access if that
KB> charge is to cover the cost of the expense to get it (such as the
KB> cost of a cd + mail delivery charge).

Sure, I was just pointing out that you can avoid even these things if
you choose the Artistic license.  I have no problem with the GPL, but
some people do.  The other possibility (if the current Perl "GPL or
Artistic" is not a possibility) is simply upgrading to the "Artistic
2.0" license adopted by the Perl Foundation for Perl 6 and later (I
think?):

http://www.perlfoundation.org/artistic_license_2_0

it's a GPL-compatible free software license.

Alex

From Kevin.M.Brown at asu.edu  Fri Aug 17 11:11:40 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 17 Aug 2007 08:11:40 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <nrsl6i1ub4.fsf@allele2.localdomain>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
Message-ID: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>

> AL> I assume that the intent of the bioperl authors is to license with
> AL> the same terms as Perl's *current* license (which would mean bioperl
> AL> is really effectively dually-licensed under the GPL or Artistic
> AL> license).  If so, it would be good if the LICENSE text and the wiki
> AL> were updated to reflect this.
>
> Also note that since Perl's license is a dual-license "GPL or
> Artistic" then people aren't required to submit their
> modifications back to the bioperl distribution because they
> can choose to follow the Artistic (rather than the GPL)
> license which doesn't require modifications to be submitted
> back.  This means the point:

You aren't required to submit patches even under the GPL.  If I make
changes and don't distribute them then I have no requirement to reveal
my changes to the bioperl source code.  Also the GPL does not require
that the code be made freely available to all, just that users of GPL'd
software can request the source from the vendor/distributor and should
not find lots of little hoops to jump through to get it.  You can even
charge to get access if that charge is to cover the cost of the expense
to get it (such as the cost of a cd + mail delivery charge).


From cjfields at uiuc.edu  Fri Aug 17 12:07:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 17 Aug 2007 11:07:47 -0500
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <n9ir7e18y2.fsf@allele2.localdomain>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
Message-ID: <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>


On Aug 17, 2007, at 10:23 AM, Alex Lancaster wrote:

>>>>>> "KB" == Kevin Brown  writes:
>
> [...]
>
>>> Also note that since Perl's license is a dual-license "GPL or
>>> Artistic" then people aren't required to submit their modifications
>>> back to the bioperl distribution because they can choose to follow
>>> the Artistic (rather than the GPL) license which doesn't require
>>> modifications to be submitted back.  This means the point:
>
> KB> You aren't required to submit patches even under the GPL.  If I
> KB> make changes and don't distribute them then I have no requirement
> KB> to reveal my changes to the bioperl source code.  Also the GPL
> KB> does not require that the code be made freely available to all,
> KB> just that users of GPL'd software can request the source from the
> KB> vendor/distributor and should not find lots of little hoops to
> KB> jump through to get it.  You can even charge to get access if that
> KB> charge is to cover the cost of the expense to get it (such as the
> KB> cost of a cd + mail delivery charge).
>
> Sure, I was just pointing out that you can avoid even these things if
> you choose the Artistic license.  I have no problem with the GPL, but
> some people do.  The other possibility (if the current Perl "GPL or
> Artistic" is not a possibility) is simply upgrading to the "Artistic
> 2.0" license adopted by the Perl Foundation for Perl 6 and later (I
> think?):
>
> http://www.perlfoundation.org/artistic_license_2_0
>
> it's a GPL-compatible free software license.
>
> Alex

Switching to Artistic 2.0 is probably the best way to go.  We'll need  
a more involved discussion but I don't think there'll be too many  
objections.  You mention GPL-compatibility; is that for v2 and v3?

chris


From gonzaled at tcd.ie  Fri Aug 17 13:03:35 2007
From: gonzaled at tcd.ie (David Gonzalez)
Date: Fri, 17 Aug 2007 18:03:35 +0100
Subject: [Bioperl-l] Bio::SeqIO::swiss species parsing bug?
Message-ID: <46C5D4E7.6000605@tcd.ie>

	Hi,

	I had a problem with a swissprot file in which the genus and species
were being left undefined, and I believe it could be a bug in the
swiss.pm module.


	When I tried to parse the file with Bio::SeqIO, I got the following
error messages:

Use of uninitialized value in pattern match (m//) at
/sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 965, <GEN0> line 12.
Use of uninitialized value in string eq at
/sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 967, <GEN0> line 12.

	The fields I wanted from the file (gene_id, etc.) were fine, however,
so the file was being parsed.

	I checked the output with Data::Dumper and I found the following in the
species entry; the species is left undefined, and the common name is absent.

	'species' => bless( {
	    '_ncbi_taxid' => 'Not',
	    '_classification' => [
	        undef,
	        undef,
	        'Aedes',
	        'Culicini',
	        'Culicinae',
	        'Culicidae',
	        'Culicoidea',
	        'Nematocera',
	        'Diptera',
	        'Endopterygota',
	        'Neoptera',
	        'Pterygota',
	        'Insecta',
	        'Hexapoda',
	        'Arthropoda',
	        'Metazoa',
	        'Eukaryota'
	    ]
	}, 'Bio::Species' ),

	The species line in the file is formatted according to the swissprot
specifications and includes a common name

OS   Aedes aegypti (yellow fever mosquito)
OC   Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; Neoptera;
OC   Endopterygota; Diptera; Nematocera; Culicoidea; Culicidae; Culicinae;
OC   Culicini; Aedes.
OX   NCBI_TaxID=Not defined;

	I think the problem is on line 905 of the swiss.pm file:

902	if(/^OS\s+(\S.+)/ && (! defined($binomial))) {
903	    $osline .= " " if $osline;
904	    $osline .= $1;
905	    if($osline =~ s/(,|, and|\.)$//) {
906	        ($binomial, $descr) = $osline =~ /(\S[^\(]+)(.*)/;
907	        ($ns_name) = $binomial;
908	        $ns_name =~ s/\s+$//; #####


	The problem seems to be that the OS line has no trailing punctuation, so
the substitution on line 905 returns false and the block is skipped. I think
the swissprot format does not require the line to end in '.', although it
normally does. After I removed the requirement for the substitution to
succeed, the output of Data::Dumper looked normal:

	....
	'_common_name' => 'yellow fever mosquito',
        '_ncbi_taxid' => 'Not',
        '_classification' => [
                              'aegypti',
                              'Aedes',
                              'Culicini',
	....
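
The behaviour described above can be reproduced in plain Perl without
BioPerl. This is only a sketch: parse_os and its $require_punct flag are
hypothetical names made up for illustration, and only the two regular
expressions are taken from the swiss.pm excerpt.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring lines 902-908 of the quoted excerpt.
# With $require_punct true, the binomial is only extracted when the
# OS line ends in ',', ', and' or '.', as in the quoted code.
sub parse_os {
    my ($line, $require_punct) = @_;
    my ($binomial, $descr);
    if ($line =~ /^OS\s+(\S.+)/) {
        my $osline   = $1;
        my $stripped = ($osline =~ s/(,|, and|\.)$//);
        if ($stripped || !$require_punct) {
            ($binomial, $descr) = $osline =~ /(\S[^\(]+)(.*)/;
            $binomial =~ s/\s+$// if defined $binomial;
        }
    }
    return ($binomial, $descr);
}

my $os = 'OS   Aedes aegypti (yellow fever mosquito)';
my ($strict)  = parse_os($os, 1);   # punctuation required
my ($relaxed) = parse_os($os, 0);   # punctuation optional
print 'strict:  ', (defined $strict ? $strict : 'undef'), "\n";
print "relaxed: $relaxed\n";
```

With the punctuation requirement the binomial stays undefined for this OS
line; without it the binomial is 'Aedes aegypti'.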

	I am using the fink installed bioperl:
	bioperl-pm586   1.4-5   Perl module for biology

	I don't know if this has  been reported/solved in the newer versions of
bioperl.

	David

-- 
David Gonzalez Knowles
Smurfit Institute of Genetics
Trinity College
Dublin

From cjfields at uiuc.edu  Fri Aug 17 13:20:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 17 Aug 2007 12:20:21 -0500
Subject: [Bioperl-l] Bio::SeqIO::swiss species parsing bug?
In-Reply-To: <46C5D4E7.6000605@tcd.ie>
References: <46C5D4E7.6000605@tcd.ie>
Message-ID: <04912FDE-2AA4-414C-9CE4-A0BA5E9C89C9@uiuc.edu>


On Aug 17, 2007, at 12:03 PM, David Gonzalez wrote:

> 	Hi,
>
> 	I had a problem with a swissprot file in which the genus and species
> were being left undefined, and I believe it could be a bug in the
> swiss.pm module.
>
>
> 	When I tried to parse the file with Bio::SeqIO, I got the following
> error messages:
>
> Use of uninitialized value in pattern match (m//) at
> /sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 965, <GEN0> line 12.
> Use of uninitialized value in string eq at
> /sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 967, <GEN0> line 12.
> ...
> 	I am using the fink installed bioperl:
> 	bioperl-pm586   1.4-5   Perl module for biology
>
> 	I don't know if this has  been reported/solved in the newer  
> versions of
> bioperl.
>
> 	David
>
> -- 
> David Gonzalez Knowles
> Smurfit Institute of Genetics
> Trinity College
> Dublin

That looks like bioperl 1.4, which is several years old.  You should  
update to the latest official release (1.5.2), then see if the  
problem persists.

chris

From alexl at users.sourceforge.net  Sat Aug 18 07:33:34 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Sat, 18 Aug 2007 04:33:34 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu> (Chris Fields's
	message of "Fri\, 17 Aug 2007 11\:07\:47 -0500")
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
	<3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>
Message-ID: <8td4xlyt4h.fsf@allele2.localdomain>

>>>>> "CF" == Chris Fields  writes:

[...]

>> Sure, I was just pointing out that you can avoid even these things
>> if you choose the Artistic license.  I have no problem with the
>> GPL, but some people do.  The other possibility (if the current
>> Perl "GPL or Artistic" is not a possibility) is simply upgrading to
>> the "Artistic 2.0" license adopted by the Perl Foundation for Perl
>> 6 and later (I think?):

>> http://www.perlfoundation.org/artistic_license_2_0

>> it's a GPL-compatible free software license.

CF> Switching to Artistic 2.0 is probably the best way to go.  We'll
CF> need a more involved discussion but I don't think there'll be too
CF> many objections.  You mention GPL-compatibility; is that for v2
CF> and v3?

IANAL, but looking at:

http://www.perlfoundation.org/artistic_2_0_notes

http://www.gnu.org/licenses/license-list.html (scroll down to
"Artistic 2.0")

it looks like you can choose any GPL license (i.e. v1 to v3).

I was really more concerned with clarifying what the bioperl license
was *right now*, because "the same license as Perl" implies the
so-called "disjunctive" "GPL or Artistic license":

http://www.gnu.org/licenses/license-list.html#PerlLicense

which is what I've marked the Fedora package as (since it listed "the
same license as Perl" in most of the source files), which is fine for
Fedora.

Fedora may possibly (still under discussion I believe) require removal
of any package that is licensed under the original (1.0) Artistic
alone and it would be a real shame if that required bioperl being
pulled from the repo.  I imagine the intent of the bioperl
contributors is that it should be under the same terms as Perl,
whatever that happens to be (which just happens to be GPL or Artistic,
which is fine).  A clarification to that effect would be useful.

Cheers,
Alex

From zhaodj at ioz.ac.cn  Sat Aug 18 11:06:41 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Sat, 18 Aug 2007 23:06:41 +0800 (CST)
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
In-Reply-To: <46C5781F.60301@sheffield.ac.uk>
References: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
	<46C5781F.60301@sheffield.ac.uk>
Message-ID: <52869.159.226.67.49.1187449601.squirrel@mail.ioz.ac.cn>

Thank you,Nathan.
The Deobfuscator is very helpful.

On Fri, Aug 17, 2007 18:27, Nathan Haigh wrote:
> De-Jian,ZHAO wrote:
>> Dear list members,
>>
>> I have a question about the methods of bioperl objects. It is how and
>> where we can get the whole methods of a bioperl object.
>>
>> Take Bio::Tools::Run::RemoteBlast for example. In the synopsis of
>> this object, some sample codes are given. The following five clauses
>> are excerpted from the synopsis.
>> (1)my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>> (2)while ( my @rids = $factory->each_rid ) {
>> (3)$factory->remove_rid($rid);
>> (4)my $rc = $factory->retrieve_blast($rid);
>> (5)my $r = $factory->submit_blast($input);
>>
>> The five clauses use five methods of the RemoteBlast object, i.e.
>> (1)new, (2)each_rid, (3)remove_rid, (4)retrieve_blast, and
>> (5)submit_blast. However, I only find part of them (4, 5) are listed in
>> the appendix while others (1, 2, 3) are absent. Are there some more
>> methods not explicitly declared? I don't know. This will lead to the
>> partial understanding and utilization of the module. Therefore I come
>> here for the way to get the full methods of a bioperl object.
>>
>> Thanks!
>>
>
>
> You should check out the Deobfuscator at:
> http://bioperl.org/cgi-bin/deob_interface.cgi
>
> Search and choose the object of choice. e.g.
> Bio::Tools::Run::RemoteBlast
>
> You will be provided a list of methods available to that object,
> including all the methods up the inheritance hierarchy.
> Unfortunately,
> some bioperl modules are documented more thoroughly than others.
>
> Nath
>

From hlapp at gmx.net  Sat Aug 18 12:13:28 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 18 Aug 2007 12:13:28 -0400
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <8td4xlyt4h.fsf@allele2.localdomain>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
	<3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>
	<8td4xlyt4h.fsf@allele2.localdomain>
Message-ID: <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net>


On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote:

> I imagine the intent of the bioperl
> contributors is that it should be under the same terms as Perl,
> whatever that happens to be (which just happens to be GPL or Artistic,
> which is fine).

I fully agree.

>   A clarification to that effect would be useful.

Agreed, too. Would you mind changing that language on the wiki, since  
you seem to have a fairly good grasp on the issue?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Aug 18 12:42:04 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 18 Aug 2007 11:42:04 -0500
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
	<3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>
	<8td4xlyt4h.fsf@allele2.localdomain>
	<8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net>
Message-ID: <D3B67BC2-CB56-420F-B4E3-E0A57FEA7E80@uiuc.edu>


On Aug 18, 2007, at 11:13 AM, Hilmar Lapp wrote:

>
> On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote:
>
>> I imagine the intent of the bioperl
>> contributors is that it should be under the same terms as Perl,
>> whatever that happens to be (which just happens to be GPL or  
>> Artistic,
>> which is fine).
>
> I fully agree.
>
>>   A clarification to that effect would be useful.
>
> Agreed, too. Would you mind changing that language on the wiki, since
> you seem to have a fairly good grasp on the issue?
>
> 	-hilmar

Looks like the modules mostly state 'You may distribute this module  
under the same terms as perl itself', but there are likely a few  
which need to be changed.  Might be worth running a quick code audit  
to see what's present.
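
A rough sketch of what such an audit could look like, using only core Perl.
The sub name modules_missing_license is made up for illustration, and the
pattern only looks for the sentence quoted above, so it is an assumption
about what counts as a correct license statement rather than a real check.

```perl
use strict;
use warnings;
use File::Find;

# Return the .pm files under $dir that do NOT contain the sentence
# "under the same terms as perl itself" (case-insensitive, and
# tolerant of line wrapping between the words).
sub modules_missing_license {
    my ($dir) = @_;
    my @missing;
    find({ no_chdir => 1, wanted => sub {
        return unless -f $_ && /\.pm$/;
        open my $fh, '<', $_ or return;
        local $/;                      # slurp the whole file
        my $text = <$fh> // '';
        close $fh;
        push @missing, $_
            unless $text =~ /under\s+the\s+same\s+terms\s+as\s+perl\s+itself/i;
    }}, $dir);
    return sort @missing;
}

# Only scan when a directory is given on the command line.
if (@ARGV) {
    print "$_\n" for modules_missing_license($ARGV[0]);
}
```

Run it against a checkout, e.g. perl audit.pl bioperl-live, and look at
whatever is printed by hand; files that state the license inside comment
markers may show up as false positives.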

chris

From avilella at gmail.com  Sat Aug 18 16:38:10 2007
From: avilella at gmail.com (Albert Vilella)
Date: Sat, 18 Aug 2007 21:38:10 +0100
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
In-Reply-To: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
References: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
Message-ID: <358f4d650708181338s5a5caadbscfa85786327f4304@mail.gmail.com>

I particularly like to code and debug at the same time. When you are using
the perl debugger, you can do an:

<DB> m $object

and it will show up all the information and methods for that object.
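
Outside the debugger, something similar can be sketched by walking the
object's inheritance chain and looking at each package's symbol table. The
Demo::Base and Demo::Child packages and the list_methods sub are made up
for this example; with BioPerl installed you would pass a real object such
as the RemoteBlast factory instead. Methods resolved through AUTOLOAD will
not show up this way.

```perl
use strict;
use warnings;
use mro;

# Two toy classes standing in for a real class hierarchy.
package Demo::Base;
sub new      { return bless {}, shift }
sub describe { return 'base' }

package Demo::Child;
our @ISA = ('Demo::Base');
sub submit { return 1 }

package main;

# Map each method name to the first package in the linearised
# inheritance chain that defines it.
sub list_methods {
    my ($obj_or_class) = @_;
    my $class = ref($obj_or_class) || $obj_or_class;
    my %seen;
    for my $pkg (@{ mro::get_linear_isa($class) }) {
        no strict 'refs';
        for my $name (keys %{"${pkg}::"}) {
            next unless defined &{"${pkg}::$name"};
            $seen{$name} ||= $pkg;
        }
    }
    return \%seen;
}

my $methods = list_methods(Demo::Child->new);
print "$_ (from $methods->{$_})\n" for sort keys %$methods;
```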

Cheers,

    Albert.

On 8/17/07, De-Jian,ZHAO <zhaodj at ioz.ac.cn> wrote:
>
> Dear list members,
>
> I have a question about the methods of bioperl objects. It is how and
> where we can get the whole methods of a bioperl object.
>
> Take Bio::Tools::Run::RemoteBlast for example. In the synopsis of
> this object, some sample codes are given. The following five clauses
> are excerpted from the synopsis.
> (1)my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> (2)while ( my @rids = $factory->each_rid ) {
> (3)$factory->remove_rid($rid);
> (4)my $rc = $factory->retrieve_blast($rid);
> (5)my $r = $factory->submit_blast($input);
>
> The five clauses use five methods of the RemoteBlast object, i.e.
> (1)new, (2)each_rid, (3)remove_rid, (4)retrieve_blast, and
> (5)submit_blast. However, I only find part of them (4, 5) are listed in
> the appendix while others (1, 2, 3) are absent. Are there some more
> methods not explicitly declared? I don't know. This will lead to the
> partial understanding and utilization of the module. Therefore I come
> here for the way to get the full methods of a bioperl object.
>
> Thanks!
> --
> De-Jian Zhao
> Institute of Zoology,Chinese Academy of Sciences
> +86-10-64807217
> zhaodj at ioz.ac.cn

From neetisomaiya at gmail.com  Mon Aug 20 00:33:17 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 20 Aug 2007 10:03:17 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C5A405.2070005@sendu.me.uk>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
	<764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
	<764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
Message-ID: <764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>

Hi,

Thanks for the responses.
Another question I had: I am interested only in the pdb id and title, and
for this I am downloading and unzipping each of the full pdb structure
files, then parsing them to get just the id and title. Is there any other
data source which can give me just the id and title of pdb structures,
without me having to download the full file of each structure?

Thanks,
Neeti.

On 8/17/07, Sendu Bala <bix at sendu.me.uk> wrote:
>
> neeti somaiya wrote:
> > Hi,
> >
> > I have done it currently as follows :
> [snip]
> > Is this ok?
>
> If it works, of course. There seems to be some redundant code there,
> however. I'm guessing this would be better (assuming your code worked in
> the first place):
>
> while (my $struc = $in->next_structure()) {
>      my $pdb_id = $struc->id;
>      print "Structure ", $pdb_id,"\n";
>
>      my $ac = $struc->annotation();
>      my ($title) = $ac->get_Annotations('title');
>      $title = $title->as_text;
>      chomp($title);
>      if ($title =~ /Value\: (.*)/) {
>          $title = $1;
>      }
>      $title =~ s/\s+/ /g;
>
>      print "Title ",$title,"\n";
> }
>


-- 
-Neeti
Even my blood says, B positive

From jaudall at gmail.com  Mon Aug 20 00:39:18 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Sun, 19 Aug 2007 21:39:18 -0700
Subject: [Bioperl-l] concatenating aln splices
Message-ID: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>

Based on several criteria, I've extracted several splices from a
single alignment and I'm trying to concatenate my selected sequences
together.  Unfortunately, one of the sequences in the original
alignment only has gap characters for one or more of the splices.  I'd
like to keep the gap splices because other downstream aligned bases
depend on them.  I get these two warning messages splicing my
alignments together:

-------------------- WARNING ---------------------
MSG: Got a sequence with no letters in it cannot guess alphabet []
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Slice [232-233] of sequence [X2A/1-202] contains no residues.
Sequence excluded from the new alignment.
---------------------------------------------------

and now because of missing gaps, I get this error when trying to
concatenate them:

-------------------- WARNING ---------------------
MSG: expecting 236 not 203 from X2A
---------------------------------------------------

------------- EXCEPTION  -------------
MSG: All sequences in the alignment must be the same length
STACK Bio::AlignIO::phylip::write_aln
/sw/lib/perl5/5.8.6/Bio/AlignIO/phylip.pm:292

I don't mind the warnings, in fact I like them, but is there a way to
stop the splice function from removing the 'gap' sequence from the
alignment?  Perhaps catching the warning and inserting the gaps
afterwards might work, but I'm wondering if there's a simpler
modification of SimpleAlign.pm that might work.  Any thoughts?

Josh

From bix at sendu.me.uk  Mon Aug 20 03:43:45 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 20 Aug 2007 08:43:45 +0100
Subject: [Bioperl-l] concatenating aln splices
In-Reply-To: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>
References: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>
Message-ID: <46C94631.2060704@sendu.me.uk>

Joshua Udall wrote:
> Based on several criteria, I've extracted several splices from a
> single alignment and I'm trying to concatenate my selected sequences
> together.  Unfortunately, one of the sequences in the original
> alignment only has gap characters for one or more of the splices.  I'd
> like to keep the gap splices because other downstream aligned bases
> depend on them.
[snip]
> I don't mind the warnings, in fact I like them, but is there a way to
> stop the splice function from removing the 'gap' sequence from the
> alignment?  Perhaps catching the warning and inserting the gaps
> afterwards might work, but I'm wondering if there's a simpler
> modification of SimpleAlign.pm that might work.  Any thoughts?

Let us see some code, so we can get a better idea of what you're doing 
and what you've tried.

You can avoid losing sequences during a slice by not doing a slice. 
Instead, remove_columns(). This way you don't have to splice alignments 
together; you go from original alignment to 'spliced' version in one step.
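
To illustrate why working on columns keeps every sequence, here is a sketch
on plain gapped strings. It does not use Bio::SimpleAlign at all; the
keep_columns sub and the toy alignment are assumptions made up for this
example. Every row is cut at the same column ranges, so a row that is all
gaps in the kept region stays in the result at the same length as the rest.

```perl
use strict;
use warnings;

# Keep only the given 1-based, inclusive column ranges from every row.
# $rows is a hash ref of id => gapped sequence string (all equal length).
sub keep_columns {
    my ($rows, @ranges) = @_;
    my %out;
    for my $id (keys %$rows) {
        $out{$id} = join '', map {
            my ($start, $end) = @$_;
            substr($rows->{$id}, $start - 1, $end - $start + 1)
        } @ranges;
    }
    return \%out;
}

my %aln = (
    seqA => 'ACGT-ACGTAC',
    seqB => 'ACGTTACG---',
    X2A  => 'ACG------AC',   # only gaps in columns 5-8
);

my $kept = keep_columns(\%aln, [1, 3], [5, 8]);
print "$_\t$kept->{$_}\n" for sort keys %$kept;
```

X2A comes out as 'ACG----', seven columns wide like the other two rows, so
a writer that insists on equal-length sequences would have nothing to
complain about.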

From Oliver.Wafzig at sygnis.de  Mon Aug 20 04:42:55 2007
From: Oliver.Wafzig at sygnis.de (Oliver Wafzig)
Date: Mon, 20 Aug 2007 10:42:55 +0200
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
Message-ID: <200708201042.55292.Oliver.Wafzig@sygnis.de>

On Monday 20 August 2007 06:33, neeti somaiya wrote:
> Another question I had was, I am interested only in pdb id and title, and
> for this I am downloading and unzipping each of the full pdb structure
> files, parsing to get just id and title. Is there any other data source

Hi Neeti,
this is a non bioperl way to download the data.
Use the SRS server on the EBI page to download only id and title lines from 
pdb.

1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
2) Search for 'PDB' on the 'library page' and select it.
3) Use the standard query form. Select 'id' in the dropdown list and 
insert '*' (wildcard).
4) Create a view by selecting 'ID' and 'Title', then click the search button.
5) Click the save results button.
6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of entries 
to download' field. Press 'save'.

If the download is slow, read the 'download tips' on the download page and 
split the results in chunks. 

-- 
Oliver

From neetisomaiya at gmail.com  Mon Aug 20 09:05:01 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 20 Aug 2007 18:35:01 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <200708201042.55292.Oliver.Wafzig@sygnis.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
Message-ID: <764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>

Thanks for your response.
Actually I am looking for something standalone and not on the web, as in
something which I can download onto my machine and parse later to get id and
title.

On 8/20/07, Oliver Wafzig <Oliver.Wafzig at sygnis.de> wrote:
>
> On Monday 20 August 2007 06:33, neeti somaiya wrote:
> > Another question I had was, I am interested only in pdb id and title,
> and
> > for this I am downloading and unzipping each of the full pdb structure
> > files, parsing to get just id and title. Is there any other data source
>
> Hi Neeti,
> this is a non bioperl way to download the data.
> Use the SRS server on the EBI page to download only id and title lines
> from
> pdb.
>
> 1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
> 2) Search for 'PDB' on the 'library page' and select it.
> 3) Use the standard query form. Select 'id' in the dropdown list and
> insert '*' (wildcard).
> 4) Create a view by selecting 'ID' and 'Title', then click the search
> button.
> 5) Click the save results button.
> 6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of
> entries
> to download' field. Press 'save'.
>
> If the download is slow, read the 'download tips' on the download page and
> split the results in chunks.
>
> --
> Oliver


-- 
-Neeti
Even my blood says, B positive

From bernd at kirx.de  Mon Aug 20 12:57:28 2007
From: bernd at kirx.de (Bernd Mueller)
Date: Mon, 20 Aug 2007 18:57:28 +0200
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
Message-ID: <46C9C7F8.3020608@kirx.de>

Hello,

Maybe you want to try the EUtilities module (Bio::DB::EUtilities) from
bioperl. It is described on http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

I tried them for a similar search on pubmed but without any reasonable 
results because my target was too focused.

 From EUtilities documentation on 
http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.section.EntrezHelp.The_Databases

"Protein Database

The Protein database contains sequence data from the translated coding 
regions from DNA sequences in GenBank, EMBL, and DDBJ as well as protein 
sequences submitted to Protein Information Resource (PIR), SWISS-PROT, 
Protein Research Foundation (PRF), and Protein Data Bank (PDB) 
(sequences from solved structures). "

So PDB is included in eutilities from NCBI.

Regards,
Bernd

neeti somaiya wrote:
> Thanks for your response.
> Actually I am looking for something standalone and not on the web, as in
> something which I can download onto my machine and parse later to get id and
> title.
> 
> On 8/20/07, Oliver Wafzig <Oliver.Wafzig at sygnis.de> wrote:
>> On Monday 20 August 2007 06:33, neeti somaiya wrote:
>>> Another question I had was, I am interested only in pdb id and title,
>> and
>>> for this I am downloading and unzipping each of the full pdb structure
>>> files, parsing to get just id and title. Is there any other data source
>> Hi Neeti,
>> this is a non bioperl way to download the data.
>> Use the SRS server on the EBI page to download only id and title lines
>> from
>> pdb.
>>
>> 1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
>> 2) Search for 'PDB' on the 'library page' and select it.
>> 3) Use the standard query form. Select 'id' in the dropdown list and
>> insert '*' (wildcard).
>> 4) Create a view by selecting 'ID' and 'Title', then click the search
>> button.
>> 5) Click the save results button.
>> 6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of
>> entries
>> to download' field. Press 'save'.
>>
>> If the download is slow, read the 'download tips' on the download page and
>> split the results in chunks.
>>
>> --
>> Oliver
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> 
> 
> 

-- 
Dipl.-Inform.(FH)
Bernd Mueller
phone: +49 179 2336692
email: bernd at kirx.de


From neetisomaiya at gmail.com  Mon Aug 20 13:39:01 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 20 Aug 2007 23:09:01 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C9C7F8.3020608@kirx.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
Message-ID: <764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>

Hi,

Thanks for all the responses.
I got the solution from the RCSB people:

Dear Dr. Somaiya,

Thank you for your email message.

Please try the following:
1) Go to http://www.pdb.org/pdb/statistics/holdings.do and select the
number in the bottom right corner of the table (currently 45213).
2) From the menu on the left select 'Tabulate'>>'Custom Report' and
under 'Primary Citation' select 'Title'
3) At the bottom, select 'Create Report' and then one of the 'Download'
options.

Please let us know if we can be of additional assistance.

Sincerely,
Rachel Green
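
Once the report has been downloaded, getting id and title out of it can be
sketched in plain Perl. This is only a rough sketch under assumptions: it
assumes the CSV download has exactly two columns (id, then title), that titles
may be wrapped in double quotes, and that a PDB id is four characters starting
with a digit. The sub name parse_pdb_report is made up for illustration.

```perl
use strict;
use warnings;

# Parse a two-column CSV report (PDB id, title) into a hash reference.
# Assumption: each data row is "id,title"; the title may be double-quoted.
sub parse_pdb_report {
    my ($csv_text) = @_;
    my %title_for;
    for my $line (split /\r?\n/, $csv_text) {
        # Split on the first comma only, so commas inside the title survive.
        my ($id, $title) = split /,/, $line, 2;
        next unless defined $id;
        $title = '' unless defined $title;
        for ($id, $title) {
            s/^\s*"?//;    # drop leading whitespace and an opening quote
            s/"?\s*$//;    # drop a closing quote and trailing whitespace
        }
        $title =~ s/""/"/g;    # CSV-style escaped double quotes
        # Anything that does not look like a PDB id (header, junk) is skipped.
        next unless $id =~ /^[0-9][A-Za-z0-9]{3}$/;
        $title_for{uc $id} = $title;
    }
    return \%title_for;
}
```

Reading the saved file into a string and passing it to parse_pdb_report gives
a hash keyed by upper-cased id. If the real report has more columns or a
different quoting style, a proper CSV module would be the safer choice.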

On 8/20/07, Bernd Mueller <bernd at kirx.de> wrote:
>
> Hello,
>
> Maybe you wanna try the Database-EUtilities module from bioperl. They
> are described on http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>
> I tried them for a similar search on pubmed but without any reasonable
> results because my target was too focused.
>
> From EUtilities documentation on
>
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.section.EntrezHelp.The_Databases
>
> "Protein Database
>
> The Protein database contains sequence data from the translated coding
> regions from DNA sequences in GenBank, EMBL, and DDBJ as well as protein
> sequences submitted to Protein Information Resource (PIR), SWISS-PROT,
> Protein Research Foundation (PRF), and Protein Data Bank (PDB)
> (sequences from solved structures). "
>
> So PDB is included in eutilities from NCBI.
>
> Regards,
> Bernd
>
> neeti somaiya wrote:
> > Thanks for your response.
> > Actually I am looking for something standalone and not on the web, as in
> > something which I can download onto my machine and parse later to get id
> and
> > title.
> >
> > On 8/20/07, Oliver Wafzig <Oliver.Wafzig at sygnis.de> wrote:
> >> On Monday 20 August 2007 06:33, neeti somaiya wrote:
> >>> Another question I had was, I am interested only in pdb id and title,
> >> and
> >>> for this I am downloading and unzipping each of the full pdb structure
> >>> files, parsing to get just id and title. Is there any other data
> source
> >> Hi Neeti,
> >> this is a non bioperl way to download the data.
> >> Use the SRS server on the EBI page to download only id and title lines
> >> from
> >> pdb.
> >>
> >> 1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
> >> 2) Search for 'PDB' on the 'library page' and select it.
> >> 3) Use the standard query form. Select 'id' in the dropdown list and
> >> insert '*' (wildcard).
> >> 4) Create a view by selecting 'ID' and 'Title', then click the search
> >> button.
> >> 5) Click the save results button.
> >> 6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of
> >> entries
> >> to download' field. Press 'save'.
> >>
> >> If the download is slow, read the 'download tips' on the download page
> and
> >> split the results in chunks.
> >>
> >> --
> >> Oliver
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >
> >
> >
>
> --
> Dipl.-Inform.(FH)
> Bernd Mueller
> phone: +49 179 2336692
> email: bernd at kirx.de
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
-Neeti
Even my blood says, B positive

From jaudall at gmail.com  Mon Aug 20 14:30:26 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Mon, 20 Aug 2007 12:30:26 -0600
Subject: [Bioperl-l] concatenating aln splices
In-Reply-To: <46C94631.2060704@sendu.me.uk>
References: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>
	<46C94631.2060704@sendu.me.uk>
Message-ID: <52cea20c0708201130u29af2e10w78a852d7f88c23d1@mail.gmail.com>

Thanks, Sendu!  That suggestion was exactly what I needed.  I have it worked
out now with the remove_columns function.  Much easier that way :)
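
For the archive, the idea behind it (pick columns from every row at once, so
a row that is all gaps in some region is kept rather than dropped) can be
sketched in plain Perl. This is not the SimpleAlign API, just an illustration
on aligned strings; the sub name concat_columns and the 0-based inclusive
ranges are assumptions of the sketch.

```perl
use strict;
use warnings;

# $aln  : hash ref, sequence name => aligned string (all the same length)
# $keep : array ref of [start, end] column ranges, 0-based and inclusive
# Returns a hash ref with the kept columns concatenated for every row,
# including rows that contain only gap characters in those columns.
sub concat_columns {
    my ($aln, $keep) = @_;
    my %out;
    for my $name (keys %$aln) {
        $out{$name} = join '',
            map { substr($aln->{$name}, $_->[0], $_->[1] - $_->[0] + 1) } @$keep;
    }
    return \%out;
}
```

Because every row goes through the same substr calls, the output rows stay
the same length and no sequence is lost along the way.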

Josh

On 8/20/07, Sendu Bala <bix at sendu.me.uk> wrote:
>
> Joshua Udall wrote:
> > Based on several criteria, I've extracted several splices from a
> > single alignment and I'm trying to concatenate my selected sequences
> > together.  Unfortunately, one of the sequences in the original
> > alignment only has gap characters for one or more of the splices.  I'd
> > like to keep the gap splices because other downstream aligned bases
> > depend on them.
> [snip]
> > I don't mind the warnings, in fact I like them, but is there a way to
> > stop the splice function from removing the 'gap' sequence from the
> > alignment?  Perhaps catching the warning and inserting the gaps
> > afterwards might work, but I'm wondering if there's is a simpler
> > modification of SimpleAlign.pm that might work.  Any thoughts?
>
> Let us see some code, so we can get a better idea of what you're doing
> and what you've tried.
>
> You can avoid losing sequences during a slice by not doing a slice.
> Instead, remove_columns(). This way you don't have to splice alignments
> together; you go from original alignment to 'spliced' version in one step.
>

From cjfields at uiuc.edu  Mon Aug 20 14:51:14 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 20 Aug 2007 13:51:14 -0500
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C9C7F8.3020608@kirx.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
Message-ID: <4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>

Just curious, but what kind of query were you trying?  It might be  
worth trying to work through it to add as an example to the cookbook  
page.

chris

On Aug 20, 2007, at 11:57 AM, Bernd Mueller wrote:

> Hello,
>
> Maybe you wanna try the Database-EUtilities module from bioperl. They
> are described on http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>
> I tried them for a similar search on pubmed but without any reasonable
> results because my target was too focused.
>
>  From EUtilities documentation on
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.section.EntrezHelp.The_Databases
>
> "Protein Database
>
> The Protein database contains sequence data from the translated coding
> regions from DNA sequences in GenBank, EMBL, and DDBJ as well as  
> protein
> sequences submitted to Protein Information Resource (PIR), SWISS-PROT,
> Protein Research Foundation (PRF), and Protein Data Bank (PDB)
> (sequences from solved structures). "
>
> So PDB is included in eutilities from NCBI.
>
> Regards,
> Bernd
>
> neeti somaiya wrote:
>> Thanks for your response.
>> Actually I am looking for something standalone and not on the web,  
>> as in
>> something which I can download onto my machine and parse later to  
>> get id and
>> title.
>>
>> On 8/20/07, Oliver Wafzig <Oliver.Wafzig at sygnis.de> wrote:
>>> On Monday 20 August 2007 06:33, neeti somaiya wrote:
>>>> Another question I had was, I am interested only in pdb id and  
>>>> title,
>>> and
>>>> for this I am downloading and unzipping each of the full pdb  
>>>> structure
>>>> files, parsing to get just id and title. Is there any other data  
>>>> source
>>> Hi Neeti,
>>> this is a non bioperl way to download the data.
>>> Use the SRS server on the EBI page to download only id and title  
>>> lines
>>> from
>>> pdb.
>>>
>>> 1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
>>> 2) Search for 'PDB' on the 'library page' and select it.
>>> 3) Use the standard query form. Select 'id' in the dropdown list and
>>> insert '*' (wildcard).
>>> 4) Create a view by selecting 'ID' and 'Title', then click the  
>>> search
>>> button.
>>> 5) Click the save results button.
>>> 6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of
>>> entries
>>> to download' field. Press 'save'.
>>>
>>> If the download is slow, read the 'download tips' on the download  
>>> page and
>>> split the results in chunks.
>>>
>>> --
>>> Oliver
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>>
>>
>
> -- 
> Dipl.-Inform.(FH)
> Bernd Mueller
> phone: +49 179 2336692
> email: bernd at kirx.de
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bernd at kirx.de  Mon Aug 20 15:03:29 2007
From: bernd at kirx.de (Bernd Mueller)
Date: Mon, 20 Aug 2007 21:03:29 +0200
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>
Message-ID: <46C9E581.1010907@kirx.de>

I attached my script.

Actually I tried to download all articles to a certain search term with
that script. The problem was that the retrieved documents were not free
as mentioned in the documentation of EUtilities on the NCBI page. So
many of the downloaded documents in xml-format were just dummies
containing only the abstract but not the fulltext article.

Bernd

Chris Fields wrote:
> Just curious, but what kind of query were you trying?  It might be worth 
> trying to work through it to add as an example to the cookbook page.
> 
> chris
> 
> On Aug 20, 2007, at 11:57 AM, Bernd Mueller wrote:
> 
>> Hello,
>>
>> Maybe you wanna try the Database-EUtilities module from bioperl. They
>> are described on http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>>
>> I tried them for a similar search on pubmed but without any reasonable
>> results because my target was too focused.
>>
>>  From EUtilities documentation on
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.section.EntrezHelp.The_Databases 
>>
>>
>> "Protein Database
>>
>> The Protein database contains sequence data from the translated coding
>> regions from DNA sequences in GenBank, EMBL, and DDBJ as well as protein
>> sequences submitted to Protein Information Resource (PIR), SWISS-PROT,
>> Protein Research Foundation (PRF), and Protein Data Bank (PDB)
>> (sequences from solved structures). "
>>
>> So PDB is included in eutilities from NCBI.
>>
>> Regards,
>> Bernd
>>
>> neeti somaiya wrote:
>>> Thanks for your response.
>>> Actually I am looking for something standalone and not on the web, as in
>>> something which I can download onto my machine and parse later to get 
>>> id and
>>> title.
>>>
>>> On 8/20/07, Oliver Wafzig <Oliver.Wafzig at sygnis.de> wrote:
>>>> On Monday 20 August 2007 06:33, neeti somaiya wrote:
>>>>> Another question I had was, I am interested only in pdb id and title,
>>>> and
>>>>> for this I am downloading and unzipping each of the full pdb structure
>>>>> files, parsing to get just id and title. Is there any other data 
>>>>> source
>>>> Hi Neeti,
>>>> this is a non bioperl way to download the data.
>>>> Use the SRS server on the EBI page to download only id and title lines
>>>> from
>>>> pdb.
>>>>
>>>> 1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
>>>> 2) Search for 'PDB' on the 'library page' and select it.
>>>> 3) Use the standard query form. Select 'id' in the dropdown list and
>>>> insert '*' (wildcard).
>>>> 4) Create a view by selecting 'ID' and 'Title', then click the search
>>>> button.
>>>> 5) Click the save results button.
>>>> 6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of
>>>> entries
>>>> to download' field. Press 'save'.
>>>>
>>>> If the download is slow, read the 'download tips' on the download 
>>>> page and
>>>> split the results in chunks.
>>>>
>>>> -- 
>>>> Oliver
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>
>>>
>>>
>>
>> --Dipl.-Inform.(FH)
>> Bernd Mueller
>> phone: +49 179 2336692
>> email: bernd at kirx.de
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> 
> 
> 

-- 
Dipl.-Inform.(FH)
Bernd Mueller
phone: +49 179 2336692
email: bernd at kirx.de


-------------- next part --------------
A non-text attachment was scrubbed...
Name: myBioPerl.pl
Type: application/x-perl
Size: 1983 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070820/af579f0a/attachment.bin 

From jayoung at fhcrc.org  Mon Aug 20 18:09:04 2007
From: jayoung at fhcrc.org (Janet Young)
Date: Mon, 20 Aug 2007 15:09:04 -0700
Subject: [Bioperl-l] Assembly::IO write_assembly and remove_seq
Message-ID: <EE800ED8-52E7-4D80-A18F-EDBABB90056C@fhcrc.org>

Hi all,

I realized last week that write_assembly isn't implemented in
Bio::Assembly::IO
(see http://bioperl.org/pipermail/bioperl-l/2006-May/021619.html )
I know this has been asked before, but I wondered if anything has  
changed - does anyone have any plans to write a write_assembly  
method? Alternatively, any suggestions for an alternative solution to  
what I'm trying to do?

I'm trying to write a script to make improvements to the assembly  
that phredPhrap comes out with - it seems to quite frequently throw  
an unrelated sequence into a contig with either no matching sequence  
at all, or very little matching sequence. Mysterious. Anyway, my  
script can recognize the bad sequences easily enough, and thought I'd  
be able to remove them and then write the modified assembly. No joy.  
One very inelegant solution I've played with is that I can add some  
"markedHighQuality" tags to the discrepant sequences in the ace file,  
meaning that next time phredPhrap is run, it sometimes manages not to  
assemble the sequences that shouldn't be there. I'm not sure this  
will work in all cases, and it seems like quite an unsatisfactory way  
to do it.

For the same reason, I'm hoping someone can tell me what remove_seq  
does to a contig object? I'm using it and I don't get any error  
messages (returns 1), but when I check the contig object afterwards  
with get_seq_ids, the sequence I wanted to remove didn't seem to go  
away. Also, when I check out the primary_tags for that contig in the  
objects returned by get_features_collection, nothing seems to have  
changed. So I'm not sure whether the sequence really was removed from  
anything at all, and if it was, which object did it get removed  
from?  (a snippet of my code is below)
           my @seqids  = $contig->get_seq_ids();
           print OUT "seqids @seqids\n";
           my $seqobj = $contig->get_seq_by_name($seq);
           $contig->remove_seq($seqobj) || die "failed to remove seq\n";
           @seqids  = $contig->get_seq_ids();
           print OUT "seqids @seqids\n";
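
To make the before/after comparison less error-prone than reading two printed
lists, the check could be written as a small helper. This is a sketch in plain
Perl; removed_ids is a made-up name, and it only compares the two id lists, it
says nothing about what remove_seq does internally.

```perl
use strict;
use warnings;

# Return the ids present in @$before but missing from @$after,
# in their original order.
sub removed_ids {
    my ($before, $after) = @_;
    my %still_there = map { $_ => 1 } @$after;
    return grep { !$still_there{$_} } @$before;
}
```

With the snippet above, this would be called with the two results of
get_seq_ids, one saved before remove_seq and one after; an empty list back
means the id list did not change.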

thanks for any advice,

Janet Young


-------------------------------------------------------------------

Dr. Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung at fhcrc.org

http://www.fhcrc.org/labs/trask/

-------------------------------------------------------------------


From cjfields at uiuc.edu  Tue Aug 21 00:06:26 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 20 Aug 2007 23:06:26 -0500
Subject: [Bioperl-l] EUtilities, was Re:  PDB Parser
In-Reply-To: <46C9E581.1010907@kirx.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>
	<46C9E581.1010907@kirx.de>
Message-ID: <7BE17595-9BC0-498B-AFA9-03ED0C853BFC@uiuc.edu>

Bernd,

Just in case you weren't aware, I have changed several aspects of  
EUtilities since the 1.5.2 release, so any code in the HOWTO cookbook  
applies ONLY to the version found in CVS (there is a big note at the  
top stating such).  This should be the finalized API which I intend  
on supporting from this point on.  The reason I indicate that is  
there are several giveaways which indicate you are using the older  
API from 1.5.2 (using next_cookie, for instance).

The following modification of your script (using the API in
bioperl-live) works for me.  You should be able to do something similar with
the older API as well but I haven't tried.  Note that PMC full-text
retrieval only works if the article is declared 'open-access'; not
all journals allow that.  Also, any full-text is only available as
XML which (I'm guessing here) is transformed to HTML for PMC.

....
my $agent = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                     -db         => $db,
                                     -term       => $query,
                                     -usehistory => 'y');

my $ct = $agent->get_count;

print "Count = $ct\n";

my $history = $agent->next_History;

if ($fetch eq 'yes') {
    my ($retmax, $retstart) = (1, 0);
    while ($retstart < $ct) {
        $agent->set_parameters(
            -eutil    => 'efetch',
            -history  => $history,
            -rettype  => 'xml',
            -retmax   => $retmax,
            -retstart => $retstart,
        );
        $agent->get_Response(-file => ">./papers/paper_$retstart.xml");
        $retstart += $retmax;
    }
}
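
The loop above steps through the result set one record at a time. The
retstart/retmax arithmetic on its own can be sketched as below, in case a
larger batch size is wanted; batch_windows is a hypothetical helper in plain
Perl, it does not call EUtilities, and whether several records per file are
acceptable depends on what is done with the XML afterwards.

```perl
use strict;
use warnings;

# Given the total hit count and a batch size, return a list of
# [retstart, size] pairs that together cover records 0 .. count-1.
sub batch_windows {
    my ($count, $retmax) = @_;
    die "retmax must be positive\n" unless $retmax > 0;
    my @windows;
    for (my $retstart = 0; $retstart < $count; $retstart += $retmax) {
        my $left = $count - $retstart;
        # The last window may be shorter than $retmax.
        push @windows, [$retstart, $left < $retmax ? $left : $retmax];
    }
    return @windows;
}
```

Each pair would then supply the -retstart and -retmax values for one
set_parameters call.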

------------------------------

It may also be possible to grab the LinkOut for these and try to nab  
the PDF or use the DOI, but I haven't tried anything like that.

chris

On Aug 20, 2007, at 2:03 PM, Bernd Mueller wrote:

> I attached my script.
>
> Actually I tried to download all articles to a certain search term  
> with
> that script. The problem was that the retrieved documents were not  
> free
> as mentioned in the documentation of EUtilities on the NCBI page. So
> many of the downloaded documents in xml-format were just dummies
> containing only the abstract but not the fulltext article.
>
> Bernd
>
> Chris Fields wrote:
>> Just curious, but what kind of query were you trying?  It might be  
>> worth trying to work through it to add as an example to the  
>> cookbook page.
>> chris


From n.haigh at sheffield.ac.uk  Tue Aug 21 04:19:59 2007
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 21 Aug 2007 09:19:59 +0100
Subject: [Bioperl-l] subversion progress
Message-ID: <46CAA02F.60803@sheffield.ac.uk>

Hi,

I was just wondering if there was any further progress towards the svn
migration recently? What is still needing to be done?

Cheers
Nath

From neetisomaiya at gmail.com  Tue Aug 21 05:41:22 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Tue, 21 Aug 2007 15:11:22 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
Message-ID: <764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>

Hi,

I wanted to automate my PDB script, right from the downloading of data. As per
the solution given by RCSB about a custom report for PDB ids and titles only,
I was trying something like the code below, but it doesn't seem to work:

use LWP::Simple;
my $url = 'http://www.pdb.org/pdb/results/tabularReport.do?reportTitle=CustomReport&customReportColumns=VStructureSummary.structureId~VCitation.title&format=csv';
my $content = get $url;
die "Couldn't get $url" unless defined $content;

Can anyone tell me how I can do this, whether there is another way to do it,
where I am going wrong, or whether it isn't possible in this case at all?

Please help.

On 8/20/07, neeti somaiya <neetisomaiya at gmail.com> wrote:
>
> Hi,
>
> Thanks for all the responses.
> I got the solution from RCBS people :-
>
> Dear Dr. Somaiya,
>
> Thank you for your email message.
>
> Please try the following:
> 1) Go to http://www.pdb.org/pdb/statistics/holdings.do and select the
> number in the bottom right corner of the table (currently 45213).
> 2) From the menu on the left select 'Tabulate'>>'Custom Report' and
> under 'Primary Citation' select 'Title'
> 3) At the bottom, select 'Create Report' and then one of the 'Download'
> options.
>
> Please let us know if we can be of additional assistance.
>
> Sincerely,
> Rachel Green
>
> On 8/20/07, Bernd Mueller <bernd at kirx.de> wrote:
> >
> > Hello,
> >
> > Maybe you wanna try the Database-EUtilities module from bioperl. They
> > are described on http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
> >
> > I tried them for a similar search on pubmed but without any reasonable
> > results because my target was too focused.
> >
> > From EUtilities documentation on
> >
> > http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.section.EntrezHelp.The_Databases
> >
> > "Protein Database
> >
> > The Protein database contains sequence data from the translated coding
> > regions from DNA sequences in GenBank, EMBL, and DDBJ as well as protein
> >
> > sequences submitted to Protein Information Resource (PIR), SWISS-PROT,
> > Protein Research Foundation (PRF), and Protein Data Bank (PDB)
> > (sequences from solved structures). "
> >
> > So PDB is included in eutilities from NCBI.
> >
> > Regards,
> > Bernd
> >
> > neeti somaiya wrote:
> > > Thanks for your response.
> > > Actually I am looking for something standalone and not on the web, as
> > in
> > > something which I can download onto my machine and parse later to get
> > id and
> > > title.
> > >
> > > On 8/20/07, Oliver Wafzig <Oliver.Wafzig at sygnis.de> wrote:
> > >> On Monday 20 August 2007 06:33, neeti somaiya wrote:
> > >>> Another question I had was, I am interested only in pdb id and
> > title,
> > >> and
> > >>> for this I am downloading and unzipping each of the full pdb
> > structure
> > >>> files, parsing to get just id and title. Is there any other data
> > source
> > >> Hi Neeti,
> > >> this is a non bioperl way to download the data.
> > >> Use the SRS server on the EBI page to download only id and title
> > lines
> > >> from
> > >> pdb.
> > >>
> > >> 1) Point your browser to the SRS page (http://srs.ebi.ac.uk ).
> > >> 2) Search for 'PDB' on the 'library page' and select it.
> > >> 3) Use the standard query form. Select 'id' in the dropdown list and
> > >> insert '*' (wildcard).
> > >> 4) Create a view by selecting 'ID' and 'Title', then click the search
> > >> button.
> > >> 5) Click the save results button.
> > >> 6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of
> > >> entries
> > >> to download' field. Press 'save'.
> > >>
> > >> If the download is slow, read the 'download tips' on the download
> > page and
> > >> split the results in chunks.
> > >>
> > >> --
> > >> Oliver
> > >> _______________________________________________
> > >> Bioperl-l mailing list
> > >> Bioperl-l at lists.open-bio.org
> > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> > >>
> > >
> > >
> > >
> >
> > --
> > Dipl.-Inform.(FH)
> > Bernd Mueller
> > phone: +49 179 2336692
> > email: bernd at kirx.de
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
>
>
>
> --
> -Neeti
> Even my blood says, B positive
>


-- 
-Neeti
Even my blood says, B positive

From cjfields at uiuc.edu  Tue Aug 21 10:40:03 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 21 Aug 2007 09:40:03 -0500
Subject: [Bioperl-l] subversion progress
In-Reply-To: <46CAA02F.60803@sheffield.ac.uk>
References: <46CAA02F.60803@sheffield.ac.uk>
Message-ID: <5C65BAED-61CF-4028-977E-0CD451FA2EC3@uiuc.edu>

Not sure myself, to tell the truth.  Pretty much everything was ready  
to go (i.e. svn commits work, commits post to bioperl-guts, etc.);  
the only possible exception was svn->cvs syncing.  I believe the  
decision for svn access is to stick with ssh only for now for  
simplicity's sake.  I may have to go back into the archives to  
refresh my memory on all the details...

I think a time for the switchover just has to be set so that  
everybody is adequately forewarned, and the docs for getting started  
on SVN need to be updated accordingly.

chris

On Aug 21, 2007, at 3:19 AM, Nathan Haigh wrote:

> Hi,
>
> I was just wondering if there was any further progress towards the svn
> migration recently? What is still needing to be done?
>
> Cheers
> Nath
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jwalker at watson.wustl.edu  Tue Aug 21 11:20:46 2007
From: jwalker at watson.wustl.edu (Jason Walker)
Date: Tue, 21 Aug 2007 10:20:46 -0500
Subject: [Bioperl-l] RemoteBlast not handling NCBI Error message
Message-ID: <46CB02CE.1080803@watson.wustl.edu>

I've noticed RemoteBlast does not handle a specific error message from 
NCBI correctly.  retrieve_blast() should return 0 if waiting, -1 on 
error, or the results when completed.  It looks like the method relies 
on a specific tag in the NCBI return,  'QBlastInfoBegin'.  The error 
message I'm getting does not have this tag or a value of 
'Status=ERROR'.  After contacting NCBI 'Blast-help', they stated that 
QBlastInfoBegin should not be expected from all GET requests.  The error 
can be reproduced by using RID CM2YJJW501R, until it expires tomorrow.

my $rid = 'CM2YJJW501R';
my $factory = Bio::Tools::Run::RemoteBlast->new( -verbose => 1,);
my $rc = $factory->retrieve_blast($rid);
print $rc ."\n";

The content returned from NCBI looks like:
<hr><font color="red">ERROR: An error has occurred on the server, Too 
many HSPs to save all
 Contact Blast-help at ncbi.nlm.nih.gov and include your RID: 
CM2YJJW501R</font><hr>

I added a conditional statement as seen below to correct my local copy.  
I'm not sure this is the best fix, but it works.
sub retrieve_blast {
    ...
    if( /QBlastInfoBegin/i ) {
        $s = 1;
    } elsif( $s ) {
        if( /Status=(WAITING|ERROR|READY)/i ) {
            ...
         }
    } elsif( /^(?:#\s)?[\w-]*?BLAST\w+/ ) {
        $waiting = 0;
        last;
    } elsif ( /ERROR/i ) {
        close($TMP);
        open(my $ERR, "<$tempfile") or $self->throw("cannot open file $tempfile");
        $self->warn(join("", <$ERR>));
        close $ERR;
        return -1;
    }
    ...
}
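
The same decision chain, pulled out of retrieve_blast so it can be tried on
saved responses, might look like the sketch below. It is plain Perl and not
the RemoteBlast code itself; classify_qblast_response is a made-up name, and
the bare /ERROR/i test is as blunt here as in the patch above, so any line
containing "error" outside a QBlastInfo block counts as an error.

```perl
use strict;
use warnings;

# Classify the text of an NCBI reply as 'WAITING', 'ERROR', 'READY'
# or 'UNKNOWN', using the same tests, in the same order, as the patch.
sub classify_qblast_response {
    my ($content) = @_;
    my $in_info = 0;
    for my $line (split /\n/, $content) {
        if ($line =~ /QBlastInfoBegin/i) {
            $in_info = 1;
        }
        elsif ($in_info) {
            # Status line inside the QBlastInfo block.
            return uc $1 if $line =~ /Status=(WAITING|ERROR|READY)/i;
        }
        elsif ($line =~ /^(?:#\s)?[\w-]*?BLAST\w+/) {
            return 'READY';    # looks like the start of a report
        }
        elsif ($line =~ /ERROR/i) {
            return 'ERROR';    # error text without any QBlastInfo block
        }
    }
    return 'UNKNOWN';
}
```

Fed the HTML fragment quoted above, this returns 'ERROR' even though no
QBlastInfoBegin tag is present.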

Thanks,
Jason Walker


From cjfields at uiuc.edu  Tue Aug 21 12:15:36 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 21 Aug 2007 11:15:36 -0500
Subject: [Bioperl-l] RemoteBlast not handling NCBI Error message
In-Reply-To: <46CB02CE.1080803@watson.wustl.edu>
References: <46CB02CE.1080803@watson.wustl.edu>
Message-ID: <348D8645-5DC2-4606-9650-EB08D8053F3D@uiuc.edu>


On Aug 21, 2007, at 10:20 AM, Jason Walker wrote:

> I've noticed RemoteBlast does not handle a specific error message from
> NCBI correctly.  retrieve_blast() should return 0 if waiting, -1 on
> error, or the results when completed.  It looks like the method relies
> on a specific tag in the NCBI return,  'QBlastInfoBegin'.  The error
> message I'm getting does not have this tag or a value of
> 'Status=ERROR'.  After contacting NCBI 'Blast-help', they stated that
> QBlastInfoBegin should not be expected from all GET requests.  The  
> error
> can be reproduced by using RID CM2YJJW501R, until it expires tomorrow.
> ...
> I added a conditional statement as seen below to correct my local  
> copy.
> I'm not sure this is the best fix, but it works.
> sub retrieve_blast {
>     ...
>     if( /QBlastInfoBegin/i ) {
>         $s = 1;
>     } elsif( $s ) {
>         if( /Status=(WAITING|ERROR|READY)/i ) {
>             ...
>          }
>     } elsif( /^(?:#\s)?[\w-]*?BLAST\w+/ ) {
>         $waiting = 0;
>         last;
>     } elsif ( /ERROR/i ) {
>         close($TMP);
>         open(my $ERR, "<$tempfile") or $self->throw("cannot open file $tempfile");
>         $self->warn(join("", <$ERR>));
>         close $ERR;
>         return -1;
>     }
>     ...
> }
>
> Thanks,
> Jason Walker

I have added this to RemoteBlast in bioperl cvs.  Thanks for the notice!

chris

From bernd.web at gmail.com  Tue Aug 21 12:32:09 2007
From: bernd.web at gmail.com (Bernd Web)
Date: Tue, 21 Aug 2007 18:32:09 +0200
Subject: [Bioperl-l] SearchIO-BLAST
Message-ID: <716af09c0708210932m34bfb2a7o2094124a8832d705@mail.gmail.com>

Dear all,

Recently, I stumbled on something when parsing BLAST reports.  A ">aaa"
line got prepended to a plain-text BLAST report from NCBI. This
(fasta-like header) changes the $result->hits array.
The number of hits is now 2*num_hits + 1. Clearly, this is related to
faulty input, but still the effect of this one line is large. Does someone
see what is causing this, and should the BLAST parser maybe be
slightly more relaxed wrt pre/appended text? I have not yet seen why
this extra fasta header line has such a "large" effect.

A short example BLASTN output is attached.
Example code is:

use Bio::SearchIO;
my $in = Bio::SearchIO->new(-format => 'blast',
                            -file   => 'apoe_plain.bls');
while( my $result = $in->next_result ) {
  print "Num of hits: ", $result->num_hits, "\n";
  my @hits = $result->hits;
  foreach my $el (@hits) {
    print $el->name, "\n";
  }
}


Kind regards,
Bernd
-------------- next part --------------
A non-text attachment was scrubbed...
Name: apoe_plain.bls
Type: application/octet-stream
Size: 7890 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070821/a367809e/attachment.obj 

From cjfields at uiuc.edu  Tue Aug 21 17:53:44 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 21 Aug 2007 16:53:44 -0500
Subject: [Bioperl-l] SearchIO-BLAST
In-Reply-To: <716af09c0708210932m34bfb2a7o2094124a8832d705@mail.gmail.com>
References: <716af09c0708210932m34bfb2a7o2094124a8832d705@mail.gmail.com>
Message-ID: <59FF775C-8CAC-4947-A5BA-835ADD45CD32@uiuc.edu>

I can confirm this (I'm using bioperl-live).  The output I get is:

Num of hits: 9
ref|NM_000039.1|
ref|NT_113960.1|Hs22_111679
ref|NT_033899.7|Hs11_34054
ref|NW_925173.1|HsCraAADB02_444
ref|NM_000039.1|
ref|NT_113960.1|Hs22_111679
ref|NT_033899.7|Hs11_34054
ref|NW_925173.1|HsCraAADB02_444
ref|NW_925173.1|HsCraAADB02_444

The extra '>' is definitely throwing the event calls for a loop; the  
2x increase is b/c an extra iteration is started when '>' is  
encountered (changing the event handler reduces the number to 5).   
The extra hit is from the '>' at the beginning.

I hate to say it, but this is an instance where we can't be more  
flexible, primarily b/c '>' is a legit token the parser looks for (it  
is the beginning of the hit block in reports).  Finding it as the  
initial token in the report is also legitimate for some older BLAST  
output, so we also can't simply bypass it.  You'll unfortunately have  
to preparse the reports to get rid of those lines prior to feeding  
them to the BLAST text report parser.
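
A rough sketch of such a preparse step, in case it helps.  The helper
name strip_leading_junk is made up for illustration (it is not part of
BioPerl), and it assumes the stray text sits before a banner line such
as "BLASTN 2.2.16"; reports with a different banner would need a
different pattern.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: drop every line that precedes
# the first line that looks like a BLAST program banner.
# Assumption: the banner starts with BLASTN/BLASTP/BLASTX or TBLASTN/TBLASTX.
sub strip_leading_junk {
    my ($text) = @_;
    my @lines = split /^/m, $text;    # split into lines, keeping line endings
    for my $i (0 .. $#lines) {
        return join '', @lines[$i .. $#lines]
            if $lines[$i] =~ /^\s*T?BLAST[NPX]\b/;
    }
    return $text;    # no banner found: leave the input untouched
}

# Tiny made-up report fragment with a stray fasta-like header on top.
my $report = ">aaa\nBLASTN 2.2.16 [Mar-25-2007]\n\nQuery= test\n>ref|NM_000039.1|\n";
my $clean  = strip_leading_junk($report);
print $clean;
```

The cleaned text could then be opened as an in-memory filehandle
(open my $fh, '<', \$clean) and passed to Bio::SearchIO via -fh instead
of -file.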

chris

On Aug 21, 2007, at 11:32 AM, Bernd Web wrote:

> Dear all,
>
> Recently, I stumbled on something with parsing BLAST reports.  To a
> plain text blast report from NCBI a ">aaa" got prepended. This
> (fasta-like header) changes the $result->hits array.
> The amount of hits is now 2*num_hits + 1. Clearly, this is related to
> faulty input, but still the effect of this line is great. Does someone
> see what is causing this, and should the BLAST parser maybe be
> slightly more relaxed wrt pre/appended text? I have not seen yet why
> this extra fastaheader line has such a "large" effect.
>
> A short example BLASTN output is attached.
> Example code is:
>
> use Bio::SearchIO;
> my $in = new Bio::SearchIO(-format => 'blast',
>                            -file   => 'apoe_plain.bls');
> while( my $result = $in->next_result ) {
>   print "Num of hits: ", $result->num_hits, "\n";
>   my @hits = $result->hits;
>   foreach my $el (@hits) {
>   	print $el->name, "\n";
>   }
>
>
> Kind regards,
> Bernd
> <apoe_plain.bls>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Tue Aug 21 23:03:55 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 21 Aug 2007 23:03:55 -0400
Subject: [Bioperl-l] subversion progress
In-Reply-To: <5C65BAED-61CF-4028-977E-0CD451FA2EC3@uiuc.edu>
References: <46CAA02F.60803@sheffield.ac.uk>
	<5C65BAED-61CF-4028-977E-0CD451FA2EC3@uiuc.edu>
Message-ID: <51A5996D-A976-47FD-8807-20F6EBAF9E42@gmx.net>


On Aug 21, 2007, at 10:40 AM, Chris Fields wrote:

> I think a time for the switchover just has to be set so that
> everybody is adequately forewarned, and the docs for getting started
> on SVN need to be updated accordingly.

That was my recollection too. -hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Wed Aug 22 03:51:42 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 22 Aug 2007 08:51:42 +0100
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>	<46C9C7F8.3020608@kirx.de>	<764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
	<764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>
Message-ID: <46CBEB0E.8030200@sendu.me.uk>

neeti somaiya wrote:
> Hi,
> 
> I wanted to automate my pdb script, right from downloading of data. As per
> the solution given by RCSB about custom report for pdb ids and titles only,
> I was trying something like the code below, but it doesn't seem to work:
>
> use LWP::Simple;
> my $url = 'http://www.pdb.org/pdb/results/tabularReport.do?reportTitle=CustomReport&customReportColumns=VStructureSummary.structureId~VCitation.title&format=csv';
> my $content = get $url;
> die "Couldn't get $url" unless defined $content;
>
> Can anyone tell me how I can do it, whether there is any other way to do
> it, whether I am going wrong somewhere, or whether it isn't possible for
> this case at all?

Use LWP::UserAgent so you can see what's going on.

use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->timeout(10);
my $response = $ua->get($url);   # $url as defined in your script
if ($response->is_success) {
   print $response->content;
}
else {
   die $response->status_line;
}


Gives:
500 Internal Server Error

Most likely the server is expecting some kind of cookie and falls over 
when you try to visit that url without it. So start where they told you 
to and grab pages successively, keeping any cookies.

From neetisomaiya at gmail.com  Wed Aug 22 06:06:38 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Wed, 22 Aug 2007 15:36:38 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46CBEB0E.8030200@sendu.me.uk>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
	<764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>
	<46CBEB0E.8030200@sendu.me.uk>
Message-ID: <764978cf0708220306u77cedf22xdd132b324e306f33@mail.gmail.com>

Thanks a lot. It worked for me.

use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

# Pages to visit, in order. The first two set the cookies that the
# final report URL depends on.
my @urls = (
    'http://www.pdb.org/pdb/search/smartSubquery.do?smartSearchSubtype=HoldingsQuery&moleculeType=ignore&experimentalMethod=ignore',
    'http://www.pdb.org/pdb/results/tabularForm.do',
    'http://www.pdb.org/pdb/results/tabularReport.do?reportTitle=CustomReport&customReportColumns=VStructureSummary.structureId~VCitation.title&format=csv',
);

my $ua = LWP::UserAgent->new;
$ua->cookie_jar(HTTP::Cookies->new(file => "lwpcookies.txt",
                                   autosave => 1));

# Fetch each page in turn, stopping at the first failure.
my $response;
foreach my $url (@urls) {
    $response = $ua->get($url);
    die $response->status_line unless $response->is_success;
    print "\nSuccessfully connected to url $url\n";
}

# The last response holds the CSV report.
open(my $fh, '>', 'tabularResults.csv')
    or die "Cannot write tabularResults.csv: $!";
print $fh $response->content;
close($fh) or die "Cannot close tabularResults.csv: $!";


On 8/22/07, Sendu Bala <bix at sendu.me.uk> wrote:
>
> neeti somaiya wrote:
> > Hi,
> >
> > I wanted to automate my pdb script, right from downloading of data. As
> > per the solution given by RCSB about custom report for pdb ids and
> > titles only, I was trying something like the code below, but it doesn't
> > seem to work:
> >
> > use LWP::Simple;
> > my $url = 'http://www.pdb.org/pdb/results/tabularReport.do?reportTitle=CustomReport&customReportColumns=VStructureSummary.structureId~VCitation.title&format=csv';
> > my $content = get $url;
> > die "Couldn't get $url" unless defined $content;
> >
> > Can anyone tell me how I can do it, whether there is any other way to
> > do it, whether I am going wrong somewhere, or whether it isn't possible
> > for this case at all?
>
> Use LWP::UserAgent so you can see what's going on.
>
> my $ua = LWP::UserAgent->new;
> $ua->timeout(10);
> my $response = $ua->get($url);
> if ($response->is_success) {
>    print $response->content;
> }
> else {
>    die $response->status_line;
> }
>
>
> Gives:
> 500 Internal Server Error
>
> Most likely the server is expecting some kind of cookie and falls over
> when you try to visit that url without it. So start where they told you
> to and grab pages successively, keeping any cookies.
>


-- 
-Neeti
Even my blood says, B positive

From jay at jays.net  Wed Aug 22 08:54:29 2007
From: jay at jays.net (Jay Hannah)
Date: Wed, 22 Aug 2007 07:54:29 -0500
Subject: [Bioperl-l] wiki: Current Events
Message-ID: <24715480-EC15-493F-85C9-C367348E28F1@jays.net>

http://www.bioperl.org/wiki/Main_Page

Please change:

< BOSC 2007 will be held July 19-20, 2007
 > BOSC 2007 was held July 19-20, 2007

I'd change it but the page is locked. Even when I'm logged in.   :)

Thanks,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From cjfields at uiuc.edu  Wed Aug 22 09:58:32 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 22 Aug 2007 08:58:32 -0500
Subject: [Bioperl-l] wiki: Current Events
In-Reply-To: <24715480-EC15-493F-85C9-C367348E28F1@jays.net>
References: <24715480-EC15-493F-85C9-C367348E28F1@jays.net>
Message-ID: <A7C5314E-662C-4160-85B1-0225B95C0BD2@uiuc.edu>

Done.

chris

On Aug 22, 2007, at 7:54 AM, Jay Hannah wrote:

> http://www.bioperl.org/wiki/Main_Page
>
> Please change:
>
> < BOSC 2007 will be held July 19-20, 2007
>> BOSC 2007 was held July 19-20, 2007
>
> I'd change it but the page is locked. Even when I'm logged in.   :)
>
> Thanks,
>
> Jay Hannah
> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From shameer at ncbs.res.in  Wed Aug 22 15:45:42 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Thu, 23 Aug 2007 01:15:42 +0530 (IST)
Subject: [Bioperl-l] How to 'force' Bio::Graphics to draw image according to
 input file ?
In-Reply-To: <A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
	<A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
Message-ID: <44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>

Dear All,

Is there any option in Bio::Graphics to draw the image with the hits in
the order in which they are listed in the hits file?

For example I am using an input file:
# hit                    score   start   end
Query                    0       1       101
Sequence_Segment_1       0       1       101
PD:LRR_1|CS:AAC34139     0.16    1       23
PD:LRR_1|CS:AAC34139     3.6     1       22
PD:LRR_1|CS:AAC34139     1.8     1       22
PD:LRR_1|CS:AAC34139     1.3     1       22
PD:LRR_1|CS:XP_640228    2.5     2       23
..... Cropped
PD:LRR_1|CS:NP_611007    55      3       23
PD:LRR_1|CS:NP_611007    3.7     3       24
PD:LRR_1|CS:NP_611007    4.5     3       24
PD:LRR_1|CS:NP_611007    0.71    3       24

If you look at the image, you can see that it's all jumbled up and
doesn't make any sense at first look. I am looking for an option to
draw each glyph on its own line (say, separated by \n), rather than
having Bio::Graphics accommodate them internally.

PS. Image is attached with this mail.
I am using  Dr. L. Stein's example :

use strict;
use Bio::Graphics;
use Bio::SeqFeature::Generic;
my $panel = Bio::Graphics::Panel->new(-length => 700,
                                      -width  => 800,
                                      -pad_left => 10,
                                      -pad_right => 10,
                                     );

my $full_length = Bio::SeqFeature::Generic->new(-start=>1,-end=>700);
$panel->add_track($full_length,
                  -glyph   => 'arrow',
                  -tick    => 2,
                  -fgcolor => 'black',
                  -double  => 1,
                 );

my $track = $panel->add_track(
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.png
Type: image/png
Size: 27974 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070823/be285f43/attachment-0001.png 

From cjfields at uiuc.edu  Thu Aug 23 00:53:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 22 Aug 2007 23:53:55 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
Message-ID: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>

As many of the devs know, there are a number of Feature/Annotation  
issues that need to be resolved prior to a 1.6 release:

http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.2FAnnotation_changes:_Keep_or_roll_back.3F

There has been little work done over the last 2 1/2 years to undo or  
rectify problems associated with those additions; I feel like those  
of us still routinely contributing have been left holding the bag.   
There has also been very little attempt to document any of this  
adequately enough; as an example see POD for  
Bio::SeqFeature::Annotated (what little there is).

I would like to suggest the radical idea of rolling back AnnotatableI/ 
SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags  
are simple scalars) and possibly work in implementing Ewan's  
SeqFeature::TypedSeqFeatureI for those who want strong data types  
(i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various  
AnnotatableI changes, odd inheritance, and operator overloading have  
really obfuscated the code to the point where no one wants to touch  
it in case it breaks something important.  However, I believe it is  
the one serious impediment to a new stable release.

My thought is we simplify all the relevant interfaces, essentially  
reverting back to rel 1.4.  For instance, we move the various  
Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.   
Bio::SeqFeature::Annotated would implement Bio::AnnotatableI  
directly, and (if needed) also implement  
Bio::SeqFeature::TypedSeqFeatureI, so the onus is on  
Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI  
methods correctly, just as any other class would when implementing an  
abstract interface.  I have played around with this a bit and managed  
to get most tests working again for Bio::SeqFeature::Generic and  
FeatureIO but a number of others break.

If needed I can try this out on a branch (a bit ironic, since the  
changes instigating this mess should have been tested on a branch!).   
Maybe this will get the ball rolling towards a 1.6 release.  Any  
thoughts?

chris


From shameer at ncbs.res.in  Thu Aug 23 03:06:34 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Thu, 23 Aug 2007 12:36:34 +0530 (IST)
Subject: [Bioperl-l] How to 'force' Bio::Graphics to draw image
 according to input file ?
In-Reply-To: <44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
	<A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
	<44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>
Message-ID: <34980.192.168.1.1.1187852794.squirrel@mail.ncbs.res.in>

Dear All,

I will make my question simple:
Is there any way to force the 'Bio::Graphics' module to print only one
glyph in a track?

PS. A more detailed explanation is in my earlier mail (I don't want to
spam the community with the same mail again).

Eagerly waiting for a reply.
Thanks,
-- 
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From cain.cshl at gmail.com  Thu Aug 23 04:54:40 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 23 Aug 2007 04:54:40 -0400
Subject: [Bioperl-l] How to 'force' Bio::Graphics to draw
	image	according to input file ?
In-Reply-To: <34980.192.168.1.1.1187852794.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
	<A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
	<44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>
	<34980.192.168.1.1.1187852794.squirrel@mail.ncbs.res.in>
Message-ID: <1187859296.2546.6.camel@103.48.216.10.in-addr.arpa>

Shameer,

I don't think that's really what you want.  It seems to me that sorting
them in some useful way (say, by score) would make more sense.  There is
an example using the -sort_order option in Lincoln's howto.
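
To illustrate what "sorting by score" means for the rows in your file,
here is a plain-Perl sketch that does not touch Bio::Graphics at all.
The rows are copied from your mail; the hash keys and the choice of
"highest score first, ties broken by start" are assumptions made only
for this example.

```perl
use strict;
use warnings;

# Sample rows copied from the earlier mail: name, score, start, end.
my @rows = (
    "PD:LRR_1|CS:AAC34139    0.16  1  23",
    "PD:LRR_1|CS:AAC34139    3.6   1  22",
    "PD:LRR_1|CS:XP_640228   2.5   2  23",
    "PD:LRR_1|CS:NP_611007   55    3  23",
    "PD:LRR_1|CS:NP_611007   0.71  3  24",
);

# Turn each whitespace-separated row into a small hash.
my @hits = map {
    my ($name, $score, $start, $end) = split ' ', $_;
    +{ name => $name, score => $score, start => $start, end => $end };
} @rows;

# Highest score first; ties broken by start position.
my @sorted = sort {
    $b->{score} <=> $a->{score} || $a->{start} <=> $b->{start}
} @hits;

printf "%-24s %6s %4d %4d\n", @{$_}{qw(name score start end)} for @sorted;
```

Within Bio::Graphics the same effect should come from the -sort_order
track option (for instance 'high_score'), provided each feature carries
its score.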

Scott


On Thu, 2007-08-23 at 12:36 +0530, Shameer Khadar wrote:
> Dear All,
> 
> I will make my question simple :
> Is there any way to force the 'Bio::graphics' module to print only one
> glyph in a track ?
> 
> PS. More Detailed explanation is in my earlier mail (Dont want to spam the
> community with my same mail)
> 
> Eagerly waiting for a reply.
> Thanks,
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From cjfields at uiuc.edu  Thu Aug 23 10:14:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 23 Aug 2007 09:14:51 -0500
Subject: [Bioperl-l] extra rel. 1.6 suggestion
Message-ID: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>

Some interesting points by Sendu:

http://www.bioperl.org/wiki/Release_Schedule#Need_tests

which I agree with completely.

Maybe the best way out of this is a variation on something that was  
suggested before, which was 'splitting' the code into groups.  What  
if we set up a way to automatically gauge test coverage,  
documentation, etc.?  If I remember correctly Nathan had something  
running at one point which did this.

If so, we could determine which code is potentially 'non-compliant'  
and needs to be fixed (tests added, docs brought up to spec, so on),  
and thus prioritize at the minimum what needs to be done for a 1.6  
release.  If it's deemed not worth worrying about (no active  
development, author is out of contact, we have more important  
priorities) we split that code off into a separate 'dev' package.   
That would save some of the headache of trying to split maintenance  
of ~1000 modules up on only a few devs.

Thoughts?

chris

From bix at sendu.me.uk  Thu Aug 23 10:57:21 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 23 Aug 2007 15:57:21 +0100
Subject: [Bioperl-l] extra rel. 1.6 suggestion
In-Reply-To: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>
References: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>
Message-ID: <46CDA051.40408@sendu.me.uk>

Chris Fields wrote:
> Maybe the best way out if this is a variation on something that was  
> suggested before, which was 'splitting' the code into groups.  What  
> if we set up a way to automatically gauge test coverage,  
> documentation, etc.?  If I remember correctly Nathan had something  
> running at one point which did this.

You can generate this yourself by doing
./Build testcover

Mauricio was going to sort out having this run daily with the results 
displayed on the website... Mauricio?

The major 'annoyance' is that the coverage results don't get generated 
if any test fails. But they shouldn't be failing anyway ;)

From cain.cshl at gmail.com  Thu Aug 23 15:53:37 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 23 Aug 2007 15:53:37 -0400
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
Message-ID: <1187898817.2562.19.camel@localhost.localdomain>

Hi Chris,

GBrowse would be unaffected by this as it doesn't use
Bio::SeqFeature::Annotated.  The GMOD GFF3 Chado loader on the other
hand will almost certainly break horribly, as it depends on the strong
typing of Bio::FeatureIO/Bio::SeqFeature::Annotated.  If you could try
your ideas out in a branch that I could checkout and test on, that would
be good.

Thanks,
Scott


On Wed, 2007-08-22 at 23:53 -0500, Chris Fields wrote:
> As many of the devs know, there are a number of Feature/Annotation  
> issues that need to be resolved prior to a 1.6 release:
> 
> http://www.bioperl.org/wiki/Release_Schedule#SeqFeature. 
> 2FAnnotation_changes:_Keep_or_roll_back.3F
> 
> There has been little work done over the last 2 1/2 years to undo or  
> rectify problems associated with those additions; I feel like those  
> of us still routinely contributing have been left holding the bag.   
> There has also been very little attempt to document any of this  
> adequately enough; as an example see POD for  
> Bio::SeqFeature::Annotated (what little there is).
> 
> I would like to suggest the radical idea of rolling back AnnotatableI/ 
> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags  
> are simple scalars) and possibly work in implementing Ewan's  
> SeqFeature::TypedSeqFeatureI for those who want strong data types  
> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various  
> AnnotatableI changes, odd inheritance, and operator overloading have  
> really obfuscated the code to the point where no one wants to touch  
> it in case it breaks something important.  However, I believe it is  
> the one serious impediment to a new stable release.
> 
> My thought is we simplify all the relevant interfaces, essentially  
> reverting back to rel 1.4.  For instance, we move the various  
> Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.   
> Bio::SeqFeature::Annotated would implement Bio::AnnotatableI  
> directly, and (if needed) also implement  
> Bio::SeqFeature::TypedSeqFeatureI, so the impetus is on  
> Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI  
> methods correctly, just as any other class would when implementing an  
> abstract interface.  I have played around with this a bit and managed  
> to get most tests working again for Bio::SeqFeature::Generic and  
> FeatureIO but a number of others break.
> 
> If needed I can try this out on a branch (a bit ironic, since the  
> changes instigating this mess should have been tested on a branch!).   
> Maybe this will get the ball rolling towards a 1.6 release.  Any  
> thoughts?
> 
> chris
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From N.Haigh at sheffield.ac.uk  Thu Aug 23 16:32:12 2007
From: N.Haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 23 Aug 2007 21:32:12 +0100
Subject: [Bioperl-l] extra rel. 1.6 suggestion
In-Reply-To: <46CDA051.40408@sendu.me.uk>
References: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>
	<46CDA051.40408@sendu.me.uk>
Message-ID: <1187901132.46cdeeccce68d@webmail.shef.ac.uk>

Quoting Sendu Bala <bix at sendu.me.uk>:

> Chris Fields wrote:
> > Maybe the best way out if this is a variation on something that was  
> > suggested before, which was 'splitting' the code into groups.  What  
> > if we set up a way to automatically gauge test coverage,  
> > documentation, etc.?  If I remember correctly Nathan had something  
> > running at one point which did this.
> 
> You can generate this yourself by doing
> ./Build testcover

What I did was to patch Devel::Cover to include JavaScript that allows
sorting of the results by clicking a header in the table. This way, it
was easier to find the modules with poor POD coverage, or poor coverage
by any other metric. The developer(s) of Devel::Cover are introducing
this into their next release, but who knows when that release will be.
I could provide a diff, but we may be able to check out Devel::Cover
from cvs/svn until the 0.62 release is made.

> 
> Mauricio was going to sort out having this run daily with the results 
> displayed on the website... Mauricio?
> 
> The major 'annoyance' is that the coverage results don't get generated 
> if any test fails. But they shouldn't be failing anyway ;)
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From cjfields at uiuc.edu  Thu Aug 23 17:33:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 23 Aug 2007 16:33:25 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <1187898817.2562.19.camel@localhost.localdomain>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
	<1187898817.2562.19.camel@localhost.localdomain>
Message-ID: <38B989E4-34CA-42CD-A608-9D2A095E7ADF@uiuc.edu>

Scott,

So far most of FeatureIO.t passes, with only a few exceptions dealing  
with the from_feature method (I know what the problem is there).  A  
large number of other tests crash horribly (not so surprising), so  
I'll have to go through those.  Ergo any changes and testing will  
definitely be conducted on a branch then merged back to main trunk  
once everything is okay.  I'll probably start a branch in the next  
few days or so.

Here's what I have been working on so far, which I think is reasonable:

1) Move all *_tag_* related methods out of Bio::AnnotatableI and into  
Bio::SeqFeature::Annotated.

2) Reinstate the same tag methods in Bio::SeqFeatureI and remove  
Bio::AnnotatableI from the inheritance tree.

3) Make Bio::SeqFeature::Annotated implement Bio::AnnotatableI (which  
it already did, strangely enough).  Now it simply implements the proper  
methods from the interface classes SeqFeatureI and AnnotatableI.

4) Revert Bio::SeqFeature::Generic tags back to simple untyped  
strings (reimplement the 1.4 rel methods).
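
To make "simple untyped strings" concrete, here is a toy sketch.  The
package My::SimpleFeature is a made-up stand-in, not the real
Bio::SeqFeature::Generic; the method names are modelled on the
SeqFeatureI tag methods, and returning an empty list for an unknown tag
is a simplification chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical stand-in class, NOT the real Bio::SeqFeature::Generic.
# Tags map to plain lists of strings; no Annotation objects are created.
package My::SimpleFeature;

sub new {
    my ($class) = @_;
    return bless { tags => {} }, $class;
}

# Append one or more plain string values to a tag.
sub add_tag_value {
    my ($self, $tag, @values) = @_;
    push @{ $self->{tags}{$tag} }, @values;
    return;
}

sub has_tag {
    my ($self, $tag) = @_;
    return exists $self->{tags}{$tag} ? 1 : 0;
}

# Simplification: an unknown tag yields an empty list.
sub get_tag_values {
    my ($self, $tag) = @_;
    return @{ $self->{tags}{$tag} || [] };
}

sub get_all_tags {
    my ($self) = @_;
    my @names = sort keys %{ $self->{tags} };
    return @names;
}

# Delete a tag and hand back the values it held.
sub remove_tag {
    my ($self, $tag) = @_;
    my $removed = delete $self->{tags}{$tag};
    return $removed ? @$removed : ();
}

package main;

my $feat = My::SimpleFeature->new;
$feat->add_tag_value(gene => 'APOE');
$feat->add_tag_value(note => 'first', 'second');
print join(', ', $feat->get_tag_values('note')), "\n";
```

Since every value is a bare scalar, nothing has to be instantiated per
tag, which is where I would expect any SeqIO speedup to come from.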

I'm interested in seeing whether this results in a significant  
performance increase in SeqIO since the Annotation instantiation is  
removed.

ToDo: I plan on removing the operator overloading in Bio::Annotation,  
which was a serious sticking point with most of the devs.  This won't  
be done until after tests pass for everything else.

What we will need at some point which I can't provide:  
Bio::SeqFeature::Annotated has no docs (no synopsis, no  
description).  The reason I bring this up is Sendu and I are  
seriously considering running automated code audits in order to  
gauge which modules lack docs, test coverage, etc.  We're likely  
splitting those without adequate test/doc coverage off into a  
separate 'dev' release.
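
As a very rough idea of what the docs half of such an audit could look
like, here is a sketch.  The helper missing_pod_sections and the default
list of required sections are assumptions for illustration only; a real
audit would more likely lean on Devel::Cover or Pod::Coverage.

```perl
use strict;
use warnings;

# Hypothetical audit helper: report which of the wanted =head1 sections
# are missing from a module's source text.
sub missing_pod_sections {
    my ($source, @wanted) = @_;
    @wanted = qw(NAME SYNOPSIS DESCRIPTION) unless @wanted;
    my %seen;
    while ($source =~ /^=head1\s+(.+?)\s*$/mg) {
        $seen{ uc $1 } = 1;    # compare section titles case-insensitively
    }
    return grep { !$seen{$_} } @wanted;
}

# Made-up module source that has a NAME section only.
my $example = join "\n",
    'package My::Module;',
    '',
    '=head1 NAME',
    '',
    'My::Module - example only',
    '',
    '=cut',
    '',
    '1;',
    '';

my @missing = missing_pod_sections($example);
print "Missing sections: @missing\n";
```

Run over every .pm file in the tree, the non-empty results would give a
first list of candidates for the 'dev' split.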

chris

On Aug 23, 2007, at 2:53 PM, Scott Cain wrote:

> Hi Chris,
>
> GBrowse would be unaffected by this as it doesn't use
> Bio::SeqFeature::Annotated.  The GMOD GFF3 Chado loader on the other
> hand will almost certainly break horribly, as it depends on the strong
> typing of Bio::FeatureIO/Bio::SeqFeature::Annotated.  If you could try
> your ideas out in a branch that I could checkout and test on, that  
> would
> be good.
>
> Thanks,
> Scott
>
>
> On Wed, 2007-08-22 at 23:53 -0500, Chris Fields wrote:
>> As many of the devs know, there are a number of Feature/Annotation
>> issues that need to be resolved prior to a 1.6 release:
>>
>> http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.
>> 2FAnnotation_changes:_Keep_or_roll_back.3F
>>
>> There has been little work done over the last 2 1/2 years to undo or
>> rectify problems associated with those additions; I feel like those
>> of us still routinely contributing have been left holding the bag.
>> There has also been very little attempt to document any of this
>> adequately enough; as an example see POD for
>> Bio::SeqFeature::Annotated (what little there is).
>>
>> I would like to suggest the radical idea of rolling back  
>> AnnotatableI/
>> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
>> are simple scalars) and possibly work in implementing Ewan's
>> SeqFeature::TypedSeqFeatureI for those who want strong data types
>> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various
>> AnnotatableI changes, odd inheritance, and operator overloading have
>> really obfuscated the code to the point where no one wants to touch
>> it in case it breaks something important.  However, I believe it is
>> the one serious impediment to a new stable release.
>>
>> My thought is we simplify all the relevant interfaces, essentially
>> reverting back to rel 1.4.  For instance, we move the various
>> Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.
>> Bio::SeqFeature::Annotated would implement Bio::AnnotatableI
>> directly, and (if needed) also implement
>> Bio::SeqFeature::TypedSeqFeatureI, so the impetus is on
>> Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI
>> methods correctly, just as any other class would when implementing an
>> abstract interface.  I have played around with this a bit and managed
>> to get most tests working again for Bio::SeqFeature::Generic and
>> FeatureIO but a number of others break.
>>
>> If needed I can try this out on a branch (a bit ironic, since the
>> changes instigating this mess should have been tested on a branch!).
>> Maybe this will get the ball rolling towards a 1.6 release.  Any
>> thoughts?
>>
>> chris
>>
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From smarkel at accelrys.com  Thu Aug 23 17:59:37 2007
From: smarkel at accelrys.com (Scott Markel)
Date: Thu, 23 Aug 2007 14:59:37 -0700
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <38B989E4-34CA-42CD-A608-9D2A095E7ADF@uiuc.edu>
Message-ID: <OF1E1ED913.3FB67C57-ON88257340.00785855-88257340.0078D192@accelrys.com>

Chris,

Pipeline Pilot's Sequence Analysis Collection wraps BioPerl.
Once you think the branch changes have converged a bit we'd
be happy to try running our regression suite and report what
we find.

Scott

Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel at accelrys.com
Accelrys, Inc.                      mobile: +1 858 205 3653
10188 Telesis Court, Suite 100      voice:  +1 858 799 5603
San Diego, CA 92121                 fax:    +1 858 799 5222
USA                                 web:    http://www.accelrys.com


bioperl-l-bounces at lists.open-bio.org wrote on 23.08.2007 14:33:25:

> Scott,
> 
> So far most of FeatureIO.t passes, with only a few exceptions dealing 
> with the from_feature method (I know what the problem is there).  A 
> large number of other tests crash horribly (not so surprising), so 
> I'll have to go through those.  Ergo any changes and testing will 
> definitely be conducted on a branch then merged back to main trunk 
> once everything is okay.  I'll probably start a branch in the next 
> few days or so.
> 
> Here's what I have been working on so far, which I think is reasonable:
> 
> 1) Move all *_tag_* related methods out of Bio::AnnotatableI and into 
> Bio::SeqFeature::Annotated.
> 
> 2) Reinstate the same tag methods in Bio::SeqFeatureI and remove 
> Bio::AnnotatableI from the inheritance tree.
> 
> 3) Make Bio::SeqFeature::Annotated implement Bio::AnnotatableI (which it 
> already did, strangely enough).  Now it simply implements the proper 
> methods from the interface classes SeqFeatureI and AnnotatableI.
> 
> 4) Revert Bio::SeqFeature::Generic tags back to simple untyped 
> strings (reimplement the 1.4 rel methods).
> 
> I'm interested in seeing whether this results in a significant 
> performance increase in SeqIO since the Annotation instantiation is 
> removed.
> 
> ToDo: I plan on removing the operator overloading in Bio::Annotation, 
> which was a serious sticking point with most of the devs.  This won't 
> be done until after tests pass for everything else.
> 
> What we will need at some point which I can't provide: 
> Bio::SeqFeature::Annotated has no docs (no synopsis, no 
> description).  The reason I bring this up is Sendu and I are 
> seriously considering running automated code audits in order to 
> gauge which modules lack docs, test coverage, etc.  We're likely 
> splitting those without adequate test/doc coverage off into a 
> separate 'dev' release.
> 
> chris
> 
> On Aug 23, 2007, at 2:53 PM, Scott Cain wrote:
> 
> > Hi Chris,
> >
> > GBrowse would be unaffected by this as it doesn't use
> > Bio::SeqFeature::Annotated.  The GMOD GFF3 Chado loader on the other
> > hand will almost certainly break horribly, as it depends on the strong
> > typing of Bio::FeatureIO/Bio::SeqFeature::Annotated.  If you could try
> > your ideas out in a branch that I could check out and test on, that
> > would be good.
> >
> > Thanks,
> > Scott
> >
> >
> > On Wed, 2007-08-22 at 23:53 -0500, Chris Fields wrote:
> >> As many of the devs know, there are a number of Feature/Annotation
> >> issues that need to be resolved prior to a 1.6 release:
> >>
> >> http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.2FAnnotation_changes:_Keep_or_roll_back.3F
> >>
> >> There has been little work done over the last 2 1/2 years to undo or
> >> rectify problems associated with those additions; I feel like those
> >> of us still routinely contributing have been left holding the bag.
> >> There has also been very little attempt to document any of this
> >> adequately enough; as an example see POD for
> >> Bio::SeqFeature::Annotated (what little there is).
> >>
> >> I would like to suggest the radical idea of rolling back
> >> AnnotatableI/SeqFeatureI changes to a much simpler rel. 1.4-like
> >> behavior (tags are simple scalars) and possibly work in implementing Ewan's
> >> SeqFeature::TypedSeqFeatureI for those who want strong data types
> >> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various
> >> AnnotatableI changes, odd inheritance, and operator overloading have
> >> really obfuscated the code to the point where no one wants to touch
> >> it in case it breaks something important.  However, I believe it is
> >> the one serious impediment to a new stable release.
> >>
> >> My thought is we simplify all the relevant interfaces, essentially
> >> reverting back to rel 1.4.  For instance, we move the various
> >> Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.
> >> Bio::SeqFeature::Annotated would implement Bio::AnnotatableI
> >> directly, and (if needed) also implement
> >> Bio::SeqFeature::TypedSeqFeatureI, so the onus is on
> >> Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI
> >> methods correctly, just as any other class would when implementing an
> >> abstract interface.  I have played around with this a bit and managed
> >> to get most tests working again for Bio::SeqFeature::Generic and
> >> FeatureIO but a number of others break.
> >>
> >> If needed I can try this out on a branch (a bit ironic, since the
> >> changes instigating this mess should have been tested on a branch!).
> >> Maybe this will get the ball rolling towards a 1.6 release.  Any
> >> thoughts?
> >>
> >> chris
> >>
> > -- 
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                   cain at cshl.edu
> > GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> > Cold Spring Harbor Laboratory
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug 23 20:39:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 23 Aug 2007 19:39:30 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <OF1E1ED913.3FB67C57-ON88257340.00785855-88257340.0078D192@accelrys.com>
References: <OF1E1ED913.3FB67C57-ON88257340.00785855-88257340.0078D192@accelrys.com>
Message-ID: <241563BB-F96A-4631-B504-F73699FDE84B@uiuc.edu>

Having an independent test would be great!  The reason I suggest  
there may be a speedup: one complaint popping up after 1.5 was the  
slowdown in sequence parsing, which could be related to the 'heavier'  
objectified tags.
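
To make the "heavier objectified tags" point concrete, here is a rough
pure-Perl sketch of the two storage styles under discussion. It is only an
illustration of why wrapping every tag value in an object costs an extra
constructor call on write and a method call on read; My::TagValue and the two
helper subs are hypothetical stand-ins, not BioPerl classes or methods, and
nothing here measures the real SeqIO slowdown.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for an annotation object; NOT a BioPerl class.
package My::TagValue;
sub new   { my ($class, %args) = @_; return bless { value => $args{-value} }, $class }
sub value { return $_[0]{value} }

package main;

# 1.4-style storage: each tag maps to a list of plain strings.
sub add_scalar_tag {
    my ($feat, $tag, $value) = @_;
    push @{ $feat->{$tag} }, $value;
}

# Simplified "typed" storage: each value is wrapped in an object.
sub add_object_tag {
    my ($feat, $tag, $value) = @_;
    push @{ $feat->{$tag} }, My::TagValue->new(-value => $value);
}

my (%plain, %wrapped);
for my $i (1 .. 3) {
    add_scalar_tag(\%plain,   'note', "value$i");
    add_object_tag(\%wrapped, 'note', "value$i");
}

# Plain tags are read directly; wrapped tags need one method call per value.
my @plain_values   = @{ $plain{note} };
my @wrapped_values = map { $_->value } @{ $wrapped{note} };

print "plain:   @plain_values\n";
print "wrapped: @wrapped_values\n";
```

Both lists hold the same strings; the difference is only in how much work
each value costs, which is what an independent timing run would need to
confirm or refute.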

chris

On Aug 23, 2007, at 4:59 PM, Scott Markel wrote:

> Chris,
>
> Pipeline Pilot's Sequence Analysis Collection wraps BioPerl.
> Once you think the branch changes have converged a bit we'd
> be happy to try running our regression suite and report what
> we find.
>
> Scott
>
> Scott Markel, Ph.D.
> Principal Bioinformatics Architect  email:  smarkel at accelrys.com
> Accelrys, Inc.                      mobile: +1 858 205 3653
> 10188 Telesis Court, Suite 100      voice:  +1 858 799 5603
> San Diego, CA 92121                 fax:    +1 858 799 5222
> USA                                 web:    http://www.accelrys.com
>
>
> bioperl-l-bounces at lists.open-bio.org wrote on 23.08.2007 14:33:25:
>
>> Scott,
>>
>> So far most of FeatureIO.t passes, with only a few exceptions dealing
>> with the from_feature method (I know what the problem is there).  A
>> large number of other tests crash horribly (not so surprising), so
>> I'll have to go through those.  Ergo any changes and testing will
>> definitely be conducted on a branch then merged back to main trunk
>> once everything is okay.  I'll probably start a branch in the next
>> few days or so.
>>
>> Here's what I have been working on so far, which I think is reasonable:
>>
>> 1) Move all *_tag_* related methods out of Bio::AnnotatableI and into
>> Bio::SeqFeature::Annotated.
>>
>> 2) Reinstate the same tag methods in Bio::SeqFeatureI and remove
>> Bio::AnnotatableI from the inheritance tree.
>>
>> 3) Make Bio::SeqFeature::Annotated implement Bio::AnnotatableI (which it
>> already did, strangely enough).  Now it simply implements the proper
>> methods from the interface classes SeqFeatureI and AnnotatableI.
>>
>> 4) Revert Bio::SeqFeature::Generic tags back to simple untyped
>> strings (reimplement the 1.4 rel methods).
>>
>> I'm interested in seeing whether this results in a significant
>> performance increase in SeqIO since the Annotation instantiation is
>> removed.
>>
>> ToDo: I plan on removing the operator overloading in Bio::Annotation,
>> which was a serious sticking point with most of the devs.  This won't
>> be done until after tests pass for everything else.
>>
>> What we will need at some point which I can't provide:
>> Bio::SeqFeature::Annotated has no docs (no synopsis, no
>> description).  The reason I bring this up is Sendu and I are
>> seriously considering running automated code audits in order to
>> gauge which modules lack docs, test coverage, etc.  We're likely
>> splitting those without adequate test/doc coverage off into a
>> separate 'dev' release.
>>
>> chris
>>
>> On Aug 23, 2007, at 2:53 PM, Scott Cain wrote:
>>
>>> Hi Chris,
>>>
>>> GBrowse would be unaffected by this as it doesn't use
>>> Bio::SeqFeature::Annotated.  The GMOD GFF3 Chado loader on the other
>>> hand will almost certainly break horribly, as it depends on the strong
>>> typing of Bio::FeatureIO/Bio::SeqFeature::Annotated.  If you could try
>>> your ideas out in a branch that I could check out and test on, that
>>> would be good.
>>>
>>> Thanks,
>>> Scott
>>>
>>>
>>> On Wed, 2007-08-22 at 23:53 -0500, Chris Fields wrote:
>>>> As many of the devs know, there are a number of Feature/Annotation
>>>> issues that need to be resolved prior to a 1.6 release:
>>>>
>>>> http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.2FAnnotation_changes:_Keep_or_roll_back.3F
>>>>
>>>> There has been little work done over the last 2 1/2 years to undo or
>>>> rectify problems associated with those additions; I feel like those
>>>> of us still routinely contributing have been left holding the bag.
>>>> There has also been very little attempt to document any of this
>>>> adequately enough; as an example see POD for
>>>> Bio::SeqFeature::Annotated (what little there is).
>>>>
>>>> I would like to suggest the radical idea of rolling back
>>>> AnnotatableI/SeqFeatureI changes to a much simpler rel. 1.4-like
>>>> behavior (tags are simple scalars) and possibly work in implementing Ewan's
>>>> SeqFeature::TypedSeqFeatureI for those who want strong data types
>>>> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various
>>>> AnnotatableI changes, odd inheritance, and operator overloading have
>>>> really obfuscated the code to the point where no one wants to touch
>>>> it in case it breaks something important.  However, I believe it is
>>>> the one serious impediment to a new stable release.
>>>>
>>>> My thought is we simplify all the relevant interfaces, essentially
>>>> reverting back to rel 1.4.  For instance, we move the various
>>>> Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.
>>>> Bio::SeqFeature::Annotated would implement Bio::AnnotatableI
>>>> directly, and (if needed) also implement
>>>> Bio::SeqFeature::TypedSeqFeatureI, so the onus is on
>>>> Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI
>>>> methods correctly, just as any other class would when implementing an
>>>> abstract interface.  I have played around with this a bit and managed
>>>> to get most tests working again for Bio::SeqFeature::Generic and
>>>> FeatureIO but a number of others break.
>>>>
>>>> If needed I can try this out on a branch (a bit ironic, since the
>>>> changes instigating this mess should have been tested on a branch!).
>>>> Maybe this will get the ball rolling towards a 1.6 release.  Any
>>>> thoughts?
>>>>
>>>> chris
>>>>
>>> -- 
>>> ------------------------------------------------------------------------
>>> Scott Cain, Ph. D.                                   cain at cshl.edu
>>> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
>>> Cold Spring Harbor Laboratory
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Thu Aug 23 23:34:12 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 23 Aug 2007 23:34:12 -0400
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
Message-ID: <CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>


On Aug 23, 2007, at 12:53 AM, Chris Fields wrote:

> There has been little work done over the last 2 1/2 years to undo or
> rectify problems associated with those additions; I feel like those
> of us still routinely contributing have been left holding the bag.

Not by intention, but unfortunately that's probably a fair  
assessment. (And I'm one of those guilty of inaction.)

> [...]
> I would like to suggest the radical idea of rolling back
> AnnotatableI/SeqFeatureI changes to a much simpler rel. 1.4-like
> behavior (tags are simple scalars) and possibly work in implementing Ewan's
> SeqFeature::TypedSeqFeatureI for those who want strong data types
> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).

I fully support this; to me that sounds exactly like the way to go.

> The various AnnotatableI changes, odd inheritance, and operator  
> overloading have
> really obfuscated the code to the point where no one wants to touch
> it in case it breaks something important.  However, I believe it is
> the one serious impediment to a new stable release.

Yes, I think you're hitting the nail on the head.

Chris, if you take the lead on this and carry it through we will all  
owe you hugely. I'm not sure how many beers that would compare to,  
but I'll throw in some. (Who else do I owe beer? I'm losing track.  
Strangely nobody tried to redeem beer from me in Vienna. Maybe in  
Toronto?)

Seriously, rectifying this problem would lift a huge weight.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From florent.angly at gmail.com  Fri Aug 24 00:43:23 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Thu, 23 Aug 2007 21:43:23 -0700
Subject: [Bioperl-l] Is it possible to do contig alignments?
Message-ID: <46CE61EB.5000300@gmail.com>

Dear list members,

I would like to "produce" an alignment of a contig, or more exactly 
visualize it in such a fashion, based on the aligned sequences provided 
to me by a sequence assembler:

Consensus: ACGTACGTTG
Sequence1: ACG-AC
Sequence2:  CGTACGT
Sequence3:     AC-TTG

It sounds like a very trivial task but after searching for a long time, 
it seems impossible using the methods BioPerl provides.

Using the Bio::Align classes, it seems like the only way is if the 
sequences have the same aligned length, i.e. like this:

Consensus: ACGTACGTTG
Sequence1: ACG-AC----
Sequence2: -CGTACGT--
Sequence3: ----AC-TTG

It's not very satisfactory if I have to pad the sequences with gaps 
manually. In the context of a phylogenetic alignment, it might make 
sense, but not for contigs.

For assemblies, whole sequences are mapped onto contigs. Bio::LocatableSeq 
does not help here because it defines locations _within_ the sequence 
(the name LocatableSeq was pretty misleading to me).

I think it's all very strange that contigs have the coordinates of the 
aligned sequences composing them but there is no straightforward way to 
exploit this information.

So what's the bottom line? Am I missing something obvious, an 
out-of-the-box solution? Is it a "missing feature" of BioPerl that is 
planned to be implemented in the future or that should be requested? 
Should I pad my sequences with dashes or spaces after assembly? Or is it 
expected that my aligned reads coming from my assembly be padded with 
lots of gaps at their beginning and end? What's the BioPerl philosophy here?

Thanks for giving me pointers,

Florent

From bix at sendu.me.uk  Fri Aug 24 04:35:23 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 24 Aug 2007 09:35:23 +0100
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CE61EB.5000300@gmail.com>
References: <46CE61EB.5000300@gmail.com>
Message-ID: <46CE984B.3060701@sendu.me.uk>

Florent Angly wrote:
> Dear list members,
> 
> I would like to "produce" an alignment of a contig, or more exactly 
> visualize it in such a fashion, based on the aligned sequences provided 
> to me by a sequence assembler:
> 
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC
> Sequence2:  CGTACGT
> Sequence3:     AC-TTG
> 
> It sounds like a very trivial task but after searching for a long time, 
> it seems impossible using the methods BioPerl provides.

Isn't Bio::Assembly::Contig what you need?

http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Assembly/Contig.html

From zhaodj at ioz.ac.cn  Fri Aug 24 05:34:07 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Fri, 24 Aug 2007 17:34:07 +0800 (CST)
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CE61EB.5000300@gmail.com>
References: <46CE61EB.5000300@gmail.com>
Message-ID: <51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>

On Fri, Aug 24, 2007 12:43, Florent Angly wrote:
> Dear list members,
>
> I would like to "produce" an alignment of a contig, or more exactly
> visualize it in such a fashion, based on the aligned sequences provided
> to me by a sequence assembler:
>
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC
> Sequence2:  CGTACGT
> Sequence3:     AC-TTG
>
> It sounds like a very trivial task but after searching for a long time,
> it seems impossible using the methods BioPerl provides.
>
> Using the Bio::Align classes, it seems like the only way is if the
> sequences have the same aligned length, i.e. like this:
>
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC----
> Sequence2: -CGTACGT--
> Sequence3: ----AC-TTG
>
> It's not very satisfactory if I have to pad the sequences with gaps
> manually. In the context of a phylogenetic alignment, it might make
> sense, but not for contigs.

How do you pad the sequences with gaps manually? Do you just replace the
hyphens with blanks? If so, you can write a Perl script to automate
this process.

> For assemblies whole sequences are mapped on contigs. Bio::LocatableSeq
> does not help here because it defines locations _within_ the sequence
> (the name LocatableSeq was pretty misleading to me).
>
> I think it's all very strange that contigs have the coordinates of the
> aligned sequences composing them but there is no straightforward way to
> exploit this information.
>
> So what's the bottom line? Am I missing something obvious, an
> out-of-the-box solution? Is it a "missing feature" of BioPerl that is
> planned to be implemented in the future or that should be requested?
> Should I pad my sequences with dashes or spaces after assembly? Or is it
> expected that my aligned reads coming from my assembly be padded with
> lots of gaps at their beginning and end? What's the BioPerl philosophy here?
>
> Thanks for giving me pointers,
>
> Florent


-- 
De-Jian Zhao
Institute of Zoology,Chinese Academy of Sciences
+86-10-64807217
zhaodj at ioz.ac.cn


From marian.thieme at arcor.de  Fri Aug 24 06:05:55 2007
From: marian.thieme at arcor.de (Marian Thieme)
Date: Fri, 24 Aug 2007 12:05:55 +0200
Subject: [Bioperl-l] ReseqChip, module/package name
Message-ID: <46CEAD83.2050904@arcor.de>

Hi,

2 questions about the naming of the module I submitted
(see http://bugzilla.open-bio.org/show_bug.cgi?id=2332)

1.) The package:
Because an expression package already exists, I suggest creating a
new package called resequencing

2.) I would suggest that the module is called RedundantFragments or
AdditionalFragments

so we would have something like:

Bio::Resequencing::AdditionalFragments

Any other ideas ?

Marian

By the way, can anybody please change my email address to
marian.thieme at arcor.de in Bugzilla as well as on the BioPerl list?
I didn't manage to do that on my own...


From mcons004 at fiu.edu  Thu Aug 23 23:30:44 2007
From: mcons004 at fiu.edu (mcons004 at fiu.edu)
Date: Thu, 23 Aug 2007 23:30:44 -0400 (EDT)
Subject: [Bioperl-l] please some help
Message-ID: <20070823233044.BJQ45014@mailstore2.fiu.edu>

  Hello,
     I am new to this software and I am having some trouble getting started. The version of Perl I am working with is v5.8.6. My OS is Unix (Mac OS X). I am trying to use BioPerl with a script called blastParser.pl to process a file which is the output of a "blastall" operation.
  
 The command that gives me the error is:
> perl blastParser.pl junk.out 1 1 1.0
 and the error message says:
Can't locate Bio/SearchIO.pm in @INC (@INC contains: /System/Library/Perl/5.8.6/darwin-thread-multi-2level

 Your online info says this probably means that the module Bio::SearchIO is not installed and that I can either install Bundle::Bioperl or install that specific module by hand. Could you give me some tips on this? I am new to working with Unix and BioPerl, so I am a little confused. Any information will be helpful. Thanks

From bix at sendu.me.uk  Fri Aug 24 10:38:39 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 24 Aug 2007 15:38:39 +0100
Subject: [Bioperl-l] please some help
In-Reply-To: <20070823233044.BJQ45014@mailstore2.fiu.edu>
References: <20070823233044.BJQ45014@mailstore2.fiu.edu>
Message-ID: <46CEED6F.1080101@sendu.me.uk>

mcons004 at fiu.edu wrote:
> Hello, I am new to this software and I am having some trouble
> starting. The version of Bioperl I am working on is v5.8.6. My OS is
> Unix (Mac OS X). I am trying to use Bioperl with a file called
> blastParser to process a file which is the output of a "blastall"
> operation.
> 
> The code that gives me error is:
>> perl blastParser.pl junk.out 1 1 1.0
> and the error message says: Can't locate Bio/SearchIO.pm in @INC
> (@INC contains: /System/Library/Perl/5.8.6/darwin-thread-multi-2level
> 
> 
> Your online info says this probably means that the module
> Bio::SearchIO is not installed and that I can either install
> Bundle::Bioperl or install that specific module by hand. Could you
> give me some tips on this? I am new to working with Unix and BioPerl, so
> I am a little confused.

You need to install Bioperl first. You can find instructions here:
http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix

If this is your own Mac (you have the root/admin password), when it 
tells you to run cpan (">perl -MCPAN -e shell" or ">cpan"), start the 
command with 'sudo'. So:

 >sudo cpan
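
Before and after installing, it can help to check whether perl can actually
find the module named in the error message. The following is a small sketch
using only core Perl; module_available is a helper name made up for this
example, and the script simply reports whether each module can be loaded from
the directories listed in @INC.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Try to load a module by name; return 1 on success, 0 if it cannot be loaded.
# (module_available is a hypothetical helper, not part of BioPerl.)
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Bio::SearchIO -> Bio/SearchIO.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(strict Bio::SearchIO)) {
    printf "%-15s %s\n", $module,
        module_available($module) ? 'found' : 'NOT found in @INC';
}

# Show the directories perl searches, as listed in the error message.
print "  $_\n" for @INC;
```

If Bio::SearchIO is still reported as not found after the install, the
modules were probably installed for a different perl than the one running
the script.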

From florent.angly at gmail.com  Fri Aug 24 12:07:04 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Fri, 24 Aug 2007 09:07:04 -0700
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
References: <46CE61EB.5000300@gmail.com>
	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
Message-ID: <46CF0228.2000404@gmail.com>

Thanks for all the replies.

Sendu Bala wrote:

> Isn't Bio::Assembly::Contig what you need?
>
> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Assembly/Contig.html
>
I'm using this module already to manipulate the contigs, but there's no
option that I know of to _display_ the contigs in the way I described.
(Sorry, the title of my email was misleading.)


De-Jian,ZHAO wrote:
> How do you pad the sequences with gaps manually? Just replace the
> hyphens with blanks? If yes, you can program in perl to automate
> this process.
>   
How do I pad the sequences manually? I calculate how many gaps have to
go left and right of the aligned sequence based on its length, its
position in the aligned consensus and the consensus length:
my $newseq = ('-' x $leftnum) . $seq . ('-' x $rightnum);
By the way, the sequences cannot be stored with blanks in them...
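
Spelled out as a runnable sketch, the padding calculation described above
might look like this. It uses plain Perl only (no Bio::Align or Bio::Assembly
calls); pad_read is a hypothetical helper, and the 1-based start column of
each read on the consensus is an assumed convention, with the values taken
from the example alignment at the start of this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pad an aligned read with gap characters so it spans the whole consensus.
# $start is the 1-based consensus column of the read's first character
# (an assumed convention for this sketch).
sub pad_read {
    my ($seq, $start, $cons_len) = @_;
    my $leftnum  = $start - 1;
    my $rightnum = $cons_len - $leftnum - length($seq);
    die "read does not fit inside the consensus\n"
        if $leftnum < 0 || $rightnum < 0;
    return ('-' x $leftnum) . $seq . ('-' x $rightnum);
}

my $consensus = 'ACGTACGTTG';
my @reads = (
    [ Sequence1 => 'ACG-AC',  1 ],
    [ Sequence2 => 'CGTACGT', 2 ],
    [ Sequence3 => 'AC-TTG',  5 ],
);

print "Consensus: $consensus\n";
for my $read (@reads) {
    my ($name, $seq, $start) = @$read;
    print "$name: ", pad_read($seq, $start, length($consensus)), "\n";
}
```

With these inputs the three reads come out padded to the consensus length,
matching the equal-length alignment shown earlier in the thread.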

I think the best way to provide an out-of-the-box solution for
displaying contigs the described way would be to _not_ use Bio::Align at
all, but rather to create a new assembly IO module like
Bio::Assembly::IO::simpleout for example. That would be useful.

The reason I wanted to visualize these contigs is because I made a
Bio::Assembly::IO module for TIGR Assembler files that I intend on
submitting to BioPerl. I wanted to make sure first that I did not have
any obvious bug in my contig coordinates. I've read the documentation on
the wiki, so if a BioPerl developer would like to step up and
contact me directly to check my code, that would be nice =)

Florent

From cjfields at uiuc.edu  Fri Aug 24 12:07:36 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 11:07:36 -0500
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip, module/package name
In-Reply-To: <46CEAD83.2050904@arcor.de>
References: <46CEAD83.2050904@arcor.de>
Message-ID: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>

Marian,

First, apologies about not getting on this sooner.  It's shaping up  
to be a busy year!

The new package: How about Bio::Expression::Tools::MitoChip?  My  
reasoning: I don't think it's necessary to define a new  
Bio::Resequencing namespace for just one module; my inclination is  
towards using Bio::Expression namespace as Bio::Tools have been  
traditionally reserved for output parsers.  I am unsure what the  
Bio::Expression status is (very little is documented, no tests are  
written, nothing on the mail list archives); maybe Allen can answer  
that?  I don't see anything that precludes you from using that  
namespace as long as your tools are fairly well-defined (they are)  
and have tests (they do).

Also, your module deals with doing one specific thing (extraction and  
incorporation of information about redundant fragments) for the Affy  
MitoChip.  It might be worth genericizing the class a bit so that you  
can add new parser or analysis methods w/o having to define new  
classes to deal with the same Mitochip data.

Mail list: The mail list subscription page
(http://bioperl.org/mailman/listinfo/bioperl-l) allows you to subscribe
or change subscription options (at the bottom of the page).

Bugzilla: if you are logged into Bugzilla under your old email, there  
is an option at the bottom of the page (Edit : Prefs) where you can  
change your email address and other preferences.

chris

On Aug 24, 2007, at 5:05 AM, Marian Thieme wrote:

> Hi,
>
> 2 questions about the naming of the module I did submit
> (see http://bugzilla.open-bio.org/show_bug.cgi?id=2332)
>
> 1.) The package:
> Because an expression package already exists, I suggest creating a
> new package called resequencing
>
> 2.) I would suggest that the module is called RedundantFragments or
> AdditionalFragments
>
> so we would have something like:
>
> Bio::Resequencing::AdditionalFragments
>
> Any other ideas ?
>
> Marian
>
> By the way, can anybody please change my email address to
> marian.thieme at arcor.de in Bugzilla as well as on the BioPerl list?
> I didn't manage to do that on my own...

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Aug 24 12:23:12 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 11:23:12 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
	<CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>
Message-ID: <4F5FD173-FC80-4F70-B294-83DA58FDCE64@uiuc.edu>

On Aug 23, 2007, at 10:34 PM, Hilmar Lapp wrote:

> On Aug 23, 2007, at 12:53 AM, Chris Fields wrote:
>
>> There has been little work done over the last 2 1/2 years to undo or
>> rectify problems associated with those additions; I feel like those
>> of us still routinely contributing have been left holding the bag.
>
> Not by intention, but unfortunately that's probably a fair  
> assessment. (And I'm one of those guilty of inaction.)

Not completely.  You, Jason, Chris M., and several others expressed  
yourselves quite clearly (move the code to a branch and test).  I  
think that everyone was trying to be diplomatic about it and so never  
attempted to do anything except get it working correctly.

>> [...]
>> I would like to suggest the radical idea of rolling back
>> AnnotatableI/SeqFeatureI changes to a much simpler rel. 1.4-like
>> behavior (tags are simple scalars) and possibly work in implementing Ewan's
>> SeqFeature::TypedSeqFeatureI for those who want strong data types
>> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).
>
> I fully support this; to me that sounds exactly like the way to go.

Okay, I'll probably go ahead and get a branch started today.  I'll  
have to look at Ewan's interface in more detail; it's possible a new  
SeqFeature implementation will need to be written up to incorporate it.

>> The various AnnotatableI changes, odd inheritance, and operator  
>> overloading have
>> really obfuscated the code to the point where no one wants to touch
>> it in case it breaks something important.  However, I believe it is
>> the one serious impediment to a new stable release.
>
> Yes, I think you're hitting the nail on the head.
>
> Chris, if you take the lead on this and carry it through we will  
> all owe you hugely. I'm not sure how many beers that would compare  
> to, but I'll throw in some. (Who else do I owe beer? I'm losing  
> track. Strangely nobody tried to redeem beer from me in Vienna.  
> Maybe in Toronto?)
>
> Seriously, rectifying this problem would lift a huge weight.
>
> 	-hilmar

It would be nice to get regular releases started again.  I think  
this'll help.

chris

From marian.thieme at arcor.de  Fri Aug 24 13:01:07 2007
From: marian.thieme at arcor.de (Marian Thieme)
Date: Fri, 24 Aug 2007 19:01:07 +0200
Subject: [Bioperl-l] Bio::Expression & Re: ReseqChip, module/package name
Message-ID: <46CF0ED3.8000708@arcor.de>

> The new package: How about Bio::Expression::Tools::MitoChip?  My  
> reasoning: I don't think it's necessary to define a new  
> Bio::Resequencing namespace for just one module; my inclination is  
> towards using Bio::Expression namespace as Bio::Tools have been  
> traditionally reserved for output parsers.  I am unsure what the  
> Bio::Expression status is (very little is documented, no tests are  
> written, nothing on the mail list archives); maybe Allen can answer  
> that?  I don't see anything that precludes you from using that  
> namespace as long as your tools are fairly well-defined (they are)  
> and have tests (they do).

The problem I see with Bio::Expression is that resequencing chips are
not expression chips.
(Expression chips are designed to hybridize RNA strands and hence
measure RNA expression levels; a resequencing chip, on the other hand,
is based on DNA, and its design and probe length are quite different.)
So, from my point of view, it makes sense to distinguish between DNA
and RNA chips, at least.

>
> Also, your module deals with doing one specific thing (extraction and  
> incorporation of information about redundant fragments) for the Affy  
> MitoChip.  It might be worth genericizing the class a bit so that you  
> can add new parser or analysis methods w/o having to define new  
> classes to deal with the same Mitochip data.

OK, need to think about that.

>
> Mail list: The mail list subscription page
> (http://bioperl.org/mailman/listinfo/bioperl-l) allows you to subscribe
> or change subscription options (at the bottom of the page).
>
Understood.

> Bugzilla: if you are logged into Bugzilla under your old email, there  
> is an option at the bottom of the page (Edit : Prefs) where you can  
> change your email address and other preferences.
>
Unfortunately I didn't receive a mail to confirm the change. I tried that
twice.


Marian

From bix at sendu.me.uk  Fri Aug 24 12:43:22 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 24 Aug 2007 17:43:22 +0100
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CF0228.2000404@gmail.com>
References: <46CE61EB.5000300@gmail.com>	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
	<46CF0228.2000404@gmail.com>
Message-ID: <46CF0AAA.4090301@sendu.me.uk>

Florent Angly wrote:
> Thanks for all the replies.
> 
> Sendu Bala wrote:
> 
>> Isn't Bio::Assembly::Contig what you need?
>
> I'm using this module already to manipulate the contigs, but there's 
> no option that I know of to _display_ the contigs in the way I 
> described.
[snip]
> I think the best way to provide an out-of-the-box solution for 
> displaying contigs the described way would be to _not_ use Bio::Align
> at all, but rather to create a new assembly IO module like 
> Bio::Assembly::IO::simpleout for example. That would be useful.

Yes...


> The reason I wanted to visualize these contigs is because I made a 
> Bio::Assembly::IO module for TIGR Assembler files that I intend on 
> submitting to BioPerl.

That's wonderful... might I cheekily suggest that the solution to your
problem is to extend your IO module so that it does the 'O' as well? I.e.,
unlike the other IO modules, actually implement write_assembly().
Then you can round-trip to ensure your next_assembly() method has no bugs.


> I've read the documentation on the Wiki so if a BioPerl developer
> would please like to step up and contact me directly for checking my
> code, that would be nice =)

If no one does, post it as an enhancement request to bugzilla. A test
script is a must.

http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests

From cjfields at uiuc.edu  Fri Aug 24 13:16:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 12:16:10 -0500
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CF0228.2000404@gmail.com>
References: <46CE61EB.5000300@gmail.com>
	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
	<46CF0228.2000404@gmail.com>
Message-ID: <32D5D3FF-D0A5-4EEB-BA5E-B0087CC64B19@uiuc.edu>


On Aug 24, 2007, at 11:07 AM, Florent Angly wrote:
...

> De-Jian,ZHAO wrote:
>> How do you pad the sequences with gaps manually? Just replace the
>> hyphens with blanks? If yes, you can program in perl to automate
>> this process.
>>
> How do I pad the sequences manually?? I calculate how many gaps  
> have to
> go left and right of the aligned sequence based on its length, its
> position in the aligned consensus and the consensus length.
> my $newseq = ('-' x $leftnum) . $seq . ('-' x $rightnum);
> By the way, the sequences cannot be stored with blanks in them...
>
> I think the best way to provide an out-of-the-box solution for
> displaying contigs the described way would be to _not_ use  
> Bio::Align at
> all, but rather to create a new assembly IO module like
> Bio::Assembly::IO::simpleout for example. That would be useful.
>
> The reason I wanted to visualize these contigs is because I made a
> Bio::Assembly::IO module for TIGR Assembler files that I intend on
> submitting to BioPerl. I wanted to make sure first that I did not have
> any obvious bug in my contig coordinates. I've read the documentation
> on the Wiki so if a BioPerl developer would please like to step up and
> contact me directly for checking my code, that would be nice =)
>
> Florent

A similar question has been previously asked on the same subject:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2827/focus=2869

Jason's suggestion was to have a Bio::Assembly::Contig method get_aln()
which produces a Bio::SimpleAlign object containing appropriately
padded seqs compatible for AlignIO output.  However, the method was
never implemented.

Personally, the way I would try going about this would be to
implement the Contig::get_aln() method, padding with
bioperl-compliant alignment gap symbols (currently -.*?=~), so if
anyone wanted they could write to any AlignIO-implemented format (MSF,
Clustal, etc).  In your Bio::Assembly::IO::simpleout module implement
write_assembly() and use the Contig::get_aln() method where needed to
grab the SimpleAlign, then simply substitute gap symbols with spaces
when writing contig output.
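
To make the padding idea concrete, here is a rough plain-Perl sketch. It
does not use any BioPerl objects (get_aln() does not exist yet), the
helper names pad_read() and display_row() are made up for illustration,
and only '-' is treated as a gap symbol. Read positions are assumed to
be 1-based columns in the consensus.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): pad one read with gap symbols
# so that it spans the full consensus length. $start is the 1-based column
# of the read's first base in the consensus.
sub pad_read {
    my ($seq, $start, $consensus_length) = @_;
    my $left  = $start - 1;
    my $right = $consensus_length - $left - length($seq);
    die "read does not fit inside the consensus\n" if $left < 0 || $right < 0;
    return ('-' x $left) . $seq . ('-' x $right);
}

# Hypothetical helper: turn the padding into blanks for display only.
# Gaps inside the read are kept; only leading and trailing runs change.
sub display_row {
    my ($padded) = @_;
    my ($lead, $body, $trail) = $padded =~ /^(-*)(.*?)(-*)$/;
    return (' ' x length($lead)) . $body . (' ' x length($trail));
}

my $consensus = 'ACGTACGTACGT';
my @reads     = ( [ 'ACGTAC', 1 ], [ 'TAC-TA', 4 ], [ 'CGT', 10 ] );

print "$consensus\n";
for my $read (@reads) {
    my ($seq, $start) = @$read;
    my $padded = pad_read( $seq, $start, length($consensus) );
    print display_row($padded), "\n";
}
```

The padded strings are what would go into an alignment object; the
blank-substituted rows are only meant for the human-readable contig
output.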

In general, any new code is attached to a bugzilla report as an  
enhancement request:

http://bugzilla.open-bio.org/

One of the devs will work on getting the code incorporated into
bioperl.  Make sure the code is documented
(http://www.bioperl.org/wiki/Advanced_BioPerl), and attach appropriate
tests (http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests) and
test data.

chris


From cjfields at uiuc.edu  Fri Aug 24 13:20:16 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 12:20:16 -0500
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
	module/package name
In-Reply-To: <9824900.1187973171940.JavaMail.ngmail@webmail17>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
	<46CEAD83.2050904@arcor.de>
	<9824900.1187973171940.JavaMail.ngmail@webmail17>
Message-ID: <A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>


On Aug 24, 2007, at 11:32 AM, marian.thieme at arcor.de wrote:

>> ...
> The problem I see with Bio::Expression is that resequencing chips
> are not expression chips.
> (Expression chips are designed to hybridize RNA strands and hence
> measure RNA expression levels; a resequencing chip, on the other
> hand, is based on DNA, and its design and probe length are quite
> different.) So, from my point of view, it makes sense to distinguish
> between DNA and RNA chips, at least.

Then maybe the more generic Bio::Microarray namespace is the way to
go, with the module name Bio::Microarray::Tools::MitoChip.  Other
tools can be added as needed.

>> Also, your module deals with doing one specific thing (extraction and
>> incorporation of information about redundant fragments) for the Affy
>> MitoChip.  It might be worth genericizing the class a bit so that you
>> can add new parser or analysis methods w/o having to define new
>> classes to deal with the same Mitochip data.
>
> OK, need to think about that.

It all depends on how much you intend to contribute; if you plan on  
adding to it over time we can talk about starting up a developer  
account.

>> Mail list: The mail list subscription page
>> (http://bioperl.org/mailman/listinfo/bioperl-l) allows you to subscribe
>> or change subscription options (at the bottom of the page).
>>
> cleared
>
>> Bugzilla: if you are logged into Bugzilla under your old email, there
>> is an option at the bottom of the page (Edit : Prefs) where you can
>> change your email address and other preferences.
>>
> Unfortunately I didn't receive a mail to confirm the change. I tried
> that twice.
>
>
> Marian

I tested it out and received the email at both addresses (as it
states).  If you respond to either email it should implement the
change in three days' time.  If it doesn't, you can email support at
open-bio.org to see if there is a problem.

chris

From florent.angly at gmail.com  Fri Aug 24 13:58:13 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Fri, 24 Aug 2007 10:58:13 -0700
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <32D5D3FF-D0A5-4EEB-BA5E-B0087CC64B19@uiuc.edu>
References: <46CE61EB.5000300@gmail.com>
	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
	<46CF0228.2000404@gmail.com>
	<32D5D3FF-D0A5-4EEB-BA5E-B0087CC64B19@uiuc.edu>
Message-ID: <46CF1C35.3050100@gmail.com>

Chris Fields wrote:
>
> A similar question has been previously asked on the same subject:
>
> http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2827/focus=2869
>
> Jason's suggestion was to have a Bio::Assembly::Contig method 
> get_aln() which produces a Bio::SimpleAlign object containing 
> appropriately padded seqs compatible for AlignIO output.  However, the 
> method was never implemented.
>
> Personally, the way I would try going about this would be to implement 
> the Contig::get_aln() method, padding with bioperl-compliant alignment 
> gap symbols (currently -.*?=~), so if anyone wanted they could write 
> to any AlignIO-implemented format (MSF, Clustal, etc).  In your 
> Bio::Assembly::IO::simpleout module implement write_assembly() and use 
> the Contig::get_aln() method where needed to grab the SimpleAlign, 
> then simply substitute gap symbols with spaces when writing contig 
> output.
>
> In general, any new code is attached to a bugzilla report as an 
> enhancement request:
>
> http://bugzilla.open-bio.org/
>
> One of the devs will work on getting the code incorporated into 
> bioperl.  Make sure the code is documented 
> (http://www.bioperl.org/wiki/Advanced_BioPerl), and attach appropriate 
> tests (http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests) and 
> test data.
>
> chris
>
>
Thanks Chris for the pointers, I will be looking into these things.
Florent

From hlapp at gmx.net  Fri Aug 24 14:25:57 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 24 Aug 2007 14:25:57 -0400
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
	module/package name
In-Reply-To: <A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
	<46CEAD83.2050904@arcor.de>
	<9824900.1187973171940.JavaMail.ngmail@webmail17>
	<A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
Message-ID: <BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>


On Aug 24, 2007, at 1:20 PM, Chris Fields wrote:

>>> ...
>> The problem I see with Bio::Expression is that resequencing chips
>> are not expression chips.
>> (Expression chips are designed to hybridize RNA strands and hence
>> measure RNA expression levels; a resequencing chip, on the other
>> hand, is based on DNA, and its design and probe length are quite
>> different.) So, from my point of view, it makes sense to distinguish
>> between DNA and RNA chips, at least.
>
> Then maybe the more generic Bio::Microarray namespace is the way to
> go, with the module name Bio::Microarray::Tools::MitoChip.  Other
> tools can be added as needed.
>

Makes sense to me too. Presumably, regardless of DNA or RNA being  
hybridized or length of probes, the data that comes out of them is  
quite similar in a general nature (namely hybridization signals)?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From marian.thieme at arcor.de  Fri Aug 24 12:32:51 2007
From: marian.thieme at arcor.de (marian.thieme at arcor.de)
Date: Fri, 24 Aug 2007 18:32:51 +0200 (CEST)
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
 module/package name
In-Reply-To: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
	<46CEAD83.2050904@arcor.de>
Message-ID: <9824900.1187973171940.JavaMail.ngmail@webmail17>

> The new package: How about Bio::Expression::Tools::MitoChip?  My  
> reasoning: I don't think it's necessary to define a new  
> Bio::Resequencing namespace for just one module; my inclination is  
> towards using Bio::Expression namespace as Bio::Tools have been  
> traditionally reserved for output parsers.  I am unsure what the  
> Bio::Expression status is (very little is documented, no tests are  
> written, nothing on the mail list archives); maybe Allen can answer  
> that?  I don't see anything that precludes you from using that  
> namespace as long as your tools are fairly well-defined (they are)  
> and have tests (they do).

The problem I see with Bio::Expression is that resequencing chips are
not expression chips.
(Expression chips are designed to hybridize RNA strands and hence
measure RNA expression levels; a resequencing chip, on the other hand,
is based on DNA, and its design and probe length are quite different.)
So, from my point of view, it makes sense to distinguish between DNA
and RNA chips, at least.

> 
> Also, your module deals with doing one specific thing (extraction and  
> incorporation of information about redundant fragments) for the Affy  
> MitoChip.  It might be worth genericizing the class a bit so that you  
> can add new parser or analysis methods w/o having to define new  
> classes to deal with the same Mitochip data.

OK, need to think about that.

> 
> Mail list: The mail list subscription page
> (http://bioperl.org/mailman/listinfo/bioperl-l) allows you to subscribe
> or change subscription options (at the bottom of the page).
> 
Understood.

> Bugzilla: if you are logged into Bugzilla under your old email, there  
> is an option at the bottom of the page (Edit : Prefs) where you can  
> change your email address and other preferences.
> 
Unfortunately I didn't receive a mail to confirm the change. I tried that
twice.


Marian

> On Aug 24, 2007, at 5:05 AM, Marian Thieme wrote:
> 
> > Hi,
> >
> > 2 questions about the naming of the module I submitted
> > (see http://bugzilla.open-bio.org/show_bug.cgi?id=2332)
> >
> > 1.) The package:
> > because an expression package already exists, I suggest creating a
> > new package called resequencing
> >
> > 2.) I would suggest that the module is called RedundantFragments or
> > AdditionalFragments
> >
> > so we would have something like:
> >
> > Bio::Resequencing::AdditionalFragments
> >
> > Any other ideas ?
> >
> > Marian
> >
> > By the way, can anybody please change my email address to
> > marian.thieme at arcor.de in bugzilla as well as on the bioperl
> > list? I didn't manage to do that on my own...


From cjfields at uiuc.edu  Fri Aug 24 17:12:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 16:12:25 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <4F5FD173-FC80-4F70-B294-83DA58FDCE64@uiuc.edu>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
	<CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>
	<4F5FD173-FC80-4F70-B294-83DA58FDCE64@uiuc.edu>
Message-ID: <ABED5057-CFB5-4AAA-9D23-B6A069575BF6@uiuc.edu>

Okay, I have started a new branch in cvs (tagged featann_rollback).   
I'll start looking through everything within the next few days to get  
a general idea of what needs to be done.  All I know is the changes  
were extensive and included modifications to tests.

If anyone has comments I have added a wiki page here:

http://www.bioperl.org/wiki/Feature_Annotation_rollback

chris

On Aug 24, 2007, at 11:23 AM, Chris Fields wrote:

> On Aug 23, 2007, at 10:34 PM, Hilmar Lapp wrote:
>
>> On Aug 23, 2007, at 12:53 AM, Chris Fields wrote:
>>
>>> There has been little work done over the last 2 1/2 years to undo or
>>> rectify problems associated with those additions; I feel like those
>>> of us still routinely contributing have been left holding the bag.
>>
>> Not by intention, but unfortunately that's probably a fair
>> assessment. (And I'm one of those guilty of inaction.)
>
> Not completely.  You, Jason, Chris M., and several others expressed
> yourselves quite clearly (move the code to a branch and test).  I
> think that everyone was trying to be diplomatic about it and so never
> attempted to do anything except get it working correctly.
>
>>> [...]
>>> I would like to suggest the radical idea of rolling back
>>> AnnotatableI/
>>> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
>>> are simple scalars) and possibly work in implementing Ewan's
>>> SeqFeature::TypedSeqFeatureI for those who want strong data types
>>> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).
>>
>> I fully support this; to me that sounds exactly like the way to go.
>
> Okay, I'll probably go ahead and get a branch started today.  I'll
> have to look at Ewan's interface in more detail; it's possible a new
> SeqFeature implementation will need to be written up to incorporate  
> it.
>
>>> The various AnnotatableI changes, odd inheritance, and operator
>>> overloading have
>>> really obfuscated the code to the point where no one wants to touch
>>> it in case it breaks something important.  However, I believe it is
>>> the one serious impediment to a new stable release.
>>
>> Yes, I think you're hitting the nail on the head.
>>
>> Chris, if you take the lead on this and carry it through we will
>> all owe you hugely. I'm not sure how many beers that would compare
>> to, but I'll throw in some. (Who else do I owe beer? I'm losing
>> track. Strangely nobody tried to redeem beer from me in Vienna.
>> Maybe in Toronto?)
>>
>> Seriously, rectifying this problem would lift a huge weight.
>>
>> 	-hilmar
>
> It would be nice to get regular releases started again.  I think
> this'll help.
>
> chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From marian at arcor.de  Fri Aug 24 14:48:20 2007
From: marian at arcor.de (marian)
Date: Fri, 24 Aug 2007 20:48:20 +0200
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
 module/package name
In-Reply-To: <BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>	<46CEAD83.2050904@arcor.de>	<9824900.1187973171940.JavaMail.ngmail@webmail17>	<A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
	<BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>
Message-ID: <46CF27F4.8030608@arcor.de>

Hilmar Lapp wrote:
> On Aug 24, 2007, at 1:20 PM, Chris Fields wrote:
>
>   
>>>> ...
>>>>         
>>> The problem I see with Bio::Expression is that resequencing chips
>>> are not expression chips.
>>> (Expression chips are designed to hybridize RNA strands and hence
>>> measure RNA expression levels; a resequencing chip, on the other
>>> hand, is based on DNA, and its design and probe length are quite
>>> different.) So, from my point of view, it makes sense to distinguish
>>> between DNA and RNA chips, at least.
>>>       
>> Then maybe the more generic Bio::Microarray namespace is the way to
>> go, with the module name Bio::Microarray::Tools::MitoChip.  Other
>> tools can be added as needed.
>>
>>     
>
> Makes sense to me too. Presumably, regardless of DNA or RNA being  
> hybridized or length of probes, the data that comes out of them is  
> quite similar in a general nature (namely hybridization signals)?
>
> 	-hilmar
>   

Bio::Microarray::Tools::MitoChip would be OK with me. I merely meant that it
isn't an expression chip, and you also won't/can't analyze expression data with
the tool I am talking about.

Marian


From cjfields at uiuc.edu  Fri Aug 24 18:36:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 17:36:46 -0500
Subject: [Bioperl-l] undef SeqFeature tag values
Message-ID: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>

One thing I am noticing with the rollback to tags as strings is that
tags with an undefined value are not set; I'm assuming that when tags
were Bio::AnnotationI objects they were instantiated regardless, with
an undef value.  When attempting to retrieve an unset tag with
get_tag_values() I get:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: asking for tag value that does not exist signalPeptideLength
STACK: Error::throw
STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/bioperl-live/blib/lib/Bio/Root/Root.pm:357
STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
STACK: t/targetp.t:189
-----------------------------------------------------------

I personally think of this as a feature (why set a tag at all if it  
is undef?).  However, are there any circumstances where we might want  
this behavior?  Do we want to simply return w/o a value if a tag name  
isn't found (i.e. remove the exception)?

chris


From hlapp at gmx.net  Fri Aug 24 19:02:43 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 24 Aug 2007 19:02:43 -0400
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
Message-ID: <7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>

You're supposed to call has_tag() first before you can assume that  
you can call get_tag_values() w/o an exception. That was the original  
API.
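
In case it helps, the calling pattern looks roughly like this. The
My::Feature class below is only a stand-in written for this sketch so
that it runs without BioPerl installed; it is not the real
Bio::SeqFeature::Generic code, it just mimics the behaviour discussed
here (undef values are not stored, and get_tag_values() throws for a
tag that was never set).

```perl
use strict;
use warnings;

# Stand-in for a feature object, written only for this sketch.
package My::Feature;

sub new { my ($class) = @_; return bless { tags => {} }, $class }

sub add_tag_value {
    my ($self, $tag, $value) = @_;
    return unless defined $value;          # undef values are not stored
    push @{ $self->{tags}{$tag} }, $value;
    return;
}

sub has_tag {
    my ($self, $tag) = @_;
    return exists $self->{tags}{$tag} ? 1 : 0;
}

sub get_tag_values {
    my ($self, $tag) = @_;
    die "asking for tag value that does not exist $tag\n"
        unless $self->has_tag($tag);
    return @{ $self->{tags}{$tag} };
}

package main;

my $feat = My::Feature->new;
$feat->add_tag_value( signalPeptideLength => undef );     # skipped
$feat->add_tag_value( location            => 'secreted' );

# Guard every lookup with has_tag() so a missing tag never throws.
for my $tag (qw(location signalPeptideLength)) {
    if ( $feat->has_tag($tag) ) {
        print "$tag: ", join( ', ', $feat->get_tag_values($tag) ), "\n";
    }
    else {
        print "$tag: not set\n";
    }
}
```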

	-hilmar

On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:

> One thing I am noticing with the rollback to tag as strings is that
> tags with an undefined value are not set; I'm assuming when tags were
> Bio::AnnotationI they were instantiated regardless with an undef
> value.  When attempting to call an undef tag with get_tag_values() I
> get:
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: asking for tag value that does not exist signalPeptideLength
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
> bioperl-live/blib/lib/Bio/Root/Root.pm:357
> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
> STACK: t/targetp.t:189
> -----------------------------------------------------------
>
> I personally think of this as a feature (why set a tag at all if it
> is undef?).  However, are there any circumstances where we might want
> this behavior?  Do we want to simply return w/o a value if a tag name
> isn't found (i.e. remove the exception)?
>
> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Aug 25 00:05:58 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 23:05:58 -0500
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
	<7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
Message-ID: <6392DF1D-D91B-4B6E-812B-38FC0EA0D234@uiuc.edu>

Makes sense.  Okay, I'll leave the exception in.  Thanks!

chris

On Aug 24, 2007, at 6:02 PM, Hilmar Lapp wrote:

> You're supposed to call has_tag() first before you can assume that
> you can call get_tag_values() w/o an exception. That was the original
> API.
>
> 	-hilmar
>
> On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:
>
>> One thing I am noticing with the rollback to tag as strings is that
>> tags with an undefined value are not set; I'm assuming when tags were
>> Bio::AnnotationI they were instantiated regardless with an undef
>> value.  When attempting to call an undef tag with get_tag_values() I
>> get:
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: asking for tag value that does not exist signalPeptideLength
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
>> bioperl-live/blib/lib/Bio/Root/Root.pm:357
>> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
>> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
>> STACK: t/targetp.t:189
>> -----------------------------------------------------------
>>
>> I personally think of this as a feature (why set a tag at all if it
>> is undef?).  However, are there any circumstances where we might want
>> this behavior?  Do we want to simply return w/o a value if a tag name
>> isn't found (i.e. remove the exception)?
>>
>> chris
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Sat Aug 25 03:50:29 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 25 Aug 2007 08:50:29 +0100
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
	<7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
Message-ID: <46CFDF45.8030200@sheffield.ac.uk>

This sort of highlights a comment I made previously: how do you
test for a stable API?

It seems to me that unless you have intricate knowledge of the
changes that took place, you will find it difficult to know when an API
change has occurred. Is it possible to run the 1.4 test suite against
existing code to ensure the tests pass? What if the 1.4 tests contained
bugs? This approach would need good code coverage by the tests to ensure
things work the same, i.e. test the code in HEAD against the test suite
from the previous stable release's branch. Would/should this work
conceptually?

Nath

Hilmar Lapp wrote:
> You're supposed to call has_tag() first before you can assume that  
> you can call get_tag_values() w/o an exception. That was the original  
> API.
>
> 	-hilmar
>
> On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:
>
>   
>> One thing I am noticing with the rollback to tag as strings is that
>> tags with an undefined value are not set; I'm assuming when tags were
>> Bio::AnnotationI they were instantiated regardless with an undef
>> value.  When attempting to call an undef tag with get_tag_values() I
>> get:
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: asking for tag value that does not exist signalPeptideLength
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
>> bioperl-live/blib/lib/Bio/Root/Root.pm:357
>> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
>> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
>> STACK: t/targetp.t:189
>> -----------------------------------------------------------
>>
>> I personally think of this as a feature (why set a tag at all if it
>> is undef?).  However, are there any circumstances where we might want
>> this behavior?  Do we want to simply return w/o a value if a tag name
>> isn't found (i.e. remove the exception)?
>>
>> chris


From cjfields at uiuc.edu  Sat Aug 25 10:36:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 25 Aug 2007 09:36:08 -0500
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <46CFDF45.8030200@sheffield.ac.uk>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
	<7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
	<46CFDF45.8030200@sheffield.ac.uk>
Message-ID: <3F3C311E-3CD5-436B-987F-FD7695904647@uiuc.edu>

The rollback branch is off of HEAD, not 1.4, so any bugs fixed since  
then and any modules/tests added will be present.  So far everything  
has worked relatively well; you can check the history of this page to  
track what has happened so far:

http://www.bioperl.org/wiki/Feature_Annotation_rollback

The only problem code remaining for the first round of changes is a  
single method in Bio::SeqFeature::Annotated (if the tests are to be  
trusted) and one test in Bio::SeqFeature::AnnotationAdaptor using  
Hilmar's original test suite.  Most of those were tests breaking the
Feature/Annotation API outlined in the HOWTO (calling get_Annotations
directly from a Bio::SeqI or Bio::SeqFeatureI, for instance), or
cases where has_tag() was not used.  I agree good test coverage
would probably help catch some of those still silently lingering in
code, but I don't think it can find everything; that's the reason I
indicated this will need extensive testing, not only within the
suite but also by users in the wild.
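
For anyone following along, the has_tag() guard mentioned above can be
sketched roughly as below.  This is only an illustration: My::Feature
is a hypothetical stand-in that mimics the has_tag()/get_tag_values()
contract of Bio::SeqFeature::Generic, so the snippet runs without
BioPerl installed.

```perl
use strict;
use warnings;

package My::Feature;    # hypothetical stand-in, not a BioPerl class

sub new {
    my ($class, %tags) = @_;
    return bless { tags => { %tags } }, $class;
}

# True only if the tag was actually set on this feature.
sub has_tag {
    my ($self, $tag) = @_;
    return exists $self->{tags}{$tag} ? 1 : 0;
}

# Mirrors the original API: asking for a missing tag is an error.
sub get_tag_values {
    my ($self, $tag) = @_;
    die "asking for tag value that does not exist $tag\n"
        unless $self->has_tag($tag);
    return @{ $self->{tags}{$tag} };
}

package main;

my $feat = My::Feature->new(note => ['signal peptide']);

for my $tag (qw(note signalPeptideLength)) {
    if ($feat->has_tag($tag)) {
        print "$tag: ", join(', ', $feat->get_tag_values($tag)), "\n";
    }
    else {
        print "$tag: (not set)\n";    # no exception, because we checked first
    }
}
```

With a real feature object the same pattern should apply: check
has_tag() first, and only then call get_tag_values().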

The SeqFeatureI and AnnotatableI API is defined very specifically in  
the Feature/Annotation HOWTO, so if anything the introduced changes  
violated much of that and started a domino effect of users  
unknowingly violating the API (me among them).  Also, just b/c a test  
passes doesn't mean it is the ->correct<- result; it is very easy to  
just throw something from Data::Dumper into an is() test and have it  
pass.  As an example, it appears there was a bit of cheating going on  
with AnnotationAdaptor.t in particular, where expected numbers were  
changed to conform to results w/o explanation.  Which is the correct  
answer?  I trust Hilmar's original test suite over the (rushed) changes.

chris

On Aug 25, 2007, at 2:50 AM, Nathan S. Haigh wrote:

> This sort of highlights a comment I made previously about how do you
> test for a stable API?
>
> It seems to me that unless you have intricate knowledge about the
> changes that took place, you will find it difficult to know when an  
> API
> change has occurred. Is it possible to run the 1.4 test suite against
> existing code to ensure tests pass? What if the 1.4 tests contained
> bugs? This approach would need good code coverage by the tests to  
> ensure
> things work the same i.e. test code in HEAD against the test suite  
> from
> the previous stable release's branch - would/should this work
> conceptually?
>
> Nath
>
> Hilmar Lapp wrote:
>> You're supposed to call has_tag() first before you can assume that
>> you can call get_tag_values() w/o an exception. That was the original
>> API.
>>
>> 	-hilmar
>>
>> On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:
>>
>>
>>> One thing I am noticing with the rollback to tag as strings is that
>>> tags with an undefined value are not set; I'm assuming when tags  
>>> were
>>> Bio::AnnotationI they were instantiated regardless with an undef
>>> value.  When attempting to call an undef tag with get_tag_values() I
>>> get:
>>>
>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>> MSG: asking for tag value that does not exist signalPeptideLength
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
>>> bioperl-live/blib/lib/Bio/Root/Root.pm:357
>>> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
>>> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
>>> STACK: t/targetp.t:189
>>> -----------------------------------------------------------
>>>
>>> I personally think of this as a feature (why set a tag at all if it
>>> is undef?).  However, are there any circumstances where we might  
>>> want
>>> this behavior?  Do we want to simply return w/o a value if a tag  
>>> name
>>> isn't found (i.e. remove the exception)?
>>>
>>> chris
>>>
>>>
>>
>>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sat Aug 25 18:12:49 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 25 Aug 2007 17:12:49 -0500
Subject: [Bioperl-l] Feature/Annotation rollback(update)
Message-ID: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>

I have finished rolling back most of the specific changes made prior  
to the 1.5 release and have relevant tests passing:

http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round

Operator overloading of Bio::Annotation objects will be trickier to  
debug as tons of tests fail when the overloading is removed:

http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round

I'll start looking into fixes.  I don't like overloads from a  
personal standpoint (problems w/ long-term code maintenance), but was  
there a more specific reason for removing them?

chris

From hlapp at gmx.net  Sun Aug 26 00:58:46 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 26 Aug 2007 00:58:46 -0400
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
Message-ID: <3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>

The reason was to provide for backward compatibility with the  
original API in which tag values were scalars, not objects. The idea  
was that if someone relied on that and treats the object as a scalar  
(comparison, printing, etc), the operator overloading would take care  
of that.

So by going back to the original API the overloading should become  
obsolete, at least theoretically.

The overloading can cause some very subtle issues that I pointed out  
in an earlier email. It's one of those really "clever" tricks that  
just make it very hard for newcomers to understand what's going on in  
their code.

	-hilmar

On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:

> I have finished rolling back most of the specific changes made prior
> to the 1.5 release and have relevant tests passing:
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>
> Operator overloading of Bio::Annotation objects will be trickier to
> debug as tons of tests fail when the overloading is removed:
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>
> I'll start looking into fixes.  I don't like overloads from a
> personal standpoint (problems w/ long-term code maintenance), but was
> there a more specific reason for removing them?
>
> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From n.haigh at sheffield.ac.uk  Sun Aug 26 03:35:36 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 26 Aug 2007 08:35:36 +0100
Subject: [Bioperl-l] please some help
In-Reply-To: <20070823233044.BJQ45014@mailstore2.fiu.edu>
References: <20070823233044.BJQ45014@mailstore2.fiu.edu>
Message-ID: <46D12D48.8080301@sheffield.ac.uk>

mcons004 at fiu.edu wrote:
>   Hello,
>      I am new to this software and I am having some trouble starting. The version of Bioperl I am working on is v5.8.6. My OS is Unix (Mac OS X). I am trying to use Bioperl with a file called blastParser to process a file which is the output of a "blastall" operation.
>   
>  The code that gives me error is:
>> perl blastParser.pl junk.out 1 1 1.0
>  and the error message says:
> Can't locate Bio/SearchIO.pm in @INC (@INC contains: /System/Library/Perl/5.8.6/darwin-thread-multi-2level
> 
>  Your online info says this probably means that the module Bio::SearchIO is not installed and that I can either install Bundle::Bioperl or install that specific module by hand. Could you give me some tips on this? I am new to working with Unix and Bioperl, so I am a little confused. Any information will be helpful for me. Thanks

 From what you have said, it appears you need some basic info to 
understand what you are trying to achieve.

The Perl programming language requires the Perl interpreter in order to 
execute a Perl script. The Perl interpreter is usually installed as 
standard with Unix/Linux based Operating Systems. The version you 
mention (5.8.6) will not be the version of Bioperl but the version of 
the Perl interpreter you have installed - you can check this by typing 
"perl -v" at a command prompt.

Given your apparent lack of understanding of the Unix OS, it is likely 
that you don't have Bioperl installed. You should have a look at:
http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink
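
A quick way to confirm whether a module is visible to your Perl
interpreter is a small check script like the sketch below.  The helper
name module_available() is just something made up for this example; it
simply tries to load the module and, on failure, lists the directories
in @INC that Perl searched.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded from the current @INC,
# 0 otherwise.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Bio::SearchIO -> Bio/SearchIO.pm
    my $ok = eval { require $file; 1 };
    return $ok ? 1 : 0;
}

my $module = 'Bio::SearchIO';
if (module_available($module)) {
    print "$module is installed\n";
}
else {
    print "$module was not found; Perl searched these directories:\n";
    print "  $_\n" for @INC;
}
```

If the module is reported as missing, it needs to be installed (or its
location added to @INC) before the script that uses it will run.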

Nath

From cjfields at uiuc.edu  Sun Aug 26 15:22:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 26 Aug 2007 14:22:24 -0500
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
Message-ID: <B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>

I managed to find your comments (as well as ones from Ewan, Jason,  
and a few others) on the mail list archives, so I'll link to them.   
The problem will be fixing the several places where overloading is  
assumed but no longer exists (i.e. in write_* methods), but we can  
probably pinpoint those by throwing or warning when overloading is  
assumed.

My thought is to either modify as_text() or add a new display_text()  
method to all AnnotationI that explicitly does what the overloading  
implied (print the annotation in a specified or assumed way).  We  
could then delegate to that in the stringification overload (with  
appropriate deprecation warnings) until 1.6, where we remove it  
completely.  Something like:

my $link1 = Bio::Annotation::DBLink->new(-database => 'TSC',
                                         -primary_id => 'TSC0000030',
                                         -tagname => "tag2");

# either
print $link1->display_text(),"\n";
# or ...
print $link1->as_text("display"),"\n";
# prints "TSC:TSC0000030"

# default human readable
print $link1->as_text(),"\n";
# prints "Direct database link to TSC0000030 in database TSC"

print "$link1\n";
# gets a deprecation warning for now, removed completely for 1.6
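
To make the delegation idea concrete, here is a rough sketch of a
stringification overload that forwards to display_text() and warns.
It is not BioPerl code: My::DBLink is a hypothetical stand-in class,
and the warning text is made up for the example.

```perl
use strict;
use warnings;

package My::DBLink;    # hypothetical stand-in for an AnnotationI class

use overload
    '""'     => \&_deprecated_stringify,
    fallback => 1;

sub new {
    my ($class, %args) = @_;
    my $self = {
        database   => $args{-database},
        primary_id => $args{-primary_id},
    };
    return bless $self, $class;
}

# Explicit replacement for what the overload used to do implicitly.
sub display_text {
    my $self = shift;
    return join ':', $self->{database}, $self->{primary_id};
}

# Kept only for the transition period: warn, then delegate.
sub _deprecated_stringify {
    my $self = shift;
    warn 'stringification of ' . ref($self)
        . " is deprecated; call display_text() instead\n";
    return $self->display_text;
}

package main;

my $link = My::DBLink->new(-database   => 'TSC',
                           -primary_id => 'TSC0000030');

print $link->display_text, "\n";    # prints "TSC:TSC0000030", no warning
print "$link\n";                    # same output, plus a warning on STDERR
```

Dropping the overload later would then only mean deleting the
"use overload" statement and _deprecated_stringify().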

chris

On Aug 25, 2007, at 11:58 PM, Hilmar Lapp wrote:

> The reason was to provide for backward compatibility with the  
> original API in which tag values were scalars, not objects. The  
> idea was that if someone relied on that and treats the object as a  
> scalar (comparison, printing, etc), the operator overloading would  
> take care of that.
>
> So by going back to the original API the overloading should become  
> obsolete, at least theoretically.
>
> The overloading can cause some very subtle issues that I pointed  
> out in an earlier email. It's one of those really "clever" tricks  
> that just make it very hard for newcomers to understand what's  
> going on in their code.
>
> 	-hilmar
>
> On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:
>
>> I have finished rolling back most of the specific changes made prior
>> to the 1.5 release and have relevant tests passing:
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>>
>> Operator overloading of Bio::Annotation objects will be trickier to
>> debug as tons of tests fail when the overloading is removed:
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>>
>> I'll start looking into fixes.  I don't like overloads from a
>> personal standpoint (problems w/ long-term code maintenance), but was
>> there a more specific reason for removing them?
>>
>> chris
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Sun Aug 26 16:57:37 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 26 Aug 2007 16:57:37 -0400
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
	<B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
Message-ID: <503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>

The thing that I actually never quite understood (and predates the  
API changes) is why $ann->as_text() needs to include explanatory text  
such as 'Direct database link to blah in database foo.' I would have  
said that "TSC:TSC0000030" is human readable enough, unless you  
present it without any context so that one would have no clue that it  
is a database cross-reference.

The as_text() method shouldn't be meant for the sole purpose of
debugging annotation collections. However, I'm not sure what else
you could use it for, given that there are no guidelines for what to
expect.

In fact, I do use as_text() a lot for a real purpose, namely as a  
surrogate unique key. For example, making a collection of dblinks  
unique is quite simple using the as_text() method:

	my %dbhash = map { ($_->as_text(), $_) }
	                 $anncoll->remove_Annotations('dblink');
	$anncoll->add_Annotation('dblink',$_) foreach (values %dbhash);

This is a common task when harvesting annotation from various places  
and then integrating it. However, there is nothing in the API  
documentation that suggests that this might be a reliable or even  
expected property such that you could omit the 'dblink' tag above.
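
As a sketch of an alternative that does not lean on the wording of
as_text(), the key could be built from named fields instead.  The
records below are plain hashes standing in for DBLink objects, the
second database/ID pair is invented for the example, and dblink_key()
and unique_dblinks() are hypothetical helper names.

```perl
use strict;
use warnings;

# Plain hashes standing in for Bio::Annotation::DBLink objects.
my @dblinks = (
    { database => 'TSC', primary_id => 'TSC0000030' },
    { database => 'DB2', primary_id => 'ID0001' },        # made-up entry
    { database => 'TSC', primary_id => 'TSC0000030' },    # duplicate
);

# Surrogate key built from explicit fields rather than display text.
sub dblink_key {
    my ($link) = @_;
    return join ':', $link->{database}, $link->{primary_id};
}

# Keep the first occurrence of each key, preserving input order.
sub unique_dblinks {
    my %seen;
    return grep { !$seen{ dblink_key($_) }++ } @_;
}

my @unique = unique_dblinks(@dblinks);
print dblink_key($_), "\n" for @unique;
```

Note that this keeps the first duplicate, whereas the hash-based
one-liner keeps the last one; which is preferable depends on the data.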

I agree that having a conceptual equivalent to $feature->display_name  
and $seq->display_id would be good, but these methods have no claim  
to returning something that's unique in any way.

I guess I've now raised more questions than I answered (in fact I  
didn't answer any). Sorry 'bout that.

	-hilmar

On Aug 26, 2007, at 3:22 PM, Chris Fields wrote:

> I managed to find your comments (as well as ones from Ewan, Jason,  
> and a few others) on the mail list archives, so I'll link to them.   
> The problem will be fixing the several places where overloading is  
> assumed but no longer exists (i.e. in write_* methods), but we can  
> probably pinpoint those by throwing or warning when overloading is  
> assumed.
>
> My thought is to either modify as_text() or add a new display_text()
> method to all AnnotationI that explicitly does what the
> overloading implied (print the annotation in a specified or assumed  
> way).  We could then delegate to that in the stringification  
> overload (with appropriate deprecation warnings) until 1.6, where  
> we remove it completely.  Something like:
>
> my $link1 = Bio::Annotation::DBLink->new(-database => 'TSC',
>                                         -primary_id => 'TSC0000030',
>                                         -tagname => "tag2");
>
> # either
> print $link1->display_text(),"\n";
> # or ...
> print $link1->as_text("display"),"\n";
> # prints "TSC:TSC0000030"
>
> # default human readable
> print $link1->as_text(),"\n";
> # prints "Direct database link to TSC0000030 in database TSC"
>
> print "$link1\n";
> # gets a deprecation warning for now, removed completely for 1.6
>
> chris
>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sun Aug 26 18:47:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 26 Aug 2007 17:47:41 -0500
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
	<B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
	<503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>
Message-ID: <E0A389DE-3399-4439-9AC2-76319CCD5B84@uiuc.edu>

Either way I implement, it would be used simply as a generic  
convenience method to replicate output via stringification  
overloading, using a common method name for all AnnotationI; there  
seem to be several instances where this is used for generating output  
(i.e. SeqIO::genbank).  So, for instance, when formatting output you  
could just call as_text('display') or display_text() and you would  
get the most common formatting for that particular annotation type.

chris

On Aug 26, 2007, at 3:57 PM, Hilmar Lapp wrote:

> The thing that I actually never quite understood (and predates the  
> API changes) is why $ann->as_text() needs to include explanatory  
> text such as 'Direct database link to blah in database foo.' I  
> would have said that "TSC:TSC0000030" is human readable enough,  
> unless you present it without any context so that one would have no  
> clue that it is a database cross-reference.
>
> The as_text() method shouldn't be meant for the sole purpose of
> debugging annotation collections. However, I'm not sure what
> else you could use it for, given that there are no guidelines for
> what to expect.
>
> In fact, I do use as_text() a lot for a real purpose, namely as a  
> surrogate unique key. For example, making a collection of dblinks  
> unique is quite simple using the as_text() method:
>
> 	my %dbhash = map { ($_->as_text(), $_) }
> 	                 $anncoll->remove_Annotations('dblink');
> 	$anncoll->add_Annotation('dblink',$_) foreach (values %dbhash);
>
> This is a common task when harvesting annotation from various  
> places and then integrating it. However, there is nothing in the  
> API documentation that suggests that this might be a reliable or  
> even expected property such that you could omit the 'dblink' tag  
> above.
>
> I agree that having a conceptual equivalent to
> $feature->display_name and $seq->display_id would be good, but these
> methods have no claim to returning something that's unique in any way.
>
> I guess I've now raised more questions than I answered (in fact I  
> didn't answer any). Sorry 'bout that.
>
> 	-hilmar
>
> On Aug 26, 2007, at 3:22 PM, Chris Fields wrote:
>
>> I managed to find your comments (as well as ones from Ewan, Jason,  
>> and a few others) on the mail list archives, so I'll link to  
>> them.  The problem will be fixing the several places where  
>> overloading is assumed but no longer exists (i.e. in write_*  
>> methods), but we can probably pinpoint those by throwing or  
>> warning when overloading is assumed.
>>
>> My thought is to either modify as_text() or add a new display_text()
>> method to all AnnotationI that explicitly does what the
>> overloading implied (print the annotation in a specified or  
>> assumed way).  We could then delegate to that in the  
>> stringification overload (with appropriate deprecation warnings)  
>> until 1.6, where we remove it completely.  Something like:
>>
>> my $link1 = Bio::Annotation::DBLink->new(-database => 'TSC',
>>                                         -primary_id => 'TSC0000030',
>>                                         -tagname => "tag2");
>>
>> # either
>> print $link1->display_text(),"\n";
>> # or ...
>> print $link1->as_text("display"),"\n";
>> # prints "TSC:TSC0000030"
>>
>> # default human readable
>> print $link1->as_text(),"\n";
>> # prints "Direct database link to TSC0000030 in database TSC"
>>
>> print "$link1\n";
>> # gets a deprecation warning for now, removed completely for 1.6
>>
>> chris
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Sun Aug 26 19:01:03 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 26 Aug 2007 19:01:03 -0400
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <E0A389DE-3399-4439-9AC2-76319CCD5B84@uiuc.edu>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
	<B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
	<503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>
	<E0A389DE-3399-4439-9AC2-76319CCD5B84@uiuc.edu>
Message-ID: <35BBCF3B-BA1B-4C8D-8753-2A27AB3B423C@gmx.net>


On Aug 26, 2007, at 6:47 PM, Chris Fields wrote:

> just call as_text('display') or display_text()

The latter is more obvious, and can be better tested for presence and  
implementation, though in the world of perl that's of course not  
strictly true.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From zeroliu at 163.com  Mon Aug 27 07:49:53 2007
From: zeroliu at 163.com (zeroliu)
Date: Mon, 27 Aug 2007 19:49:53 +0800 (CST)
Subject: [Bioperl-l] Problems of parse emboss water result by Bio::AlignIO
Message-ID: <534546299.525411188215393753.JavaMail.coremail@bj163app118.163.com>

 Hello,
I'm trying to parse water (EMBOSS 5.0.0) results with Bio::AlignIO
(Bioperl-1.4) and have encountered some problems.
1. What does Bio::AlignIO->next_aln() return?
Does it return a Bio::Align::AlignI or a Bio::SimpleAlign object?
Or does it depend on the alignment file format?
2. How can I get the "score" property of a water alignment result?
There is a score method in Bio::SimpleAlign but not in Bio::AlignIO.
In 2004, Jason mentioned:
Scores are set by the Alignment parser - we separate the 'running' from
the 'parsing'.
Bio::AlignIO::emboss had to be updated.
(http://article.gmane.org/gmane.comp.lang.perl.bio.general/7156/match=alignio+water)
How could I know it?
Thank you very much!  

From cjfields at uiuc.edu  Mon Aug 27 13:13:13 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 12:13:13 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Annotated status
Message-ID: <6DC5ECA8-3DF1-4B84-914C-4F2B3B44E29A@uiuc.edu>

What is the current status on maintenance of  
Bio::SeqFeature::Annotated?  From what I gather (based on the code  
and past mail list posts) the intent of the module seems to be to  
store any SeqFeature-specific data (tags, score, source, primary_tag,  
etc) in a Bio::AnnotationCollectionI as strongly typed data.  However  
there are several inconsistencies, such as objects being returned  
when a string is expected (score(), source()).

Also, several methods appear half-implemented, aren't consistent with the
SeqFeatureI API or similar methods in other SeqFeatureI's, and there
are no docs explaining what is expected.

If no one speaks up on it, I'll do my best with maintaining it
myself, but don't expect the API to stay as it is.

chris

From cjfields at uiuc.edu  Mon Aug 27 18:31:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 17:31:01 -0500
Subject: [Bioperl-l] Bio::Ontology::Term (rollback question)
Message-ID: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>

This is related to the ongoing Feature/Annotation rollback.  I have  
found that a few Ontology-related modules are (either directly or  
indirectly) passing strings instead of Bio::Annotation::DBLinks to  
Bio::Ontology::Term::new(), add_dblink(), or add_dblink_context()  
(the last is where the error occurs).

If needed we could allow strings to be passed but this isn't  
consistent with the API.  Any thoughts on what to do here?

chris

From hlapp at gmx.net  Mon Aug 27 19:07:12 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 27 Aug 2007 19:07:12 -0400
Subject: [Bioperl-l] Bio::Ontology::Term (rollback question)
In-Reply-To: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>
References: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>
Message-ID: <01A56BFB-DE36-4C95-9BD3-DB35A706BD87@gmx.net>

The B::O::TermI interface actually says that get_dblinks() would  
return scalars. That's why the add_dblink methods accept strings. I  
also agree that this is inconsistent with the rest of BioPerl.

Oddly enough, Term::add_dblink_context() does ask for DBLink objects,  
though it doesn't seem to be enforced, even though  
Term::get_dblink_context() is advertised as returning scalars.

So it does seem this is messed up design-wise. It seems to me that to  
really fix this would inevitably break the API, and I don't see how  
you would make this backwards compatible w/o creating a lot of messy  
code, the sole purpose of which would be backwards compatibility.

One could only fix Term::add_dblink_context() as it's not in the  
interface but that wouldn't contribute anything to improving  
consistency.

So the alternative to breaking the API in a non-backwards compatible  
fashion would be to add to it, map the existing dblink methods onto  
the added ones, and start deprecating them. For example, you could  
add methods get_dbxrefs() (also on the interface), add_dbxref(),  
etc., and build in a context argument so we don't need another set
of methods for that. They would accept and return DBLink objects, and  
the get_dblink() methods could be changed to map those to scalars  
while also getting slated for deprecation.
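
A very rough sketch of what I have in mind follows.  None of this is
existing BioPerl code: My::Term and My::DBLink are hypothetical
stand-ins, the method signatures are only a guess at how it could
look, and the "db:id" string format used by the old-style wrappers is
an assumption made for the example.

```perl
use strict;
use warnings;

package My::DBLink;    # hypothetical stand-in for Bio::Annotation::DBLink

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}
sub database   { return $_[0]{database} }
sub primary_id { return $_[0]{primary_id} }

package My::Term;      # hypothetical stand-in for Bio::Ontology::Term

sub new {
    my ($class) = @_;
    return bless { dbxrefs => {} }, $class;
}

# New-style methods: accept and return objects, with an optional context.
sub add_dbxref {
    my ($self, $dbxref, $context) = @_;
    $context = '_default' unless defined $context;
    push @{ $self->{dbxrefs}{$context} }, $dbxref;
    return;
}

sub get_dbxrefs {
    my ($self, $context) = @_;
    return @{ $self->{dbxrefs}{$context} || [] } if defined $context;
    return map { @$_ } values %{ $self->{dbxrefs} };
}

# Old-style methods survive only as deprecated wrappers mapping to scalars.
sub add_dblink {
    my ($self, @strings) = @_;
    warn "add_dblink() is deprecated; use add_dbxref()\n";
    for my $string (@strings) {
        my ($db, $id) = split /:/, $string, 2;    # assumes "db:id" strings
        $self->add_dbxref(
            My::DBLink->new(database => $db, primary_id => $id));
    }
    return;
}

sub get_dblinks {
    my ($self) = @_;
    warn "get_dblinks() is deprecated; use get_dbxrefs()\n";
    return map { $_->database . ':' . $_->primary_id } $self->get_dbxrefs;
}

package main;

my $term = My::Term->new;
$term->add_dblink('TSC:TSC0000030');            # old API, warns
print join(', ', $term->get_dblinks), "\n";     # old API, warns
print scalar($term->get_dbxrefs), " dbxref object(s) stored\n";
```

Everything is stored as objects internally, so the scalar-returning
methods become thin wrappers that can be dropped once deprecated.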

Does this make sense?

	-hilmar

On Aug 27, 2007, at 6:31 PM, Chris Fields wrote:

> This is related to the ongoing Feature/Annotation rollback.  I have
> found that a few Ontology-related modules are (either directly or
> indirectly) passing strings instead of Bio::Annotation::DBLinks to
> Bio::Ontology::Term::new(), add_dblink(), or add_dblink_context()
> (the last is where the error occurs).
>
> If needed we could allow strings to be passed but this isn't
> consistent with the API.  Any thoughts on what to do here?
>
> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Aug 27 21:12:35 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 20:12:35 -0500
Subject: [Bioperl-l] Bio::Ontology::Term (rollback question)
In-Reply-To: <01A56BFB-DE36-4C95-9BD3-DB35A706BD87@gmx.net>
References: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>
	<01A56BFB-DE36-4C95-9BD3-DB35A706BD87@gmx.net>
Message-ID: <EF121F1E-BAA0-49BD-830F-1F3BC6FAC807@uiuc.edu>


On Aug 27, 2007, at 6:07 PM, Hilmar Lapp wrote:

> The B::O::TermI interface actually says that get_dblinks() would  
> return scalars. That's why the add_dblink methods accept strings. I  
> also agree that this is inconsistent with the rest of BioPerl.
>
> Oddly enough, Term::add_dblink_context() does ask for DBLink  
> objects, though it doesn't seem to be enforced, even though  
> Term::get_dblink_context() is advertised as returning scalars.

This happened b/c of stringification and 'eq' overloading.  Just  
removing the overloads didn't reveal this problem; I had to add  
exceptions to them to fish this out.
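
In case it helps anyone doing similar detective work, the trick looks
roughly like the sketch below.  My::Annotation and _trap() are
hypothetical names made up for this example (not what is in the
branch); the point is just that each overload handler dies and reports
where the operator was used.

```perl
use strict;
use warnings;

package My::Annotation;    # hypothetical stand-in, not a BioPerl class

# Build a handler that reports the location of the overloaded operation.
sub _trap {
    my ($op) = @_;
    return sub {
        my (undef, $file, $line) = caller;
        die "overload '${op}' used at $file line $line\n";
    };
}

use overload
    '""' => _trap('""'),
    'eq' => _trap('eq');

package main;

my $ann = bless {}, 'My::Annotation';

my @uses = (
    sub { my $text = "$ann"; return $text },    # hidden stringification
    sub { return $ann eq 'TSC:TSC0000030' },    # hidden string comparison
);

for my $use (@uses) {
    eval { $use->(); 1 } or print "caught: $@";
}
```

Running the test suite with handlers like these in place turns every
silent reliance on the overloads into a failure with a file and line.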

> So it does seem this is messed up design-wise. It seems to me that  
> to really fix this would inevitably break the API, and I don't see  
> how you would make this backwards compatible w/o creating a lot of  
> messy code, the sole purpose of which would be backwards  
> compatibility.
>
> One could only fix Term::add_dblink_context() as it's not in the  
> interface but that wouldn't contribute anything to improving  
> consistency.

Agreed; in fact it may make it more confusing.

> So the alternative to breaking the API in a non-backwards  
> compatible fashion would be to add to it, map the existing dblink  
> methods onto the added ones, and start deprecating them. For  
> example, you could add methods get_dbxrefs() (also on the  
> interface), add_dbxref(), etc,   and build in a context argument so  
> we don't need another set of methods for that. They would accept  
> and return DBLink objects, and the get_dblink() methods could be  
> changed to map those to scalars while also getting slated for  
> deprecation.
>
> Does this make sense?
>
> 	-hilmar

I think so; I'll have to look over the code to see how we would  
implement this, though I'm guessing everything would be stored as  
DBLink objects by default.  Any changes will probably need to wait  
until after I fish out any remaining spots in the code where  
overloading is being used, but at least we have a direction on where  
to go.

chris

From cjfields at uiuc.edu  Tue Aug 28 00:18:19 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 23:18:19 -0500
Subject: [Bioperl-l] Feature/Annotation rollback (update #2)
Message-ID: <A91DD20B-841B-480A-A953-E811AD634AF0@uiuc.edu>

Okay, the planned rollback is pretty much complete with a few
exceptions.  I'll probably merge back to bioperl-live within the next  
few days once the following issues are addressed:

1)  Bio::Ontology::Term - several classes are using  
Bio::Ontology::Term in ways inconsistent with one another; some are  
passing Bio::Annotation::DBLink instances and others are passing
simple strings.  This was somewhat transparent with various operator  
overloads but now they have really come to the surface.  I'll  
probably work on Hilmar's suggestion on adding extra class methods to  
give it a more consistent interface and deprecate the older ones.  As  
one might guess this affects much of Bio::Ontology but also  
Bio::SeqFeature::Annotated; strangely enough FeatureIO tests pass
(which may simply mean there isn't enough test coverage for FeatureIO).

2)  Bio::SeqFeature::Annotated - no word back on maintenance for this  
module.  It needs to implement Bio::SeqFeature::TypedSeqFeatureI  
(pretty easy) and needs documentation (not so easy).  It's apparently  
essential for FeatureIO.  I'll basically get it up-and-running and  
clean up the API.

There are a few odds and ends that need to be addressed with  
roundtripping, but these are already problems on the MAIN trunk so  
they will be addressed once code is merged back in.

chris

From Frigerio at pierroton.inra.fr  Tue Aug 28 03:12:22 2007
From: Frigerio at pierroton.inra.fr (Jean-Marc FRIGERIO)
Date: Tue, 28 Aug 2007 09:12:22 +0200
Subject: [Bioperl-l] Bio::SeqIO::phd_comment objet
Message-ID: <200708280912.22798.Frigerio@pierroton.inra.fr>

Hi,

The Bio::SeqIO::phd module says, speaking about the COMMENT section of a phd 
file:
 # this should be an actual object to assist in serialization
  # but I don't have time for this now."

The doc says ( http://www.bioperl.org/wiki/Core_1.5.1_1.5.2_delta)

   This really needs a "phred_comments" object of some sort so that it will be 
serializable. Then when java clients get this object they will be able to 
deserialize it. 

I volunteer to do this,  but need your opinion.

Do we really need an object (Bio::phd_comment? Bio::SeqIO::phd_comment?
Bio::phd_header? something else?)

Or would adding a few Bio::Seq::SeqWithQuality subs to the Bio::SeqIO::phd module
suffice? What are the constraints of serialization/deserialization for
the Java clients?
I was thinking of just adding getters/setters for all the comments:
chromat_file(), abi_thumbprint(), etc.

touch() for the timestamp
attribute() for new unknown comments
write_comment().

others ?
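
The accessor-based option listed above could look roughly like the following. This is only a minimal sketch: the package name PhdCommentSketch and its internal layout are hypothetical and not an existing BioPerl class. The method names are the ones proposed in this message, and the output assumes the usual "TAG: value" lines between BEGIN_COMMENT and END_COMMENT in a phd file.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a phd COMMENT holder; not part of BioPerl.
package PhdCommentSketch {
    sub new {
        my ($class, %args) = @_;
        # Remember insertion order so write_comment() output is stable.
        my $self = bless { tags => {}, order => [] }, $class;
        $self->attribute($_, $args{$_}) for sort keys %args;
        return $self;
    }

    # Generic get/set for any tag, including ones not known in advance.
    sub attribute {
        my ($self, $tag, $value) = @_;
        $tag = uc $tag;
        if (defined $value) {
            push @{ $self->{order} }, $tag unless exists $self->{tags}{$tag};
            $self->{tags}{$tag} = $value;
        }
        return $self->{tags}{$tag};
    }

    # Named accessors simply delegate to attribute().
    sub chromat_file   { my $self = shift; return $self->attribute('CHROMAT_FILE',   @_) }
    sub abi_thumbprint { my $self = shift; return $self->attribute('ABI_THUMBPRINT', @_) }

    # Refresh the TIME tag with the current local time.
    sub touch {
        my ($self) = @_;
        return $self->attribute('TIME', scalar localtime);
    }

    # Serialize the tags as a COMMENT block.
    sub write_comment {
        my ($self) = @_;
        my @lines = map { "$_: $self->{tags}{$_}" } @{ $self->{order} };
        return join("\n", 'BEGIN_COMMENT', @lines, 'END_COMMENT') . "\n";
    }
}

my $comment = PhdCommentSketch->new(chromat_file => 'example.ab1');
$comment->abi_thumbprint(0);
$comment->attribute('my_new_tag', 'some value');   # an "unknown" comment
$comment->touch;
print $comment->write_comment;
```

Whether this belongs in a separate object or directly in Bio::SeqIO::phd is exactly the open question; the sketch only shows that a single generic attribute() method can back all the named accessors.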

		-- jmf

-- 
Jean-Marc Frigerio,
UMR BIOGECO   69, route d'Arcachon, 33612 CESTAS France
Tel : +33(0) 557 122 829   Fax : +33(0) 557 122 881
Frigerio at pierroton.inra.fr   http://www.pierroton.inra.fr/biogeco/index.html

From jay at jays.net  Tue Aug 28 07:14:37 2007
From: jay at jays.net (Jay Hannah)
Date: Tue, 28 Aug 2007 06:14:37 -0500
Subject: [Bioperl-l] Problems of parse emboss water result by
	Bio::AlignIO
In-Reply-To: <534546299.525411188215393753.JavaMail.coremail@bj163app118.163.com>
References: <534546299.525411188215393753.JavaMail.coremail@bj163app118.163.com>
Message-ID: <4CD8B5C2-3C87-495C-894E-17C3C67091DA@jays.net>

On Aug 27, 2007, at 6:49 AM, zeroliu wrote:
> I'm trying to parse water (EMBOSS 5.0.0) result by Bio::AlignIO
> (Bioperl-1.4) and encountered some problems.
> 1. What does the Bio::AlignIO->next_aln() return?
> Does it return a Bio::Align::AlignI or Bio::SimpleAlign object?
> Or it depends on the alignment file format?

http://doc.bioperl.org/bioperl-live/Bio/AlignIO.html
  Title   : next_aln
  Usage   : $aln = $stream->next_aln
  Function: reads the next $aln object from the stream
  Returns : a Bio::Align::AlignI compliant object

> 2. How can I get the "score" property in a water alignment result?
> There is a score method in Bio::SimpleAlign but not in Bio::AlignIO.
> In 2004, Jason mentioned:
> Scores are set by the Alignment parser - we separate the 'running'  
> from
> the 'parsing'.
> Bio::AlignIO::emboss had to be updated.
> (http://article.gmane.org/gmane.comp.lang.perl.bio.general/7156/ 
> match=alignio+water)
> How could I know it?

Line 480 of t/AlignIO.t seems to walk you through? Here's the block,  
with the test overhead removed.

# EMBOSS water
use Bio::AlignIO;

my $str = Bio::AlignIO->new('-format' => 'emboss',
                            '-file'   => 'cysprot.water');
my $aln = $str->next_aln();
# $aln is now a Bio::Align::AlignI object
print $aln->score;    # '501.50'

HTH,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From cjfields at uiuc.edu  Tue Aug 28 17:05:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 28 Aug 2007 16:05:10 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
Message-ID: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>

I'm now wrapping up the Feature/Annotation rollback.  I will probably  
start merging back to the main branch in the next day or two, as
soon as interested parties (*cough*devs*cough*) look over the last  
batch of changes.

http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round

I have also added a small benchmark test which indicates a decrease  
in parsing time in SeqIO::genbank with all tests passing.  I expect  
this will translate over to any Bio::SeqFeature::Generic-using class  
(open mouth, prepare to insert foot....).

It is also possible there are still some instances where overloading  
is expected lurking about in the ~1000 or so modules, so I'll leave  
the exceptions I added to all Bio::AnnotationI; we can remove them  
down the line, maybe prior to rel1.6, after more tests are added or  
if they get particularly annoying.  My guess is I caught 99.99% of  
them (prepare to insert other foot....).

The key change in this last round is the addition of several class  
*dbxref* methods to Bio::Ontology::Term and  
Bio::Annotation::OntologyTerm, all of which are capable of working  
with either DBLink instances or simple scalars.  This was primarily  
done in order to clear up inconsistencies in the older *dblink*  
methods, which were ambiguous (some indicate simple scalar
arguments, others DBLink objects); operator overloading was used  
extensively in these cases, which led to several issues.  I have  
added deprecation warnings to the older methods which now map to  
using the newer methods.  All tests pass with the exception of a few  
already failing on the MAIN branch; the single test which needs to be  
fixed is a round-tripping error in swiss.t (now a TODO), which can be  
fixed after merging back.

Please respond to this if there are any questions or if I need to  
clarify the changes I made a bit more.

chris

From hlapp at duke.edu  Tue Aug 28 18:13:32 2007
From: hlapp at duke.edu (Hilmar Lapp)
Date: Tue, 28 Aug 2007 18:13:32 -0400
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
References: <20070828070219.DE03668527@evol.biology.mcmaster.ca>
Message-ID: <1F006707-291C-4895-A178-33FDFBDE6AE6@duke.edu>

Is anyone thinking about adding support for this as an aligner  
option? I'm not sure whether aside from a Bio::Tools::Run module we'd  
also need a format parser - it sounds like it's emitting clustalw  
format?

	-hilmar

Begin forwarded message:

> From: evoldir at evol.biology.mcmaster.ca
> Date: August 28, 2007 3:02:19 AM EDT
> To: hlapp at duke.edu
> Subject: Other:  Announcing Ngila 1.2.1 Alignment Program
> Reply-To: racartwr at ncsu.edu
>
>
> Ngila is a global, pairwise alignment program that uses logarithmic  
> and
> affine gap costs, i.e. C(g) = a+b*g+c*ln(g).  These gap costs are more
> biologically realistic than the more popular (and efficient) affine  
> gap
> cost model.
>
> I have recently completed updating the program to version 1.2.1.  The
> new version includes two new, evolutionary alignment models based  
> on my
> current research.  These models allow you to find the maximum  
> alignment
> of two sequences based on biological, evolutionary parameters---no  
> more
> guessing at biological costs.  Additional changes are noted on the  
> website.
>
> Website & Manual:
>
> http://scit.us/projects/ngila/
>
> Windows Binary:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release-win32.zip
>
> Unix/Mac Source Code:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release.tar.gz
>
> I'll be happy to answer any questions users have about the new  
> models or
> the program.
>
> -- 
> *********************************************************
> Reed A. Cartwright, PhD     http://scit.us/
> Postdoctoral Researcher     http://www.dererumnatura.us/
> Department of Genetics      http://www.pandasthumb.org/
>
> Bioinformatics Research Center
> North Carolina State University
> Campus Box 7566
> Raleigh, NC 27695-7566
>
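
For readers unfamiliar with the cost model quoted above, the formula C(g) = a+b*g+c*ln(g) can be evaluated directly. This is only an illustrative sketch: the function name gap_cost and the parameter values are made up for the example and are not Ngila's defaults.

```perl
use strict;
use warnings;

# Cost of a gap of length $g (g >= 1) under C(g) = a + b*g + c*ln(g).
# Setting c = 0 gives the plain affine model.
sub gap_cost {
    my ($g, %p) = @_;
    die "gap length must be >= 1\n" if $g < 1;
    return $p{a} + $p{b} * $g + $p{c} * log($g);   # Perl's log() is the natural log
}

# Hypothetical parameter values, chosen only for illustration.
my %log_affine = (a => 2, b => 0.25, c => 0.5);
my %affine     = (a => 2, b => 0.25, c => 0);

for my $g (1, 10, 100) {
    printf "g=%3d  log-affine=%.3f  affine=%.3f\n",
        $g, gap_cost($g, %log_affine), gap_cost($g, %affine);
}
```

With c = 0 the two models coincide; a non-zero c adds a term that grows only logarithmically with gap length.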

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:- hlapp at duke dot edu :
===========================================================


From hlapp at duke.edu  Tue Aug 28 18:13:32 2007
From: hlapp at duke.edu (Hilmar Lapp)
Date: Tue, 28 Aug 2007 18:13:32 -0400
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
Message-ID: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>

Is anyone thinking about adding support for this as an aligner  
option? I'm not sure whether aside from a Bio::Tools::Run module we'd  
also need a format parser - it sounds like it's emitting clustalw  
format?

	-hilmar

Begin forwarded message:

> From: evoldir at evol.biology.mcmaster.ca
> Date: August 28, 2007 3:02:19 AM EDT
> Subject: Other:  Announcing Ngila 1.2.1 Alignment Program
> Reply-To: racartwr at ncsu.edu
>
>
> Ngila is a global, pairwise alignment program that uses logarithmic  
> and
> affine gap costs, i.e. C(g) = a+b*g+c*ln(g).  These gap costs are more
> biologically realistic than the more popular (and efficient) affine  
> gap
> cost model.
>
> I have recently completed updating the program to version 1.2.1.  The
> new version includes two new, evolutionary alignment models based  
> on my
> current research.  These models allow you to find the maximum  
> alignment
> of two sequences based on biological, evolutionary parameters---no  
> more
> guessing at biological costs.  Additional changes are noted on the  
> website.
>
> Website & Manual:
>
> http://scit.us/projects/ngila/
>
> Windows Binary:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release-win32.zip
>
> Unix/Mac Source Code:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release.tar.gz
>
> I'll be happy to answer any questions users have about the new  
> models or
> the program.
>
> -- 
> *********************************************************
> Reed A. Cartwright, PhD     http://scit.us/
> Postdoctoral Researcher     http://www.dererumnatura.us/
> Department of Genetics      http://www.pandasthumb.org/
>
> Bioinformatics Research Center
> North Carolina State University
> Campus Box 7566
> Raleigh, NC 27695-7566
>


From hlapp at gmx.net  Tue Aug 28 19:09:46 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 28 Aug 2007 19:09:46 -0400
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
In-Reply-To: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
References: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
Message-ID: <EF683AC3-F30C-49BC-9F16-7BA10C70F751@gmx.net>

Sorry for the double post, BTW. I had erroneously assumed that the  
first email would be held for post by non-member. -hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Aug 29 00:01:13 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 28 Aug 2007 23:01:13 -0500
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
In-Reply-To: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
References: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
Message-ID: <EDED724C-3219-45FF-BAF2-592EEEBCB634@uiuc.edu>

It probably wouldn't be hard to write one up, particularly if it's  
got an already parsable format.  We could probably base it off the
current clustalw wrapper unless someone else thinks there is a better  
way.

chris

On Aug 28, 2007, at 5:13 PM, Hilmar Lapp wrote:

> Is anyone thinking about adding support for this as an aligner
> option? I'm not sure whether aside from a Bio::Tools::Run module we'd
> also need a format parser - it sounds like it's emitting clustalw
> format?
>
> 	-hilmar
>
> Begin forwarded message:
>
>> From: evoldir at evol.biology.mcmaster.ca
>> Date: August 28, 2007 3:02:19 AM EDT
>> Subject: Other:  Announcing Ngila 1.2.1 Alignment Program
>> Reply-To: racartwr at ncsu.edu
>>
>>
>> Ngila is a global, pairwise alignment program that uses logarithmic
>> and
>> affine gap costs, i.e. C(g) = a+b*g+c*ln(g).  These gap costs are  
>> more
>> biologically realistic than the more popular (and efficient) affine
>> gap
>> cost model.
>>
>> I have recently completed updating the program to version 1.2.1.  The
>> new version includes two new, evolutionary alignment models based
>> on my
>> current research.  These models allow you to find the maximum
>> alignment
>> of two sequences based on biological, evolutionary parameters---no
>> more
>> guessing at biological costs.  Additional changes are noted on the
>> website.
>>
>> Website & Manual:
>>
>> http://scit.us/projects/ngila/
>>
>> Windows Binary:
>>
>> http://scit.us/projects/files/ngila/Releases/ngila-release-win32.zip
>>
>> Unix/Mac Source Code:
>>
>> http://scit.us/projects/files/ngila/Releases/ngila-release.tar.gz
>>
>> I'll be happy to answer any questions users have about the new
>> models or
>> the program.
>>
>> -- 
>> *********************************************************
>> Reed A. Cartwright, PhD     http://scit.us/
>> Postdoctoral Researcher     http://www.dererumnatura.us/
>> Department of Genetics      http://www.pandasthumb.org/
>>
>> Bioinformatics Research Center
>> North Carolina State University
>> Campus Box 7566
>> Raleigh, NC 27695-7566
>>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Wed Aug 29 12:03:07 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 11:03:07 -0500
Subject: [Bioperl-l] remote SwissProt server problems
Message-ID: <6805F552-9947-4C28-B846-47B5501B31DF@uiuc.edu>

Just as a notice, DBFetch is currently retrieving only single records  
for the UniProtKB database (where Bio::DB::SwissProt fetches  
sequences).  If anyone runs remote server tests and DB.t in the test
suite you'll see a failure towards the end which indicates this.   
I've posted a notice to the server help desk and will respond when I  
hear more.

chris

From cain.cshl at gmail.com  Wed Aug 29 15:45:48 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Wed, 29 Aug 2007 15:45:48 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
Message-ID: <1188416748.2567.36.camel@localhost.localdomain>

Hi Chris,

I just wanted to let you know that I was out of town for a few days, but
now I'm back and I'm doing testing of GMOD software based on the branch
you are working on.  I'll let you know how it goes, but don't let me
stop you if you are confident of your changes.  I'm sure whatever goes
wrong, it will just point out holes in the FeatureIO tests (I'm sure
there are plenty) and will require hopefully minimal changes on my end.

Thanks for your considerable efforts on this!  (Regardless of how much
work it makes for me :-)
Scott


On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
> I'm now wrapping up the Feature/Annotation rollback.  I will probably  
> start merging back to the main branch in the next day or two, as
> soon as interested parties (*cough*devs*cough*) look over the last  
> batch of changes.
> 
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
> 
> I have also added a small benchmark test which indicates a decrease  
> in parsing time in SeqIO::genbank with all tests passing.  I expect  
> this will translate over to any Bio::SeqFeature::Generic-using class  
> (open mouth, prepare to insert foot....).
> 
> It is also possible there are still some instances where overloading  
> is expected lurking about in the ~1000 or so modules, so I'll leave  
> the exceptions I added to all Bio::AnnotationI; we can remove them  
> down the line, maybe prior to rel1.6, after more tests are added or  
> if they get particularly annoying.  My guess is I caught 99.99% of  
> them (prepare to insert other foot....).
> 
> The key change in this last round is the addition of several class  
> *dbxref* methods to Bio::Ontology::Term and  
> Bio::Annotation::OntologyTerm, all of which are capable of working  
> with either DBLink instances or simple scalars.  This was primarily  
> done in order to clear up inconsistencies in the older *dblink*  
> methods, which were ambiguous (some indicate simple scalar
> arguments, others DBLink objects); operator overloading was used  
> extensively in these cases, which led to several issues.  I have  
> added deprecation warnings to the older methods which now map to  
> using the newer methods.  All tests pass with the exception of a few  
> already failing on the MAIN branch; the single test which needs to be  
> fixed is a round-tripping error in swiss.t (now a TODO), which can be  
> fixed after merging back.
> 
> Please respond to this if there are any questions or if I need to  
> clarify the changes I made a bit more.
> 
> chris
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From cjfields at uiuc.edu  Wed Aug 29 16:13:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 15:13:17 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188416748.2567.36.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
Message-ID: <8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>

I'll probably go ahead and start merging this stuff over to CVS HEAD  
then.  There haven't been any objections so far.

The page I posted outlines the more critical fixes, primarily the  
changes to Bio::Ontology::Term methods (along with relevant code) due  
to inconsistencies in the interface.  The Bio::Annotation classes  
also now throw if you attempt to use them in an overloaded context.   
I also split off SeqFeature::Annotated tests into its own test suite
(SeqFeatAnnotated.t).

Let me know if there are any problems along the way!

chris

On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:

> Hi Chris,
>
> I just wanted to let you know that I was out of town for a few  
> days, but
> now I'm back and I'm doing testing of GMOD software based on the  
> branch
> you are working on.  I'll let you know how it goes, but don't let me
> stop you if you are confident of your changes.  I'm sure whatever goes
> wrong, it will just point out holes in the FeatureIO tests (I'm sure
> there are plenty) and will require hopefully minimal changes on my  
> end.
>
> Thanks for your considerable efforts on this!  (Regardless of how much
> work it makes for me :-)
> Scott
>
>
> On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
>> I'm now wrapping up the Feature/Annotation rollback.  I will probably
>> start merging back to the main branch in the next day or two, as
>> soon as interested parties (*cough*devs*cough*) look over the last
>> batch of changes.
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>>
>> I have also added a small benchmark test which indicates a decrease
>> in parsing time in SeqIO::genbank with all tests passing.  I expect
>> this will translate over to any Bio::SeqFeature::Generic-using class
>> (open mouth, prepare to insert foot....).
>>
>> It is also possible there are still some instances where overloading
>> is expected lurking about in the ~1000 or so modules, so I'll leave
>> the exceptions I added to all Bio::AnnotationI; we can remove them
>> down the line, maybe prior to rel1.6, after more tests are added or
>> if they get particularly annoying.  My guess is I caught 99.99% of
>> them (prepare to insert other foot....).
>>
>> The key change in this last round is the addition of several class
>> *dbxref* methods to Bio::Ontology::Term and
>> Bio::Annotation::OntologyTerm, all of which are capable of working
>> with either DBLink instances or simple scalars.  This was primarily
>> done in order to clear up inconsistencies in the older *dblink*
>> methods, which were ambiguous (some indicate simple scalar
>> arguments, others DBLink objects); operator overloading was used
>> extensively in these cases, which led to several issues.  I have
>> added deprecation warnings to the older methods which now map to
>> using the newer methods.  All tests pass with the exception of a few
>> already failing on the MAIN branch; the single test which needs to be
>> fixed is a round-tripping error in swiss.t (now a TODO), which can be
>> fixed after merging back.
>>
>> Please respond to this if there are any questions or if I need to
>> clarify the changes I made a bit more.
>>
>> chris
> -- 
> ---------------------------------------------------------------------- 
> --
> Scott Cain, Ph. D.                                          
> cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                      
> 216-392-3087
> Cold Spring Harbor Laboratory

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jay at jays.net  Wed Aug 29 18:11:55 2007
From: jay at jays.net (Jay Hannah)
Date: Wed, 29 Aug 2007 17:11:55 -0500
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
Message-ID: <46D5EF2B.5000101@jays.net>

Please slap me if I'm hysterical.

I'm seeking a broad bioinformatics search engine platform. I want to 
take gobs of data in gobs of formats and allow people to search it on 
the web.

- Entrez is awesome. Unfortunately I don't see anything in the NCBI 
toolkit that helps me run my own version of it. Even a tiny one. After 
an initial "check out our toolkit" response from NCBI I don't seem to be 
getting anywhere. Maybe I'm not communicating enough or well enough.

- EB-eye Search is slick. I don't see any developer kit or source code 
of any kind and I've gotten no response to my emails to them.

- LuceGene is very cool. But it looks like no one has touched it in 2.5 
years and I've gotten no response from their contact email address. I'm 
especially intrigued by their

  src/LuceGene/src/org/eugenes/index/LuceneReadseqIndexer.java

which seems to use the rather popular(?) Java Readseq to populate Lucene 
with source data in all sorts of different formats.

I don't know Java.

- Solr is really neat. It's easy to install and gives a simple/powerful 
XML API to populate a Lucene index.

... so ...

I'm thinking BioPerl knows how to parse lots of formats into a Bio::Seq.

I'm thinking I could write Perl which would take a Bio::Seq object and 
convert it to an XML file which Solr would happily inject into Lucene 
for me.

If I could do that I'm thinking that any of the many formats that 
Bio::SeqIO can slurp could magically be sent into a Lucene index for 
searching.
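
The conversion step described above might be sketched as follows. Treat it as an assumption-laden illustration: it relies on Solr's XML update message format (an add element containing doc elements made of named field elements), and the field names id, description and sequence are hypothetical and would have to match whatever the Solr schema defines. A plain hash stands in for the Bio::Seq object so the sketch runs without BioPerl.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Build a Solr <add> message from a list of hash refs (one per document).
sub solr_add_xml {
    my (@docs) = @_;
    my $xml = "<add>\n";
    for my $doc (@docs) {
        $xml .= "  <doc>\n";
        for my $name (sort keys %$doc) {
            $xml .= sprintf qq{    <field name="%s">%s</field>\n},
                xml_escape($name), xml_escape($doc->{$name});
        }
        $xml .= "  </doc>\n";
    }
    return $xml . "</add>\n";
}

# With BioPerl these values might come from $seq->display_id, $seq->desc and
# $seq->seq; the literal values here are invented for the example.
my $xml = solr_add_xml({
    id          => 'SEQ0001',
    description => 'hypothetical protein <example>',
    sequence    => 'MKTAYIAKQR',
});
print $xml;
```

Looping over Bio::SeqIO's next_seq() and feeding each record through a function like this would cover any format SeqIO can read; posting the result to Solr is left out of the sketch.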

I'm thinking that would be really cool and I'm going to write it.

Now's your chance to slap me.

Since I haven't started yet, what would I call this thing? 
Bio::SeqIO::Solr?  (and I wouldn't implement the I part?)

Thanks,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


More notes:
http://clab.ist.unomaha.edu/CLAB/index.php/RT11


From hlapp at gmx.net  Wed Aug 29 21:37:59 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 29 Aug 2007 21:37:59 -0400
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
In-Reply-To: <46D5EF2B.5000101@jays.net>
References: <46D5EF2B.5000101@jays.net>
Message-ID: <D202078D-8F88-4FAA-94EA-8C08CE653C41@gmx.net>


On Aug 29, 2007, at 6:11 PM, Jay Hannah wrote:

> [...]
>
> I'm thinking I could write Perl which would take a Bio::Seq object and
> convert it to an XML file which Solr would happily inject into Lucene
> for me.
>
> If I could do that I'm thinking that any of the many formats that
> Bio::SeqIO can slurp could magically be sent into a Lucene index for
> searching.
>
> [...]
> Since I haven't started yet, what would I call this thing?
> Bio::SeqIO::Solr?  (and I wouldn't implement the I part?)

Would this be a Solr-specific XML writer? Or could you use an  
existing XML format for sequences?

(as an aside, if you do need a Solr-specific format writer, my  
suggestion would be to name it solrxml [lowercase])

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Aug 29 22:01:45 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 21:01:45 -0500
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
In-Reply-To: <46D5EF2B.5000101@jays.net>
References: <46D5EF2B.5000101@jays.net>
Message-ID: <0FF63232-25DE-4676-8C06-B9B00BE28349@uiuc.edu>


On Aug 29, 2007, at 5:11 PM, Jay Hannah wrote:

> Please slap me if I'm hysterical.
>
> I'm seeking a broad bioinformatics search engine platform. I want to
> take gobs of data in gobs of formats and allow people to search it on
> the web.
>
> - Entrez is awesome. Unfortunately I don't see anything in the NCBI
> toolkit that helps me run my own version of it. Even a tiny one. After
> an initial "check out our toolkit" response from NCBI I don't seem  
> to be
> getting anywhere. Maybe I'm not communicating enough or well enough.

No.  I have had non-responses before from NCBI; they may just be too  
busy.  Warnock probably applies.

> - EB-eye Search is slick. I don't see any developer kit or source code
> of any kind and I've gotten no response to my emails to them.

Not sure of this one personally.

> - LuceGene is very cool.
> ...
> I don't know Java.

...but you could write a (perl) wrapper around it.  You can try  
contacting Don Gilbert about it, though I think he's been trying out  
Chado.

> - Solr is really neat. It's easy to install and gives a simple/ 
> powerful
> XML API to populate a Lucene index.
> ... so ...
>
> I'm thinking BioPerl knows how to parse lots of formats into a  
> Bio::Seq.
>
> ...
>
> I'm thinking that would be really cool and I'm going to write it.
>
> Now's your chance to slap me.

No need.

> Since I haven't started yet, what would I call this thing?
> Bio::SeqIO::Solr?  (and I wouldn't implement the I part?)
>
> Thanks,
>
> Jay Hannah
> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah
>
> More notes:
> http://clab.ist.unomaha.edu/CLAB/index.php/RT11

The way I would go about it is use an established XML schema as a  
starting point and implement a writer (if bioperl doesn't already  
support it).  It's better than reinventing (a constantly reinvented)  
wheel and starting up a brand-new schema of your own.  INSDSeq  
(http://www.insdc.org/page.php?page=xmlstatus) is one I've been  
wanting to add for a while but haven't had time to work on; there are  
several other examples.  Note that a few of the currently supported  
ones in bioperl, such as bsml and game, have had very little to no  
development over the years in favor of newer (better?) XML flavors,  
so it likely isn't worth working with those.

chris


From hlapp at gmx.net  Wed Aug 29 22:02:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 29 Aug 2007 22:02:45 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
Message-ID: <E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>


On Aug 28, 2007, at 5:05 PM, Chris Fields wrote:

> I'm now wrapping up the Feature/Annotation rollback.  I will probably
> start merging back to the main branch in the next day or two, as
> soon as interested parties (*cough*devs*cough*) look over the last
> batch of changes.
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>
> [...]
> It is also possible there are still some instances where overloading
> is expected lurking about in the ~1000 or so modules, so I'll leave
> the exceptions I added to all Bio::AnnotationI

Keep in mind that code such as

	if ($ann) { ... }

is mostly not b/c someone wanted to use overloading, but rather  
someone was lazy and really meant to say

	if (defined($ann)) { ... }

In the absence of eq overloading, these will behave identically. So  
if you leave the exceptions in it is sort-of policing lazy  
programmers, which I guess is fine in principle, but is guaranteed to  
trip up a lot of script code. I'd take it out if you're reasonably  
sure that at least within BioPerl itself those lazy programming  
incidents are removed.
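
To make the difference concrete, here is a small self-contained sketch. The class AnnSketch is hypothetical; it only mimics the idea of an overload that throws and is not the code on the branch. A plain truth test on the object goes through the overload and dies, while defined() never touches it.

```perl
use strict;
use warnings;

# Hypothetical annotation class whose stringification throws, loosely
# mimicking an exception placed inside an overloaded operator.
package AnnSketch {
    use overload
        '""'     => sub { die "stringification overload used\n" },
        fallback => 1;   # lets Perl derive the boolean test from '""'

    sub new { return bless {}, shift }
}

my $ann = AnnSketch->new;

# Boolean context goes through the overload, so this block dies.
my $died = !eval { my $flag = $ann ? 1 : 0; 1 };
my $err  = $@;
print "if (\$ann): ", ($died ? "died: $err" : "no exception\n");

# defined() only checks that the scalar holds a value; no overload is called.
print "defined(\$ann): ", (defined($ann) ? "true" : "false"), "\n";
```

This is why script code written as if ($ann) would start failing if the exceptions stay in, even though the author never intended to rely on overloading.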

> [...]
> The key change in this last round is the addition of several class
> *dbxref* methods to Bio::Ontology::Term and
> Bio::Annotation::OntologyTerm, all of which are capable of working
> with either DBLink instances or simple scalars.

I don't think you need the code here to deal with both scalars and  
objects. It is fine I think to define the new methods from the outset  
to consistently accept and return DBLink objects, and period.

The backwards compatibility logic should rather be in the *_dblink*()  
methods; i.e., instead of simple aliases they should have the code to  
map to and from the new API. That way, once the deprecation cycle  
ends, they can be removed, and with them all the legacy code that now  
is no longer needed, whereas if you have that in the new methods, it  
keeps bothering the maintainers.
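
A rough sketch of that layering, under stated assumptions: TermSketch and DBLinkSketch are hypothetical stand-ins, not Bio::Ontology::Term or Bio::Annotation::DBLink, and the -dbxrefs argument name is invented for the example. The new methods deal only in objects, and all scalar conversion sits in the deprecated methods.

```perl
use strict;
use warnings;

# Hypothetical stand-ins; just enough structure to show the mapping.
package DBLinkSketch {
    sub new        { my ($class, %a) = @_; return bless {%a}, $class }
    sub primary_id { return $_[0]{primary_id} }
}

package TermSketch {
    sub new { return bless { dbxrefs => {} }, shift }

    # New API: objects in, objects out, with an optional context.
    sub add_dbxref {
        my ($self, %args) = @_;
        my $context = defined $args{-context} ? $args{-context} : '_default';
        push @{ $self->{dbxrefs}{$context} }, @{ $args{-dbxrefs} };
        return;
    }

    sub get_dbxrefs {
        my ($self, $context) = @_;
        $context = '_default' unless defined $context;
        return @{ $self->{dbxrefs}{$context} || [] };
    }

    # Old API: all scalar <-> object conversion lives here, so it can be
    # deleted together with these methods when the deprecation cycle ends.
    sub add_dblink {
        my ($self, @ids) = @_;
        warn "add_dblink() is deprecated; use add_dbxref()\n";
        my @links = map { DBLinkSketch->new(primary_id => $_) } @ids;
        return $self->add_dbxref(-dbxrefs => \@links);
    }

    sub get_dblinks {
        my ($self) = @_;
        warn "get_dblinks() is deprecated; use get_dbxrefs()\n";
        return map { $_->primary_id } $self->get_dbxrefs;
    }
}

my $term = TermSketch->new;
$term->add_dbxref(-dbxrefs => [ DBLinkSketch->new(primary_id => 'GO:0000001') ]);
$term->add_dblink('EC:1.1.1.1');             # legacy call, warns
print join(', ', $term->get_dblinks), "\n";  # legacy call, warns
```

Removing add_dblink() and get_dblinks() later leaves the new methods untouched, which is the maintenance benefit argued for above.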

You also mention an add_dbxref_context() on the wiki page - I'm not
sure why that would be needed given that you build in the -context  
option to add_dbxref() from the outset. But maybe I've glossed over  
some detail.

Once this is merged back to the main trunk, I guess we need to give  
Bio::SeqFeature::TypedSeqFeatureI a thorough look and make sure it  
makes real sense.

Thanks Chris for this effort, this clears a monumental roadblock.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Aug 29 23:23:14 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 22:23:14 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
Message-ID: <A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>


On Aug 29, 2007, at 9:02 PM, Hilmar Lapp wrote:

>
> On Aug 28, 2007, at 5:05 PM, Chris Fields wrote:
>
>> I'm now wrapping up the Feature/Annotation rollback.  I will probably
>> start merging back to the main branch in the next day or two, as
>> soon as interested parties (*cough*devs*cough*) look over the last
>> batch of changes.
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>>
>> [...]
>> It is also possible there are still some instances where overloading
>> is expected lurking about in the ~1000 or so modules, so I'll leave
>> the exceptions I added to all Bio::AnnotationI
>
> Keep in mind that code such as
>
> 	if ($ann) { ... }
>
> is mostly not b/c someone wanted to use overloading, but rather
> someone was lazy and really meant to say
>
> 	if (defined($ann)) { ... }

Agreed.

> In the absence of eq overloading, these will behave identically. So
> if you leave the exceptions in it is sort-of policing lazy
> programmers, which I guess is fine in principle, but is guaranteed to
> trip up a lot of script code. I'd take it out if you're reasonably
> sure that at least within BioPerl itself those lazy programming
> incidents are removed.

I agree the overload exceptions shouldn't be left in.  The problem is  
I'm not certain we have caught most implicit overload calls (just the  
ones tested for).  Scott's checking everything against GMOD, though,  
so we can remove them after that.

>> [...]
>> The key change in this last round is the addition of several class
>> *dbxref* methods to Bio::Ontology::Term and
>> Bio::Annotation::OntologyTerm, all of which are capable of working
>> with either DBLink instances or simple scalars.
>
> I don't think you need the code here to deal with both scalars and
> objects. It is fine I think to define the new methods from the outset
> to consistently accept and return DBLink objects, and period.
>
> The backwards compatibility logic should rather be in the *_dblink*()
> methods; i.e., instead of simple aliases they should have the code to
> map to and from the new API. That way, once the deprecation cycle
> ends, they can be removed, and with them all the legacy code that now
> is no longer needed, whereas if you have that in the new methods, it
> keeps bothering the maintainers.

That should be easy enough to fix and would be more consistent.  I  
can look over the various calls to dbxref methods and see what needs  
to be done, then fix that in cvs.

> You also mention an add_dbxref_context() on the wiki page - I'm not
> sure why that would be needed given that you build in the -context
> option to add_dbxref() from the outset. But maybe I've glossed over
> some detail.

The -context parameter was in get_dbxref(), to grab those DBLinks in  
a particular context.  We could do the same with add_dbxref() (pass  
DBLinks in first arg as array ref, context as second arg).  That  
would then obviate the need for add_dbxref_context().

I'll also change the parameter passing in get_dbxref() to just accept  
context as a single optional argument since we're dealing with only  
DBLink instances now.
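
A rough sketch of that calling convention, under stated assumptions:
My::Term is a hypothetical class (not Bio::Ontology::Term), the plain
hash references stand in for DBLink instances, the ids are made-up
placeholders, and the '_default' bucket for links added without a
context is my own invention for the example.

```perl
use strict;
use warnings;

package My::Term;    # hypothetical; not the real Bio::Ontology::Term

sub new { return bless { dbxrefs => {} }, shift }

# add_dbxref(\@links, $context): links as an array ref, context second.
sub add_dbxref {
    my ($self, $links, $context) = @_;
    die "first argument must be an array reference\n"
        unless ref($links) eq 'ARRAY';
    $context = '_default' unless defined $context;   # assumed default
    push @{ $self->{dbxrefs}{$context} }, @$links;
    return scalar @$links;
}

# get_dbxref($context): one optional argument; every link when omitted.
sub get_dbxref {
    my ($self, $context) = @_;
    return @{ $self->{dbxrefs}{$context} || [] } if defined $context;
    return map { @{ $self->{dbxrefs}{$_} } }
        sort keys %{ $self->{dbxrefs} };
}

package main;

my $term = My::Term->new;

# Placeholder link records; real code would pass DBLink objects.
$term->add_dbxref([ { db => 'DB1', id => 'ID1' } ], 'function');
$term->add_dbxref([ { db => 'DB2', id => 'ID2' },
                    { db => 'DB3', id => 'ID3' } ]);

my @function = $term->get_dbxref('function');
my @all      = $term->get_dbxref;
printf "function: %d, all: %d\n", scalar @function, scalar @all;
```

With the context folded into add_dbxref() this way, a separate
add_dbxref_context() has nothing left to do.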

> Once this is merged back to the main trunk, I guess we need to give
> Bio::SeqFeature::TypedSeqFeatureI a thorough look and make sure it
> makes real sense.

It describes one method, ontology_term(), which returns a
Bio::Ontology::TermI.  This is similar to
SeqFeature::Annotated::type(), which returns a
Bio::Annotation::OntologyTerm (a Bio::Ontology::TermI).  My thought is
to simply deprecate type() in favor of
TypedSeqFeatureI::ontology_term().

> Thanks Chris for this effort, this clears a monumental roadblock.
>
> 	-hilmar

No problem.  It just needed to be done.

chris

From florent.angly at gmail.com  Wed Aug 29 23:44:58 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 29 Aug 2007 20:44:58 -0700
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
Message-ID: <46D63D3A.6050308@gmail.com>

Hilmar Lapp wrote:
> Keep in mind that code such as
>
> 	if ($ann) { ... }
>
> is mostly not b/c someone wanted to use overloading, but rather  
> someone was lazy and really meant to say
>
> 	if (defined($ann)) { ... }
>
> In the absence of eq overloading, these will behave identically. So  
> if you leave the exceptions in it is sort-of policing lazy  
> programmers, which I guess is fine in principle, but is guaranteed to  
> trip up a lot of script code. I'd take it out if you're reasonably  
> sure that at least within BioPerl itself those lazy programming  
> incidents are removed.
	if ($ann) { ... }

and 

	if (defined($ann)) { ... }

are not the same.

	if ($ann)

is evaluated false for an empty string like

        $ann = '';

and for a value of zero, i.e.

	$ann = 0;

while

	defined($ann)

returns true in these 2 cases.
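
A small script makes the difference visible. This is only an
illustration: the object is a blessed empty hash under a made-up class
name (My::Annotation), not a real Bio::AnnotationI, and the helper
truth_and_defined() is introduced here just for the demonstration.

```perl
use strict;
use warnings;

# Stand-in object; 'My::Annotation' is a made-up class, not part of BioPerl.
my $obj = bless {}, 'My::Annotation';

# Return [is_true, is_defined] for a value, as 1/0 flags.
sub truth_and_defined {
    my ($value) = @_;
    return [ ($value ? 1 : 0), (defined($value) ? 1 : 0) ];
}

my @cases = (
    [ 'undef',        undef ],
    [ 'empty string', ''    ],
    [ 'number 0',     0     ],
    [ 'string "0"',   '0'   ],
    [ 'object',       $obj  ],
);

for my $case (@cases) {
    my ($label, $value) = @$case;
    my ($true, $def) = @{ truth_and_defined($value) };
    printf "%-12s  if(\$x): %d   defined(\$x): %d\n", $label, $true, $def;
}
```

The two tests agree for undef and for an ordinary object reference, and
disagree for '', 0 and '0'.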

Florent


From cjfields at uiuc.edu  Wed Aug 29 23:54:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 22:54:05 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <46D63D3A.6050308@gmail.com>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<46D63D3A.6050308@gmail.com>
Message-ID: <90C3DE31-12FD-4BF3-B9F7-0FB5E1DE2A28@uiuc.edu>


On Aug 29, 2007, at 10:44 PM, Florent Angly wrote:

> Hilmar Lapp wrote:
>> Keep in mind that code such as
>>
>> 	if ($ann) { ... }
>>
>> is mostly not b/c someone wanted to use overloading, but rather   
>> someone was lazy and really meant to say
>>
>> 	if (defined($ann)) { ... }
>>
>> In the absence of eq overloading, these will behave identically.
>> So if you leave the exceptions in it is sort-of policing lazy
>> programmers, which I guess is fine in principle, but is guaranteed
>> to trip up a lot of script code. I'd take it out if you're
>> reasonably sure that at least within BioPerl itself those lazy
>> programming incidents are removed.
> 	if ($ann) { ... }
>
> and
> 	if (defined($ann)) { ... }
>
> are not the same.
>
> 	if ($ann)
>
> is evaluated false for an empty string like
>
>        $ann = '';
>
> and for a value of zero, i.e.
>
> 	$ann = 0;
>
> while
>
> 	defined($ann)
>
> returns true in these 2 cases.
>
> Florent

I agree, but we're talking about the context in which this test is  
performed, where $ann is either an instance of a Bio::AnnotationI or  
undef (not a scalar value or '').  In this case it works both as 'if  
($ann)' or 'if (defined($ann))', though the latter is preferred.   
Never underestimate laziness!
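
For anyone wondering why the explicit form matters while the exceptions
are in place, here is a minimal sketch. My::StrictAnnotation is a
hypothetical class, and its throwing overload handlers are an assumption
about the general mechanism, not a copy of what the Bio::Annotation
classes do. The point is that a bare boolean test dispatches to the
overload (and dies), while defined() never touches it.

```perl
use strict;
use warnings;

package My::StrictAnnotation;    # hypothetical; not a BioPerl class

# Throw whenever the object is used in boolean or string context.
use overload
    'bool' => sub { die "boolean overload used\n" },
    '""'   => sub { die "string overload used\n" };

sub new {
    my ($class, %args) = @_;
    return bless { value => $args{-value} }, $class;
}
sub value { return $_[0]->{value} }

package main;

my $ann = My::StrictAnnotation->new(-value => 'kinase');

# defined() inspects the reference itself and never calls the overload.
my $defined_ok = defined($ann) ? 1 : 0;

# A plain boolean test dispatches to the 'bool' handler, which throws.
my $bool_ok    = eval { $ann ? 1 : 0 };
my $bool_error = $@;

print "defined(\$ann) gave $defined_ok\n";
print "if (\$ann) died with: $bool_error" unless defined $bool_ok;
print "value() still works: ", $ann->value, "\n";
```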

chris

From cain.cshl at gmail.com  Wed Aug 29 23:59:11 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Wed, 29 Aug 2007 23:59:11 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <46D63D3A.6050308@gmail.com>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<46D63D3A.6050308@gmail.com>
Message-ID: <1188446351.2567.55.camel@localhost.localdomain>

Hi Florent,

Of course what you wrote below is true, but what Hilmar was writing
about was lazy programmers (like me) who assume that the empty string
and 0 value cases aren't going to happen (because we happen to know they
never should in certain contexts), and so use 'if ($ann)'.  Of course,
at the moment, I am in the process of de-lazifying my code (though I
tended to think of it as being efficient :-)

Scott


On Wed, 2007-08-29 at 20:44 -0700, Florent Angly wrote:
> Hilmar Lapp wrote:
> > Keep in mind that code such as
> >
> > 	if ($ann) { ... }
> >
> > is mostly not b/c someone wanted to use overloading, but rather  
> > someone was lazy and really meant to say
> >
> > 	if (defined($ann)) { ... }
> >
> > In the absence of eq overloading, these will behave identically. So  
> > if you leave the exceptions in it is sort-of policing lazy  
> > programmers, which I guess is fine in principle, but is guaranteed to  
> > trip up a lot of script code. I'd take it out if you're reasonably  
> > sure that at least within BioPerl itself those lazy programming  
> > incidents are removed.
> 	if ($ann) { ... }
> 
> and 
> 
> 	if (defined($ann)) { ... }
> 
> are not the same.
> 
> 	if ($ann)
> 
> is evaluated false for an empty string like
> 
>         $ann = '';
> 
> and for a value of zero, i.e.
> 
> 	$ann = 0;
> 
> while
> 
> 	defined($ann)
> 
> returns true in these 2 cases.
> 
> Florent
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From cain.cshl at gmail.com  Thu Aug 30 00:05:06 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 00:05:06 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
Message-ID: <1188446706.2567.59.camel@localhost.localdomain>

Hi Chris,

Is there a reason that the value method of
Bio::Annotation::SimpleValue (and possibly some of its siblings) is
returning "Value: $value"?  It didn't use to have the "Value: " prefix,
did it?

Thanks,
Scott


On Wed, 2007-08-29 at 15:13 -0500, Chris Fields wrote:
> I'll probably go ahead and start merging this stuff over to CVS HEAD  
> then.  There haven't been any objections so far.
> 
> The page I posted outlines the more critical fixes, primarily the  
> changes to Bio::Ontology::Term methods (along with relevant code) due  
> to inconsistencies in the interface.  The Bio::Annotation classes  
> also now throw if you attempt to use them in an overloaded context.   
> I also split off SeqFeature::Annotated tests into its own test suite  
> (SeqFeatAnnotated.t).
> 
> Let me know if there are any problems along the way!
> 
> chris
> 
> On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:
> 
> > Hi Chris,
> >
> > I just wanted to let you know that I was out of town for a few  
> > days, but
> > now I'm back and I'm doing testing of GMOD software based on the  
> > branch
> > you are working on.  I'll let you know how it goes, but don't let me
> > stop you if you are confident of your changes.  I'm sure whatever goes
> > wrong, it will just point out holes in the FeatureIO tests (I'm sure
> > there are plenty) and will require hopefully minimal changes on my  
> > end.
> >
> > Thanks for your considerable efforts on this!  (Regardless of how much
> > work it makes for me :-)
> > Scott
> >
> >
> > On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
> >> I'm now wrapping up the Feature/Annotation rollback.  I will probably
> >> start merging back to the main branch in the next day or two, as
> >> soon as interested parties (*cough*devs*cough*) look over the last
> >> batch of changes.
> >>
> >> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
> >>
> >> I have also added a small benchmark test which indicates a decrease
> >> in parsing time in SeqIO::genbank with all tests passing.  I expect
> >> this will translate over to any Bio::SeqFeature::Generic-using class
> >> (open mouth, prepare to insert foot....).
> >>
> >> It is also possible there are still some instances where overloading
> >> is expected lurking about in the ~1000 or so modules, so I'll leave
> >> the exceptions I added to all Bio::AnnotationI; we can remove them
> >> down the line, maybe prior to rel1.6, after more tests are added or
> >> if they get particularly annoying.  My guess is I caught 99.99% of
> >> them (prepare to insert other foot....).
> >>
> >> The key change in this last round is the addition of several class
> >> *dbxref* methods to Bio::Ontology::Term and
> >> Bio::Annotation::OntologyTerm, all of which are capable of working
> >> with either DBLink instances or simple scalars.  This was primarily
> >> done in order to clear up inconsistencies in the older *dblink*
> >> methods, which were ambiguous (some indicate simple scalar
> >> arguments, others DBLink objects); operator overloading was used
> >> extensively in these cases, which led to several issues.  I have
> >> added deprecation warnings to the older methods which now map to
> >> using the newer methods.  All tests pass with the exception of a few
> >> already failing on the MAIN branch; the single test which needs to be
> >> fixed is a round-tripping error in swiss.t (now a TODO), which can be
> >> fixed after merging back.
> >>
> >> Please respond to this if there are any questions or if I need to
> >> clarify the changes I made a bit more.
> >>
> >> chris
> > -- 
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                         cain at cshl.edu
> > GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> > Cold Spring Harbor Laboratory
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cjfields at uiuc.edu  Thu Aug 30 00:17:18 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 23:17:18 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188446706.2567.59.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
Message-ID: <CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>

It shouldn't; that sounds like the output of as_text().  value()  
should just return the scalar value.

As a note, I added a new method, display_text(), for all  
Bio::AnnotationI classes which by default replicates the same output  
that stringification overloads produced.  So you should be able to  
explicitly call $ann->display_text for any Bio::AnnotationI where you  
once used an implicit call:

# old
print "$ann\n";

# new
print $ann->display_text,"\n";

chris

On Aug 29, 2007, at 11:05 PM, Scott Cain wrote:

> Hi Chris,
>
> Is there a reason that the value method of
> Bio::Annotation::SimpleValue (and possibly some of its siblings) is
> returning "Value: $value"?  It didn't use to have the "Value: " prefix,
> did it?
>
> Thanks,
> Scott

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From neetisomaiya at gmail.com  Thu Aug 30 00:47:53 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 30 Aug 2007 10:17:53 +0530
Subject: [Bioperl-l] kegg xml parsing
Message-ID: <764978cf0708292147q4ead37b0i782b83ecda8ce3da@mail.gmail.com>

Hi,

Has anyone used XML::Twig for parsing of kegg xml data?
I was looking for some small example code of the same.

Thanks.
-- 
-Neeti
Even my blood says, B positive

From sdavis2 at mail.nih.gov  Thu Aug 30 06:16:54 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 30 Aug 2007 06:16:54 -0400
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
In-Reply-To: <0FF63232-25DE-4676-8C06-B9B00BE28349@uiuc.edu>
References: <46D5EF2B.5000101@jays.net>
	<0FF63232-25DE-4676-8C06-B9B00BE28349@uiuc.edu>
Message-ID: <46D69916.4060202@mail.nih.gov>

Chris Fields wrote:
> On Aug 29, 2007, at 5:11 PM, Jay Hannah wrote:
> 
>> Please slap me if I'm hysterical.
>>
>> I'm seeking a broad bioinformatics search engine platform. I want to
>> take gobs of data in gobs of formats and allow people to search it on
>> the web.

Not sure how it might or might not meet your needs, but have you looked
at SRS (Sequence Retrieval System)?  I have never tried to use it,
personally, though.

Sean

From cjfields at uiuc.edu  Thu Aug 30 09:17:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 08:17:17 -0500
Subject: [Bioperl-l] remote SwissProt server problems
In-Reply-To: <6805F552-9947-4C28-B846-47B5501B31DF@uiuc.edu>
References: <6805F552-9947-4C28-B846-47B5501B31DF@uiuc.edu>
Message-ID: <62B4DE62-C11E-4E75-837C-6C1005FB12A4@uiuc.edu>

This should be fixed now (DBFetch-related tests pass, though MeSH  
tests are now failing!).

chris

On Aug 29, 2007, at 11:03 AM, Chris Fields wrote:

> Just as a notice, DBFetch is currently retrieving only single records
> for the UniProtKB database (where Bio::DB::SwissProt fetches
> sequences).  If anyone runs remote server tests and DB.t in the test
> suite you'll see a failure towards the end which indicates this.
> I've posted a notice to the server help desk and will respond when I
> hear more.
>
> chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cain.cshl at gmail.com  Thu Aug 30 10:39:59 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 10:39:59 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
Message-ID: <1188484799.2567.84.camel@localhost.localdomain>

Hi Chris,

I see--I was using as_text and getting the "Value: $value"; there are
places in my code where I have always used ->value and I thought that
the way it was working had changed.

What is the use case for having the as_text method work the way it does?

Thanks,
Scott


On Wed, 2007-08-29 at 23:17 -0500, Chris Fields wrote:
> It shouldn't; that sounds like the output of as_text().  value()  
> should just return the scalar value.
> 
> As a note, I added a new method, display_text(), for all  
> Bio::AnnotationI classes which by default replicates the same output  
> that stringification overloads produced.  So you should be able to  
> explicitly call $ann->display_text for any Bio::AnnotationI where you  
> once used an implicit call:
> 
> # old
> print "$ann\n";
> 
> # new
> print $ann->display_text,"\n";
> 
> chris
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cain.cshl at gmail.com  Thu Aug 30 11:46:24 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 11:46:24 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
Message-ID: <1188488785.2567.93.camel@localhost.localdomain>

Hi Chris,

Good news!  I only had to add a few defineds and a few display_texts and
I was able to successfully create a database and load the yeast GFF3
file.  While I want to do more testing with GFF from other sources,
clearly, I am 95% of the way there with relatively little work.

Nice job and Thanks!
Scott


On Wed, 2007-08-29 at 23:17 -0500, Chris Fields wrote:
> It shouldn't, that sounds like the output for add_text().  value()  
> should just return the scalar value.
> 
> As a note, I added a new method, display_text(), for all  
> Bio::AnnotationI classes which by default replicates the same output  
> that stringification overloads produced.  So you should be able to  
> explicitly call $ann->display_text for any Bio::AnnotationI where you  
> once used an implicit call:
> 
> # old
> print "$ann\n";
> 
> # new
> print $ann->display_text,"\n";
> 
> chris
> 
> On Aug 29, 2007, at 11:05 PM, Scott Cain wrote:
> 
> > Hi Chris,
> >
> > Is there a reason that the value method of
> > Bio::Annotation::SimpleValue (and possibly some of its siblings) is
> > returning "Value: $value"?  It didn't use to have the "Value: " prefix
> > before, did it?
> >
> > Thanks,
> > Scott
> >
> >
> > On Wed, 2007-08-29 at 15:13 -0500, Chris Fields wrote:
> >> I'll probably go ahead and start merging this stuff over to CVS HEAD
> >> then.  There haven't been any objections so far.
> >>
> >> The page I posted outlines the more critical fixes, primarily the
> >> changes to Bio::Ontology::Term methods (along with relevant code) due
> >> to inconsistencies in the interface.  The Bio::Annotation classes
> >> also now throw if you attempt to use them in an overloaded context.
> >> I also split off the SeqFeature::Annotated tests into their own test
> >> suite (SeqFeatAnnotated.t).
> >>
> >> Let me know if there are any problems along the way!
> >>
> >> chris
> >>
> >> On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:
> >>
> >>> Hi Chris,
> >>>
> >>> I just wanted to let you know that I was out of town for a few
> >>> days, but
> >>> now I'm back and I'm doing testing of GMOD software based on the
> >>> branch
> >>> you are working on.  I'll let you know how it goes, but don't let me
> >>> stop you if you are confident of your changes.  I'm sure whatever goes
> >>> wrong, it will just point out holes in the FeatureIO tests (I'm sure
> >>> there are plenty) and will require hopefully minimal changes on my
> >>> end.
> >>>
> >>> Thanks for your considerable efforts on this!  (Regardless of how  
> >>> much
> >>> work it makes for me :-)
> >>> Scott
> >>>
> >>>
> >>> On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
> >>>> I'm now wrapping up the Feature/Annotation rollback.  I will probably
> >>>> start merging back to the main branch in the next day or two, as
> >>>> soon as interested parties (*cough*devs*cough*) look over the last
> >>>> batch of changes.
> >>>>
> >>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
> >>>>
> >>>> I have also added a small benchmark test which indicates a decrease
> >>>> in parsing time in SeqIO::genbank with all tests passing.  I expect
> >>>> this will translate over to any Bio::SeqFeature::Generic-using  
> >>>> class
> >>>> (open mouth, prepare to insert foot....).
> >>>>
> >>>> It is also possible there are still some instances where  
> >>>> overloading
> >>>> is expected lurking about in the ~1000 or so modules, so I'll leave
> >>>> the exceptions I added to all Bio::AnnotationI; we can remove them
> >>>> down the line, maybe prior to rel1.6, after more tests are added or
> >>>> if they get particularly annoying.  My guess is I caught 99.99% of
> >>>> them (prepare to insert other foot....).
> >>>>
> >>>> The key change in this last round is the addition of several class
> >>>> *dbxref* methods to Bio::Ontology::Term and
> >>>> Bio::Annotation::OntologyTerm, all of which are capable of working
> >>>> with either DBLink instances or simple scalars.  This was primarily
> >>>> done in order to clear up inconsistencies in the older *dblink*
> >>>> methods, which were ambiguous (some indicated simple scalar
> >>>> arguments, others DBLink objects); operator overloading was used
> >>>> extensively in these cases, which led to several issues.  I have
> >>>> added deprecation warnings to the older methods which now map to
> >>>> using the newer methods.  All tests pass with the exception of a  
> >>>> few
> >>>> already failing on the MAIN branch; the single test which needs  
> >>>> to be
> >>>> fixed is a round-tripping error in swiss.t (now a TODO), which  
> >>>> can be
> >>>> fixed after merging back.
> >>>>
> >>>> Please respond to this if there are any questions or if I need to
> >>>> clarify the changes I made a bit more.
> >>>>
> >>>> chris
> >>> -- 
> >>> ------------------------------------------------------------------------
> >>> Scott Cain, Ph. D.                                       cain at cshl.edu
> >>> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> >>> Cold Spring Harbor Laboratory
> >>
> >> Christopher Fields
> >> Postdoctoral Researcher
> >> Lab of Dr. Robert Switzer
> >> Dept of Biochemistry
> >> University of Illinois Urbana-Champaign
> >>
> >>
> >>
> > -- 
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                   cain.cshl at gmail.com
> > GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> > Cold Spring Harbor Laboratory
> >
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From hlapp at gmx.net  Thu Aug 30 12:07:18 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:07:18 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188488785.2567.93.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
Message-ID: <0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Aug 30, 2007, at 11:46 AM, Scott Cain wrote:

> Good news!  I only had to add a few defineds and a few  
> display_texts and
> I was able to successfully create a database and load the yeast GFF3

Scott - I'm a little worried - what are you using the display_text()  
calls for? There is no method to set a property that would be  
returned here, so you only have control over that if you override the  
method in a custom AnnotationI class.

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFG1us5uV6N2JxL7qsRAicFAKCFCHPORyK9273X8u2/gbaZCNpEHgCeMovA
OtZghop1tET5iMqnwXzL+lk=
=NVrK
-----END PGP SIGNATURE-----

From hlapp at gmx.net  Thu Aug 30 12:10:14 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:10:14 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188484799.2567.84.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188484799.2567.84.camel@localhost.localdomain>
Message-ID: <49824C75-3FA5-4E59-8F99-BC0E974E9652@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Aug 30, 2007, at 10:39 AM, Scott Cain wrote:

> What is the use case for having the as_text method work the way it  
> does?

That's a bit nebulous as I tried to point out the other day. It's  
just a textual representation of the annotation, but you don't really  
have control over what the particular Annotation class considers to  
fulfill that purpose.

So, it's fine to expect a printable meaningful string to be returned,  
but don't try to parse it or rely on exactly what it is going to look  
like.

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFG1uvnuV6N2JxL7qsRAn+dAKC9iLj93El38uv7kjprdZDo0sXC6wCgqwhm
0/tF89/FO1a4CWAf1bahd+8=
=I7SM
-----END PGP SIGNATURE-----

From hlapp at gmx.net  Thu Aug 30 12:20:18 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:20:18 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>
Message-ID: <DF84C537-2860-48E1-9979-E1101C4D5826@gmx.net>


On Aug 29, 2007, at 11:23 PM, Chris Fields wrote:

>> Once this is merged back to the main trunk, I guess we need to give
>> Bio::SeqFeature::TypedSeqFeatureI a thorough look and make sure it
>> makes real sense.
>
> It describes one method, ontology_term(), which returns a  
> Bio::Ontology::TermI.  This is similar to  
> SeqFeature::Annotated::type(), which returns a  
> Bio::Annotation::OntologyTerm (a Bio::Ontology::TermI).  My thought  
> is to simply deprecate type() in favor of  
> TypedSeqFeatureI::ontology_term().

I think we'll want to think about that. type() gives me some  
indication of what the returned value might represent, whereas  
ontology_term() only tells me about the type of the returned object.

You could make ontology_term() accept a context argument, such as

	my $feature_type = $typedFeat->ontology_term(-context => -type);

Or you could name the method(s) more explicitly, such as

	my $feature_type = $typedFeat->type_term();
	my $feature_source = $typedFeat->source_term();
	my @annTerms = $typedFeat->get_Annotations('Gene Ontology');

Am I making sense?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cain.cshl at gmail.com  Thu Aug 30 12:28:47 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 12:28:47 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
Message-ID: <1188491327.2567.101.camel@localhost.localdomain>

Hi Hilmar,

I'm using it as Chris suggested: in places where I had been depending on ""
overloading.  I think in most places, I am using it on
Bio::Annotation::SimpleValue to get the string that is the simple value.
On more complex data types, I am using other methods built into those
classes to extract useful stuff for inserting into the database.

Scott


On Thu, 2007-08-30 at 12:07 -0400, Hilmar Lapp wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> 
> On Aug 30, 2007, at 11:46 AM, Scott Cain wrote:
> 
> > Good news!  I only had to add a few defineds and a few  
> > display_texts and
> > I was able to successfully create a database and load the yeast GFF3
> 
> Scott - I'm a little worried - what are you using the display_text()  
> calls for? There is no method to set a property that would be  
> returned here, so you only have control over that if you override the  
> method in a custom AnnotationI class.
> 
> 	-hilmar
> - --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
> 
> 
> 
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.3 (Darwin)
> 
> iD8DBQFG1us5uV6N2JxL7qsRAicFAKCFCHPORyK9273X8u2/gbaZCNpEHgCeMovA
> OtZghop1tET5iMqnwXzL+lk=
> =NVrK
> -----END PGP SIGNATURE-----
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From hlapp at gmx.net  Thu Aug 30 12:52:14 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:52:14 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188491327.2567.101.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
	<1188491327.2567.101.camel@localhost.localdomain>
Message-ID: <F03155D4-58CB-4C8D-9D52-C49036EB7F45@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Aug 30, 2007, at 12:28 PM, Scott Cain wrote:

> I think in most places, I am using it on
> Bio::Annotation::SimpleValue to get the string that is the simple  
> value.

You should be using $ann->value() for that, unless I'm missing  
something.

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFG1vXCuV6N2JxL7qsRAkcJAKCICRtOSlPLVYYKCbOTvDIf4idb3wCgkxYM
seeaNvSsFY/4bHLGZ9dum2Q=
=E35w
-----END PGP SIGNATURE-----

From cain.cshl at gmail.com  Thu Aug 30 13:16:09 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 13:16:09 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <F03155D4-58CB-4C8D-9D52-C49036EB7F45@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
	<1188491327.2567.101.camel@localhost.localdomain>
	<F03155D4-58CB-4C8D-9D52-C49036EB7F45@gmx.net>
Message-ID: <1188494169.2567.109.camel@localhost.localdomain>

Well, in the instances where I was using it, ->value seems to work
exactly the same, so I changed it to value to be more consistent with
other code I'd written.  I'd used display_text without really thinking
about it.

Thanks,
Scott


On Thu, 2007-08-30 at 12:52 -0400, Hilmar Lapp wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> 
> On Aug 30, 2007, at 12:28 PM, Scott Cain wrote:
> 
> > I think in most places, I am using it on
> > Bio::Annotation::SimpleValue to get the string that is the simple  
> > value.
> 
> You should be using $ann->value() for that, unless I'm missing  
> something.
> 
> 	-hilmar
> - --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
> 
> 
> 
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.3 (Darwin)
> 
> iD8DBQFG1vXCuV6N2JxL7qsRAkcJAKCICRtOSlPLVYYKCbOTvDIf4idb3wCgkxYM
> seeaNvSsFY/4bHLGZ9dum2Q=
> =E35w
> -----END PGP SIGNATURE-----
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cjfields at uiuc.edu  Thu Aug 30 13:27:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 12:27:46 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188491327.2567.101.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
	<1188491327.2567.101.camel@localhost.localdomain>
Message-ID: <6E9B07D0-AB37-4439-AA9D-9268AB5A38C0@uiuc.edu>

display_text() is really a hack for explicitly getting the same
output one would have expected from the stringification overload for any
Bio::AnnotationI (you can also use callbacks on it to customize the output
if needed, but that's not important here).  Whether it is appropriate
depends on what you're trying to accomplish, but it is probably best to
use value() specifically in places where you expect to be dealing only
with Bio::Annotation::SimpleValue.
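
A rough sketch of the distinction, using stand-in classes rather than the
real BioPerl ones. The package names My::SimpleValue and My::DBLink, the
"database:primary_id" display format, and the way the callback is passed
are all assumptions made for illustration; check the actual
Bio::AnnotationI documentation before relying on any of it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a simple tag/value annotation.
package My::SimpleValue;

sub new {
    my ($class, %args) = @_;
    return bless { value => $args{-value} }, $class;
}

# value() returns the bare scalar: the thing to store or compare.
sub value { return $_[0]->{value} }

# display_text() returns something printable; an optional callback
# (an assumption in this sketch) lets the caller change the format.
sub display_text {
    my ($self, $cb) = @_;
    $cb ||= sub { $_[0]->value };
    return $cb->($self);
}

# Hypothetical stand-in for a more complex annotation with no value().
package My::DBLink;

sub new {
    my ($class, %args) = @_;
    return bless { db => $args{-database}, id => $args{-primary_id} }, $class;
}

sub database   { return $_[0]->{db} }
sub primary_id { return $_[0]->{id} }

# The class decides what its text form looks like; callers should
# print it, not parse it.
sub display_text {
    my ($self) = @_;
    return $self->database . ':' . $self->primary_id;
}

package main;

my $sv = My::SimpleValue->new(-value => 'YAL001C');
my $dl = My::DBLink->new(-database => 'SGD', -primary_id => 'S000000001');

print $sv->value, "\n";            # exact data, safe to load into a database
print $_->display_text, "\n" for $sv, $dl;   # printable, class-defined text
```

The point of the sketch: display_text() works on every annotation but its
format belongs to the class, while value() and the DBLink accessors return
the specific pieces you would actually want to insert into a database.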

chris

On Aug 30, 2007, at 11:28 AM, Scott Cain wrote:

> Hi Hilmar,
>
> I'm using it as Chris suggested: where I had been depending on ""
> overloading.  I think in most places, I am using it on
> Bio::Annotation::SimpleValue to get the string that is the simple  
> value.
> On more complex data types, I am using other methods built into those
> classes to extract useful stuff for inserting into the database.
>
> Scott
>
>
>
> On Thu, 2007-08-30 at 12:07 -0400, Hilmar Lapp wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>>
>> On Aug 30, 2007, at 11:46 AM, Scott Cain wrote:
>>
>>> Good news!  I only had to add a few defineds and a few
>>> display_texts and
>>> I was able to successfully create a database and load the yeast GFF3
>>
>> Scott - I'm a little worried - what are you using the display_text()
>> calls for? There is no method to set a property that would be
>> returned here, so you only have control over that if you override the
>> method in a custom AnnotationI class.
>>
>> 	-hilmar
>> - --
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.3 (Darwin)
>>
>> iD8DBQFG1us5uV6N2JxL7qsRAicFAKCFCHPORyK9273X8u2/gbaZCNpEHgCeMovA
>> OtZghop1tET5iMqnwXzL+lk=
>> =NVrK
>> -----END PGP SIGNATURE-----
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug 30 13:45:44 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 12:45:44 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188488785.2567.93.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
Message-ID: <B81A709F-5081-4EB0-8778-2ABEDB02BA86@uiuc.edu>

Sounds good, but I have yet to commit some of the Ontology changes
Hilmar and I discussed (whereupon our brave heroes deprecate the dblinks
methods in favor of dbxrefs).  These should be committed fairly soon
(within an hour or two).

My guess is the change will be fairly transparent, so it shouldn't affect
anything unless you have scripts using those methods directly.
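
For anyone wondering what "fairly transparent" could look like, here is a
minimal sketch of a deprecated method that warns and then forwards to its
replacement. The package My::Term and the method bodies are illustrative
assumptions only; this is not the actual Bio::Ontology::Term code, and the
real method signatures may differ.

```perl
use strict;
use warnings;

package My::Term;    # hypothetical stand-in, not Bio::Ontology::Term

sub new {
    my ($class) = @_;
    return bless { dbxrefs => [] }, $class;
}

# Newer method: stores whatever it is given (plain scalars or objects)
# and returns the number of stored cross-references.
sub add_dbxref {
    my ($self, @refs) = @_;
    push @{ $self->{dbxrefs} }, @refs;
    return scalar @{ $self->{dbxrefs} };
}

sub get_dbxrefs { return @{ $_[0]->{dbxrefs} } }

# Older method: emits a deprecation warning, then delegates to the
# newer method so existing callers keep working.
sub add_dblink {
    my ($self, @refs) = @_;
    warn "add_dblink() is deprecated; use add_dbxref() instead\n";
    return $self->add_dbxref(@refs);
}

package main;

my $term = My::Term->new;
$term->add_dblink('GO:0008150');           # still works, but warns
$term->add_dbxref('InterPro:IPR000001');   # preferred call
print join(', ', $term->get_dbxrefs), "\n";
```

Scripts that call the old method directly would see the warning but get
the same data back, which is why only direct callers should notice.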

chris

On Aug 30, 2007, at 10:46 AM, Scott Cain wrote:

> Hi Chris,
>
> Good news!  I only had to add a few defineds and a few  
> display_texts and
> I was able to successfully create a database and load the yeast GFF3
> file.  While I want to do more testing with GFF from other sources,
> clearly, I am 95% of the way there with relatively little work.
>
> Nice job and Thanks!
> Scott
>
>
> On Wed, 2007-08-29 at 23:17 -0500, Chris Fields wrote:
>> It shouldn't, that sounds like the output for add_text().  value()
>> should just return the scalar value.
>>
>> As a note, I added a new method, display_text(), for all
>> Bio::AnnotationI classes which by default replicates the same output
>> that stringification overloads produced.  So you should be able to
>> explicitly call $ann->display_text for any Bio::AnnotationI where you
>> once used an implicit call:
>>
>> # old
>> print "$ann\n";
>>
>> # new
>> print $ann->display_text,"\n";
>>
>> chris
>>
>> On Aug 29, 2007, at 11:05 PM, Scott Cain wrote:
>>
>>> Hi Chris,
>>>
>>> Is there a reason that the value method of
>>> Bio::Annotation::SimpleValue (and possibly some of its siblings) is
>>> returning "Value: $value"?  It didn't use to have the "Value: " prefix
>>> before, did it?
>>>
>>> Thanks,
>>> Scott
>>>
>>>
>>> On Wed, 2007-08-29 at 15:13 -0500, Chris Fields wrote:
>>>> I'll probably go ahead and start merging this stuff over to CVS  
>>>> HEAD
>>>> then.  There haven't been any objections so far.
>>>>
>>>> The page I posted outlines the more critical fixes, primarily the
>>>> changes to Bio::Ontology::Term methods (along with relevant  
>>>> code) due
>>>> to inconsistencies in the interface.  The Bio::Annotation classes
>>>> also now throw if you attempt to use them in an overloaded context.
>>>> I also split off the SeqFeature::Annotated tests into their own test
>>>> suite (SeqFeatAnnotated.t).
>>>>
>>>> Let me know if there are any problems along the way!
>>>>
>>>> chris
>>>>
>>>> On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:
>>>>
>>>>> Hi Chris,
>>>>>
>>>>> I just wanted to let you know that I was out of town for a few
>>>>> days, but
>>>>> now I'm back and I'm doing testing of GMOD software based on the
>>>>> branch
>>>>> you are working on.  I'll let you know how it goes, but don't  
>>>>> let me
>>>>> stop you if you are confident of your changes.  I'm sure whatever goes
>>>>> wrong, it will just point out holes in the FeatureIO tests (I'm  
>>>>> sure
>>>>> there are plenty) and will require hopefully minimal changes on my
>>>>> end.
>>>>>
>>>>> Thanks for your considerable efforts on this!  (Regardless of how
>>>>> much
>>>>> work it makes for me :-)
>>>>> Scott
>>>>>
>>>>>
>>>>> On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
>>>>>> I'm now wrapping up the Feature/Annotation rollback.  I will probably
>>>>>> start merging back to the main branch in the next day or two, as
>>>>>> soon as interested parties (*cough*devs*cough*) look over the last
>>>>>> batch of changes.
>>>>>>
>>>>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>>>>>>
>>>>>> I have also added a small benchmark test which indicates a  
>>>>>> decrease
>>>>>> in parsing time in SeqIO::genbank with all tests passing.  I  
>>>>>> expect
>>>>>> this will translate over to any Bio::SeqFeature::Generic-using
>>>>>> class
>>>>>> (open mouth, prepare to insert foot....).
>>>>>>
>>>>>> It is also possible there are still some instances where
>>>>>> overloading
>>>>>> is expected lurking about in the ~1000 or so modules, so I'll  
>>>>>> leave
>>>>>> the exceptions I added to all Bio::AnnotationI; we can remove  
>>>>>> them
>>>>>> down the line, maybe prior to rel1.6, after more tests are  
>>>>>> added or
>>>>>> if they get particularly annoying.  My guess is I caught  
>>>>>> 99.99% of
>>>>>> them (prepare to insert other foot....).
>>>>>>
>>>>>> The key change in this last round is the addition of several  
>>>>>> class
>>>>>> *dbxref* methods to Bio::Ontology::Term and
>>>>>> Bio::Annotation::OntologyTerm, all of which are capable of  
>>>>>> working
>>>>>> with either DBLink instances or simple scalars.  This was  
>>>>>> primarily
>>>>>> done in order to clear up inconsistencies in the older *dblink*
>>>>>> methods, which were ambiguous (some indicated simple scalar
>>>>>> arguments, others DBLink objects); operator overloading was used
>>>>>> extensively in these cases, which led to several issues.  I have
>>>>>> added deprecation warnings to the older methods which now map to
>>>>>> using the newer methods.  All tests pass with the exception of a
>>>>>> few
>>>>>> already failing on the MAIN branch; the single test which needs
>>>>>> to be
>>>>>> fixed is a round-tripping error in swiss.t (now a TODO), which
>>>>>> can be
>>>>>> fixed after merging back.
>>>>>>
>>>>>> Please respond to this if there are any questions or if I need to
>>>>>> clarify the changes I made a bit more.
>>>>>>
>>>>>> chris
>>>>> -- 
>>>>> ------------------------------------------------------------------------
>>>>> Scott Cain, Ph. D.                                       cain at cshl.edu
>>>>> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
>>>>> Cold Spring Harbor Laboratory
>>>>
>>>> Christopher Fields
>>>> Postdoctoral Researcher
>>>> Lab of Dr. Robert Switzer
>>>> Dept of Biochemistry
>>>> University of Illinois Urbana-Champaign
>>>>
>>>>
>>>>
>>> -- 
>>> ------------------------------------------------------------------------
>>> Scott Cain, Ph. D.                                   cain.cshl at gmail.com
>>> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
>>> Cold Spring Harbor Laboratory
>>>
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug 30 14:03:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 13:03:29 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <DF84C537-2860-48E1-9979-E1101C4D5826@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>
	<DF84C537-2860-48E1-9979-E1101C4D5826@gmx.net>
Message-ID: <D4E8E9D3-BB64-48C5-8273-5C6C04DC8DE9@uiuc.edu>


On Aug 30, 2007, at 11:20 AM, Hilmar Lapp wrote:

>> ...It describes one method, ontology_term(), which returns a  
>> Bio::Ontology::TermI.  This is similar to  
>> SeqFeature::Annotated::type(), which returns a  
>> Bio::Annotation::OntologyTerm (a Bio::Ontology::TermI).  My  
>> thought is to simply deprecate type() in favor of  
>> TypedSeqFeatureI::ontology_term().
>
> I think we'll want to think about that. type() gives me some  
> indication of what the returned value might represent, whereas  
> ontology_term() only tells me about the type of the returned object.
>
> You could make ontology_term() accept a context argument, such as
>
> 	my $feature_type = $typedFeat->ontology_term(-context => -type);
>
> Or you could name the method(s) more explicitly, such as
>
> 	my $feature_type = $typedFeat->type_term();
> 	my $feature_source = $typedFeat->source_term();
> 	my @annTerms = $typedFeat->get_Annotations('Gene Ontology');
>
> Am I making sense?
>
> 	-hilmar

I think so; I'll have to look at what is returned from type() in some  
more detail.

It appears that the two main culprits for passing strings off to  
Ontology::Term are the Bio::OntologyIO::obo and  
Bio::OntologyIO::dagflat parsers.  I can add some code in there to  
change those to DBLinks prior to creating Ontology::Term instances,  
which should clean that up.

chris

From cjfields at uiuc.edu  Thu Aug 30 20:57:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 19:57:15 -0500
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
	module/package name
In-Reply-To: <46CF27F4.8030608@arcor.de>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>	<46CEAD83.2050904@arcor.de>	<9824900.1187973171940.JavaMail.ngmail@webmail17>	<A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
	<BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>
	<46CF27F4.8030608@arcor.de>
Message-ID: <4ED2E2B0-8E36-4500-A4C9-B8C333E14614@uiuc.edu>


On Aug 24, 2007, at 1:48 PM, marian wrote:

> ...
> Bio::Microarray::Tools::MitoChip would be OK with me. I merely meant
> that it isn't an expression chip, and you also won't/can't analyze
> expression data with the tool I am talking about.
>
> Marian

Okay, I have everything working from bugzilla:

http://bugzilla.open-bio.org/show_bug.cgi?id=2332

I suppose what we need to do next is get a test script going.  I'll  
look at the script attached to see if we can get something going that  
is fairly quick.

chris

From avilella at gmail.com  Fri Aug 31 05:29:43 2007
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 31 Aug 2007 10:29:43 +0100
Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with
	exon boundaries
Message-ID: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com>

Hi,

Probably a bit of a long shot, but does anyone have code for displaying
protein or CDS multiple sequence alignments with the exon boundaries of
each gene in the alignment?

Something in the bioperl world without funky external dependencies. I
think it would be an awesome addition to the howtos.

Currently, the Bio::Graphics howto has cdna to genome mapping scripts or
blast output scripts, but I couldn't find code for dealing with multiple
sequence alignments.

Cheers,

    Albert.

From neetisomaiya at gmail.com  Fri Aug 31 05:41:51 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 31 Aug 2007 15:11:51 +0530
Subject: [Bioperl-l] need help
Message-ID: <764978cf0708310241i1baf6feeoc808c396125c078e@mail.gmail.com>

Hi,

I am trying to parse the compound
(ftp://ftp.genome.jp/pub/kegg/ligand/compound/compound) and glycan
(ftp://ftp.genome.jp/pub/kegg/ligand/glycan/glycan) files of KEGG using
bioperl. I just want the KEGG id of the compound/glycan and its names and
synonyms, if any. Bio::SeqIO is giving me problems: I am not able to fetch
the id and name. Can someone help me with this?

Thanks.

-- 
-Neeti
Even my blood says, B positive

From cjfields at uiuc.edu  Fri Aug 31 10:51:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 31 Aug 2007 09:51:51 -0500
Subject: [Bioperl-l] need help
In-Reply-To: <764978cf0708310241i1baf6feeoc808c396125c078e@mail.gmail.com>
References: <764978cf0708310241i1baf6feeoc808c396125c078e@mail.gmail.com>
Message-ID: <BD54A833-D2D3-4AE5-8517-BB060F3C132E@uiuc.edu>

I don't believe Bio::SeqIO::kegg will parse those files (they aren't  
sequence files).  The format it recognizes is:

http://www.bioperl.org/wiki/KEGG_sequence_format

for the files found in the subdirectories here:

ftp://ftp.genome.ad.jp/pub/kegg/genes/organisms

I would just build a custom parser if all you're interested in is id/ 
names/synonyms.  It'll be much faster.
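
A minimal sketch of such a custom parser, in plain Perl with no BioPerl
dependency. It assumes the usual KEGG flat-file layout: an ENTRY line
carrying the id, a NAME field whose continuation lines are indented and
whose names are separated by ';', and '///' closing each record. The sub
names (parse_kegg_names, clean_name) and the sample records are made up
for illustration, so check the layout against the real compound/glycan
files before relying on it.

```perl
use strict;
use warnings;

# Strip surrounding whitespace and the trailing ';' that separates names.
sub clean_name {
    my ($name) = @_;
    $name =~ s/^\s+//;
    $name =~ s/\s*;?\s*$//;
    return $name;
}

# Read KEGG-style records from a filehandle; return a list of hashrefs
# like { id => 'C00001', names => [ 'H2O', 'Water' ] }.
sub parse_kegg_names {
    my ($fh) = @_;
    my (@records, $id, @names);
    my $in_name = 0;    # true while reading NAME continuation lines

    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ m{^///}) {                        # end of record
            push @records, { id => $id, names => [@names] } if defined $id;
            undef $id;
            @names   = ();
            $in_name = 0;
        }
        elsif ($line =~ /^ENTRY\s+(\S+)/) {            # id is the first token
            $id      = $1;
            $in_name = 0;
        }
        elsif ($line =~ /^NAME\s+(.+)/) {              # first name
            push @names, clean_name($1);
            $in_name = 1;
        }
        elsif ($in_name && $line =~ /^\s+(\S.*)/) {    # further names
            push @names, clean_name($1);
        }
        else {                                         # any other field
            $in_name = 0;
        }
    }
    # Keep a final record that lacks the closing '///'.
    push @records, { id => $id, names => [@names] } if defined $id;
    return @records;
}

# Hypothetical sample in the assumed layout.
my $sample = <<'END';
ENTRY       C00001                      Compound
NAME        H2O;
            Water
FORMULA     H2O
///
ENTRY       C00002                      Compound
NAME        ATP;
            Adenosine 5'-triphosphate
FORMULA     C10H16N5O13P3
///
END

open my $fh, '<', \$sample or die "Cannot open sample: $!";
for my $rec (parse_kegg_names($fh)) {
    print join("\t", $rec->{id}, join('; ', @{ $rec->{names} })), "\n";
}
close $fh;
```

To run it on a downloaded file, open that file instead of the in-memory
sample and pass the filehandle to parse_kegg_names().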

chris

On Aug 31, 2007, at 4:41 AM, neeti somaiya wrote:

> Hi,
>
> I am trying to parse the compound (
> ftp://ftp.genome.jp/pub/kegg/ligand/compound/compound) and glycan (
> ftp://ftp.genome.jp/pub/kegg/ligand/glycan/glycan) files of KEGG using
> bioperl.
> I just want the kegg id of the compound/glycan and its names and  
> synonyms if
> any.
> Bio::SeqIO is giving some problem, I am not able to fetch the id  
> and name.
> Can someone help me with this.
>
> Thanks.
>
> -- 
> -Neeti
> Even my blood says, B positive
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From shameer at ncbs.res.in  Wed Aug  1 01:45:45 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Wed, 1 Aug 2007 11:15:45 +0530 (IST)
Subject: [Bioperl-l] Perl 3D OpenGL
In-Reply-To: <04BCAD9E-CC25-4F0A-85B1-FBA91C64CE7D@uiuc.edu>
References: <152401c7d224$8e2455b0$6e4e7c0a@HPONE>
	<25A5F0A3-1CC3-46B5-8976-A24C451204E7@jays.net>
	<04BCAD9E-CC25-4F0A-85B1-FBA91C64CE7D@uiuc.edu>
Message-ID: <49637.192.168.1.1.1185947145.squirrel@mail.ncbs.res.in>

Hi,
Open-GL/3D contributions are always welcome!!!
What about a Perl-OpenGL/3D implementation of a web-based 3D viewer like Jmol?

 http://jmol.sourceforge.net/

(So we don't need to worry about Java installation and such :) Develop it
and deploy it in Perl - eternal happiness!!!)
-- 
SK
>
> On Jul 31, 2007, at 7:00 AM, Jay Hannah wrote:
>
>> On Jul 29, 2007, at 4:08 PM, Grafman Productions wrote:
>>> If this posting is inappropriate, please let me know - my apologies.
>>
>> Not at all. AFAIK this is the perfect place to discuss any
>> contributions you're motivated to make to the BioPerl project.
>>
>>> I recently came across an article on BioPerl, and it occurred to me
>>> that
>>> there might be some need for 3D rendering within your BioPerl
>>> project.
>>>
>>> I released a number of new/updated Perl OpenGL (POGL) modules this
>>> year,
>>> along with benchmarks that demonstrate that it performs comparably
>>> to C.
>>>
>>> If there's a need for 3D features within BioPerl, and if I can be
>>> of any
>>> assistance in helping to add such features, I would enjoy the
>>> opportunity.
>>
>> I know nothing about 3D modeling in biology, nor do I hang out with
>> any protein structure folks, but 3D always sounds sexy. -grin-
>>
>> If you're new to bioinformatics (I certainly am) you might want to
>> read this:
>>
>>    http://en.wikipedia.org/wiki/Protein_structure
>>
>> Because that's probably where your 3D work would be used. Especially
>> note the "Software" section, where you'll find some of the
>> "competition".  :)
>>
>> There's some cool stuff out there. I don't know what all would or
>> wouldn't be time well spent in Perl / BioPerl.
>>
>> HTH,
>>
>> Jay Hannah
>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah
>
> I agree that protein structure is the best place for something like
> this.
>
> It's a wide open area as far as I'm concerned; in fact I would say
> that Bio::Structure is getting pretty dated, so if anyone wants to
> take it over, refactor the code, and so on I don't have a problem.
>
> chris


-- 
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From Alicia.Amadoz at uv.es  Wed Aug  1 03:13:11 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Wed, 1 Aug 2007 09:13:11 +0200 (CEST)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
Message-ID: <1664224328amadoz@uv.es>

Hi, I would like to save my hit sequences from a blast result in a fasta
file. I am trying some things but I have problems using Bio::SearchIO
and Bio::SeqIO. I hope someone can help me with this. Here is my current
code:

# my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
"fasta");
my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
         my $hseq = $hsp->hit_string();
         # $seq_out->write_seq($hseq);
         $seq_out->write_result($hseq);
      }
   }
}

Here the error is,

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: ResultWriter not defined.

I couldn't find any kind of documentation about ResultWriter.
Thanks in advance,
Alicia


From xianranli78 at yahoo.com.cn  Wed Aug  1 04:11:53 2007
From: xianranli78 at yahoo.com.cn (Xianran Li)
Date: Wed, 1 Aug 2007 16:11:53 +0800
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
References: <1664224328amadoz@uv.es>
Message-ID: <001101c7d413$a0d79aa0$ed07a8c0@BGI.LOCAL>

$hsp->hit_string() returns the hit sequence as a plain string, rather than a Bio::Seq object. So you should first construct a Bio::Seq object; then you can use $seq_out->write_seq($hseq_obj) to write the sequence to a fasta file.

my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" => "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
         my $hseq = $hsp->hit_string();
         $hseq =~ s/-//g; #### remove the gaps within the alignment
         # $id must be set beforehand; write_seq() expects a Bio::Seq object
         my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
         $seq_out->write_seq($hseq_obj);
      }
   }
}

Xianran
----- Original Message ----- 
From: "Alicia Amadoz" <Alicia.Amadoz at uv.es>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, August 01, 2007 3:13 PM
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file


> Hi, I would like to save my hit sequences from a blast result in a fasta
> file. I am trying some things but I have problems using Bio::SearchIO
> and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> code:
> 
> # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> "fasta");
> my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> => "fasta");
> while(my $result = $blast_report->next_result()) {
>    while(my $hit = $result->next_hit()) {
>       while(my $hsp = $hit->next_hsp()) {
>          my $hseq = $hsp->hit_string();
>          # $seq_out->write_seq($hseq);
>          $seq_out->write_result($hseq);
>       }
>    }
> }
> 
> Here the error is,
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: ResultWriter not defined.
> 
> I couldn't find any kind of documentation about ResultWriter.
> Thanks in advance,
> Alicia


From Alicia.Amadoz at uv.es  Wed Aug  1 06:25:29 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Wed, 1 Aug 2007 12:25:29 +0200 (CEST)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
Message-ID: <5927683277amadoz@uv.es>

Hi, I have tried what you suggested and I also get some errors.
With this code,

my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
	my $hseq = $hsp->hit_string(); 
        $hseq =~ s/-//g; #### remove the gap within the aligment
        my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 
        $seq_out->write_seq($hseq_obj);
      }
   }				
}

I have the following error:

Can't locate object method "write_seq" via package "Bio::SearchIO::fasta"

And using the write_result method with this code,

my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
	my $hseq = $hsp->hit_string(); 
        $hseq =~ s/-//g; #### remove the gap within the aligment
        my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 
        $seq_out->write_result($hseq_obj);
      }
   }				
}

I have again this kind of error:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: ResultWriter not defined.
STACK: Error::throw

So, what else can I try?? Thanks in advance,
Alicia


From neetisomaiya at gmail.com  Wed Aug  1 07:28:40 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Wed, 1 Aug 2007 16:58:40 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
Message-ID: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>

I have downloaded the omim.txt file from the NCBI ftp site and I am running
my attached parser on this file; the parser stops partway through with this:

------------- EXCEPTION  -------------
MSG: a part/organism must be assigned
STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
STACK toplevel parse_omim_original.pl:47

--------------------------------------

What is the reason for this?
Can anyone guide me please.

-- 
-Neeti
Even my blood says, B positive



From jay at jays.net  Wed Aug  1 09:30:50 2007
From: jay at jays.net (Jay Hannah)
Date: Wed, 1 Aug 2007 09:30:50 -0400 (EDT)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <5927683277amadoz@uv.es>
References: <5927683277amadoz@uv.es>
Message-ID: <Pine.LNX.4.64.0708010926370.3555@ferret.jays.net>

On Wed, 1 Aug 2007, Alicia Amadoz wrote:
> Hi, I have tried what you suggested and I get also some errors.
> With this code,
>
> my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> => "fasta");
> while(my $result = $blast_report->next_result()) {
>   while(my $hit = $result->next_hit()) {
>      while(my $hsp = $hit->next_hsp()) {
> 	my $hseq = $hsp->hit_string();
>        $hseq =~ s/-//g; #### remove the gap within the aligment
>        my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
>        $seq_out->write_seq($hseq_obj);
>      }
>   }
> }
>
> I have the following error:
>
> Can't locate object method "write_seq" via package "Bio::SearchIO::fasta"

You don't want to write_seq() to a SearchIO, you want to write_seq() to a 
SeqIO. Try this:

my $seq_out = Bio::SeqIO->new(-file => ">$fasfilename", -format => "fasta");
while(my $result = $blast_report->next_result()) {
    while(my $hit = $result->next_hit()) {
       while(my $hsp = $hit->next_hsp()) {
          my $hseq = $hsp->hit_string();
          $hseq =~ s/-//g; #### remove the gaps within the alignment
          my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
          $seq_out->write_seq($hseq_obj);
       }
    }
}

(Untested.)

HTH,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From cjfields at uiuc.edu  Wed Aug  1 11:02:07 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 1 Aug 2007 10:02:07 -0500
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
Message-ID: <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>

Neeti,

Only post to one list email address, namely the one I'm responding to  
and the one shown here:

http://bioperl.org/mailman/listinfo/bioperl-l

The others are aliases so you essentially posted three times.  As for  
your question: there was no attached script or any additional  
information (bioperl version would have also been nice), so we can't  
help you until we have something more to work with.

chris

On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:

> I have downloaded the omim.txt file from NCBI ftp site and I am  
> running my
> attached parser on this file, the parser run stops in between with  
> this :-
>
> ------------- EXCEPTION  -------------
> MSG: a part/organism must be assigned
> STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> STACK toplevel parse_omim_original.pl:47
>
> --------------------------------------
>
> What is the reason for this?
> Can anyone guide me please.
>
> -- 
> -Neeti
> Even my blood says, B positive

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From torsten.seemann at infotech.monash.edu.au  Wed Aug  1 20:50:06 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Thu, 2 Aug 2007 10:50:06 +1000
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <1664224328amadoz@uv.es>
References: <1664224328amadoz@uv.es>
Message-ID: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>

Alicia,

> Hi, I would like to save my hit sequences from a blast result in a fasta
> file. I am trying some things but I have problems using Bio::SearchIO
> and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> code:
> # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> "fasta");
> my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> => "fasta");
> ...
>        my $hseq = $hsp->hit_string();
>          # $seq_out->write_seq($hseq);
>          $seq_out->write_result($hseq);

You have encountered two common problems for BioPerl beginners:

1. "fasta" means two different things! In SearchIO it refers to the
output format of the "fasta" sequence alignment software. In SeqIO it
refers to a file format that stores just sequences. Confusing, I know.
You need SeqIO and write_seq, not SearchIO and write_result.

2. $hseq is a STRING which has the raw sequence letters in it.
However, the write_seq() method needs a Bio::Seq object (which has
extra details like the name and ID) not a raw string.

The example code Jay Hannah supplied in his reply looks pretty good,
you should try it.

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University


From Alicia.Amadoz at uv.es  Thu Aug  2 03:06:54 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Thu, 2 Aug 2007 09:06:54 +0200 (CEST)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
Message-ID: <3579584634amadoz@uv.es>

Hi, thanks for your help and suggestions. I have tried Jay Hannah's
example code and it works perfectly. But what I need to save in fasta
format is the whole database sequence that is similar to my query
sequence. I don't fully understand the difference between hit_string()
and query_string(): is hit_string() the whole similar sequence, a part
of it, or just the part that is aligned to my query string?

With the previous code I get sequences of different lengths, all
with the same id as my query string, so I am not sure that I am doing
what I need to do. Can anyone shed some light on this?

Thank you very much for your help.
Alicia

> Alicia,
> 
> > Hi, I would like to save my hit sequences from a blast result in a fasta
> > file. I am trying some things but I have problems using Bio::SearchIO
> > and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> > code:
> > # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> > "fasta");
> > my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> > => "fasta");
> > ...
> >        my $hseq = $hsp->hit_string();
> >          # $seq_out->write_seq($hseq);
> >          $seq_out->write_result($hseq);
> 
> You have encountered two common problems for BioPerl beginners:
> 
> 1. "fasta" means two different things! In SearchIO it refers to the
> output format of the "fasta" sequence alignment software. In SeqIO it
> refers to a file format that stores just sequences. Confusing, I know.
> You need SeqIO and write_seq, not SearchIO and write_result.
> 
> 2. $hseq is a STRING which has the raw sequence letters in it.
> However, the write_seq() method needs a Bio::Seq object (which has
> extra details like the name and ID) not a raw string.
> 
> The example code Jay Hannah supplied in his reply looks pretty good,
> you should try it.
> 
> -- 
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> 
> 


From xianranli78 at yahoo.com.cn  Thu Aug  2 04:56:04 2007
From: xianranli78 at yahoo.com.cn (Xianran Li)
Date: Thu, 2 Aug 2007 16:56:04 +0800
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
	<3579584634amadoz@uv.es>
Message-ID: <003701c7d4e2$f7a34bc0$ed07a8c0@BGI.LOCAL>

----- Original Message ----- 
From: "Alicia Amadoz" <Alicia.Amadoz at uv.es>
To: "Torsten Seemann" <torsten.seemann at infotech.monash.edu.au>; <bioperl-l at bioperl.org>
Cc: <jay at jays.net>
Sent: Thursday, August 02, 2007 3:06 PM
Subject: Re: [Bioperl-l] trying to save blast hit sequences to fasta file


> Hi, thanks for your help and suggestions. I have tried the example code
> of Jay Hannah and it works perfectly. But what I need to save in fasta
> format is the whole sequence in the database that is similar to my query
> sequence. I don't understand very well the difference between
> hit_string() and query_string(), are they the whole sequence that is
> similiar (about hit_string), a part of the whole sequence or just the
> part that is aligned to my query string? 

hit_string() returns the aligned portion of the subject (database) sequence for that HSP, and query_string() returns the aligned portion of the query. Neither is the whole sequence. The two strings will be the same unless there are mismatches and/or gaps within the alignment.
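
To make the difference concrete, here is a small sketch in plain Perl. It
uses two made-up aligned strings of the kind hit_string() and
query_string() return; the sequences and the sub name compare_aligned are
invented for illustration and are not part of BioPerl. It counts identical
columns and gap columns, then strips the gaps from the hit string as the
code in this thread does.

```perl
use strict;
use warnings;

# Compare two aligned strings of equal length, column by column.
# Returns (identical columns, columns with a gap in either string).
sub compare_aligned {
    my ($query, $hit) = @_;
    die "aligned strings must have equal length\n"
        unless length($query) == length($hit);
    my ($identical, $gaps) = (0, 0);
    for my $i (0 .. length($query) - 1) {
        my $q = substr($query, $i, 1);
        my $h = substr($hit,   $i, 1);
        if ($q eq '-' || $h eq '-') {
            $gaps++;
        }
        elsif (uc($q) eq uc($h)) {
            $identical++;
        }
    }
    return ($identical, $gaps);
}

# Made-up stand-ins for $hsp->query_string() and $hsp->hit_string().
my $query_string = 'ACGT-ACGTTA';
my $hit_string   = 'ACGTTACG-TA';

my ($identical, $gaps) = compare_aligned($query_string, $hit_string);

# Remove gap characters to get the plain aligned segment of the hit.
(my $ungapped_hit = $hit_string) =~ s/-//g;

print "identical columns: $identical, gap columns: $gaps\n";
print "ungapped hit segment: $ungapped_hit\n";
```

The ungapped string is still only the aligned segment for that HSP, which
is why the sequences written out differ in length from one HSP to the next.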

> 
> With the previous code what I have are different sequences in length
> with the same id as my query string, so I am not sure that I am doing
> what I need to do. Any light on this point?

Did you set $id before this line?

my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);

If $id is not given a new value for each HSP, all the sequences written will get the same id. The following is a simple way to avoid this problem.

my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" => "fasta");
my $i = 0;
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
         $i++;
         my $hseq = $hsp->hit_string();
         $hseq =~ s/-//g; #### remove the gaps within the alignment
         my $id = $i; ###### specify a unique id for each HSP
         my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
         # write_seq() is the Bio::SeqIO method; write_result() belongs to Bio::SearchIO
         $seq_out->write_seq($hseq_obj);
      }
   }
}


Xianran 

> 
> Thank you very much for your help.
> Alicia
> 
> > Alicia,
> > 
> > > Hi, I would like to save my hit sequences from a blast result in a fasta
> > > file. I am trying some things but I have problems using Bio::SearchIO
> > > and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> > > code:
> > > # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> > > "fasta");
> > > my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> > > => "fasta");
> > > ...
> > >        my $hseq = $hsp->hit_string();
> > >          # $seq_out->write_seq($hseq);
> > >          $seq_out->write_result($hseq);
> > 
> > You have encountered two common problems for BioPerl beginners:
> > 
> > 1. "fasta" means two different things! In SearchIO it refers to the
> > output format of the "fasta" sequence alignment software. In SeqIO it
> > refers to a file format that stores just sequences. Confusing, I know.
> > You need SeqIO and write_seq, not SearchIO and write_result.
> > 
> > 2. $hseq is a STRING which has the raw sequence letters in it.
> > However, the write_seq() method needs a Bio::Seq object (which has
> > extra details like the name and ID) not a raw string.
> > 
> > The example code Jay Hannah supplied in his reply looks pretty good,
> > you should try it.
> > 
> > -- 
> > --Torsten Seemann
> > --Victorian Bioinformatics Consortium, Monash University
> > 
> > 


From neetisomaiya at gmail.com  Thu Aug  2 02:20:33 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 2 Aug 2007 11:50:33 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
Message-ID: <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>

Hi,

The script is attached with this mail.
I am using bioperl-1.4.

Regards,
Neeti.

On 8/1/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> Neeti,
>
> Only post to one list email address, namely the one I'm responding to
> and the one shown here:
>
> http://bioperl.org/mailman/listinfo/bioperl-l
>
> The others are aliases so you essentially posted three times.  As for
> your question: there was no attached script or any additional
> information (bioperl version would have also been nice), so we can't
> help you until we have something more to work with.
>
> chris
>
> On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:
>
> > I have downloaded the omim.txt file from NCBI ftp site and I am
> > running my
> > attached parser on this file, the parser run stops in between with
> > this :-
> >
> > ------------- EXCEPTION  -------------
> > MSG: a part/organism must be assigned
> > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> > STACK toplevel parse_omim_original.pl:47
> >
> > --------------------------------------
> >
> > What is the reason for this?
> > Can anyone guide me please.
> >
> > --
> > -Neeti
> > Even my blood says, B positive
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


-- 
-Neeti
Even my blood says, B positive
-------------- next part --------------
A non-text attachment was scrubbed...
Name: parse_omim_original.pl
Type: application/x-perl
Size: 5998 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070802/fbbee8db/attachment-0002.bin>

From neetisomaiya at gmail.com  Thu Aug  2 09:00:33 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 2 Aug 2007 18:30:33 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
Message-ID: <764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com>

Also,
According to the following documentation, we can fetch data from the
genemap file as well:
http://search.cpan.org/~birney/bioperl-1.2.3/Bio/Phenotype/OMIM/OMIMparser.pm

But when I try to do so exactly as described at the above link, I get no
genemap data. There are OMIM ids that are present in both the omim.txt
and genemap files, and when I parse such entries, data from both files
should be returned, but I am not getting it.

For example, when running the attached script for OMIM id 100790, I get
all the data from omim.txt, but the cytoposition, gene symbol etc. from
genemap do not come through, though they are present in the genemap file.

Please help me find what could be going wrong.

On 8/2/07, neeti somaiya <neetisomaiya at gmail.com> wrote:
>
> Hi,
>
> The script is attached with this mail.
> I am using bioperl-1.4.
>
> Regards,
> Neeti.
>
> On 8/1/07, Chris Fields < cjfields at uiuc.edu> wrote:
> >
> > Neeti,
> >
> > Only post to one list email address, namely the one I'm responding to
> > and the one shown here:
> >
> > http://bioperl.org/mailman/listinfo/bioperl-l
> >
> > The others are aliases so you essentially posted three times.  As for
> > your question: there was no attached script or any additional
> > information (bioperl version would have also been nice), so we can't
> > help you until we have something more to work with.
> >
> > chris
> >
> > On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:
> >
> > > I have downloaded the omim.txt file from NCBI ftp site and I am
> > > running my
> > > attached parser on this file, the parser run stops in between with
> > > this :-
> > >
> > > ------------- EXCEPTION  -------------
> > > MSG: a part/organism must be assigned
> > > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> > > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> > > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> > > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> > > STACK toplevel parse_omim_original.pl:47
> > >
> > > --------------------------------------
> > >
> > > What is the reason for this?
> > > Can anyone guide me please.
> > >
> > > --
> > > -Neeti
> > > Even my blood says, B positive
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> >
>
>
> --
> -Neeti
> Even my blood says, B positive
>
>


-- 
-Neeti
Even my blood says, B positive
-------------- next part --------------
A non-text attachment was scrubbed...
Name: parse_omim_original.pl
Type: application/x-perl
Size: 8750 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070802/6bdb009c/attachment-0002.bin>

From cjfields at uiuc.edu  Thu Aug  2 13:05:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 2 Aug 2007 12:05:55 -0500
Subject: [Bioperl-l] Fwd: nonstop repeated output from Remote_blast with xml
References: <38B65B2C-A36D-41FB-83C9-7D7B55156CCD@uiuc.edu>
Message-ID: <EF284983-9A37-4F0F-BF92-04C7804275A0@uiuc.edu>

For archiving purposes; of course I forgot to cc the list!

-c

Begin forwarded message:

> From: Chris Fields <cjfields at uiuc.edu>
> Date: August 2, 2007 12:04:59 PM CDT
> To: gyang at plantbio.uga.edu
> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
> with xml
>
> Guojun,
>
> Make sure to keep this on the mail list for archiving purposes.
>
> It could be that the RID is not being removed properly (if it isn't  
> removed then you will repeatedly retrieve your BLAST report).  The  
> new error you are seeing may be coming from whatever XML::SAX  
> backend parser is being used (XML::SAX::ExpatXS, XML::SAX::Expat,  
> etc); it doesn't look bioperl-related and there is an eval which  
> catches this stuff in SearchIO::blastxml.  Does text parsing work?
>
> Could you directly send me your script or add it to a new bug  
> report as an attachment?
>
> http://www.bioperl.org/wiki/Bugs
>
> chris
>
> On Aug 2, 2007, at 11:07 AM, Guojun Yang wrote:
>
>> Hi,Chris,
>> I installed the latest version of bioperl, in addition to the  
>> repeated output problem, there are new problems with parsing:
>>
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>  No close tag marker [Ln: 4126, Col: 0]
>>
>> ---------------------------------------------------
>>
>> Would you please kindly give me a hint on this,
>> Thanks a lot,
>> Guojun
>>
>>
>> ----- Original Message -----
>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>> To: gyang at plantbio.uga.edu
>> Cc: bioperl-l List [mailto:bioperl-l at lists.open-bio.org]
>> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
>> with xml
>>
>>
>>> Make sure to keep responses on the mail list.
>>>> You might want to run a full install, just in case.  If I remember
>>> correctly Sendu made some changes a while back in the BLAST-related
>>> modules which may be related to this.  At the very least install/
>>> upgrade all modules in Bio::Tools::Run.
>>>> chris
>>>> On Jul 31, 2007, at 9:40 AM, Guojun Yang wrote:
>>>>> Thanks, Chris,
>>>> But when I replaced the old RemoteBlast.pm with the new one, I got
>>>> "can't locate the object method "retrieve_parameter"". Does this
>>>> mean I need to install something else?
>>>> Guojun
>>>>
>>>> ----- Original Message -----
>>>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>>>> To: gyang at plantbio.uga.edu
>>>> Cc: bioperl-l at bioperl.org
>>>> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast
>>>> with xml
>>>>
>>>>
>>>>>> On Jul 30, 2007, at 3:58 PM, Guojun Yang wrote:
>>>>>>> I am running remoteblast and using readmethod "xml", I  
>>>>>>> noticed that
>>>>>> it is printing the output repeatedly nonstop. It's like in a  
>>>>>> loop.
>>>>>> Did anybody notice this before? Can anybody help me getting  
>>>>>> out of
>>>>>> this?
>>>>>> Thanks a lot,
>>>>>>
>>>>>>
>>>>>> Guojun Yang
>>>>>> University of Georgia
>>>>>> Not seeing that using bioperl-live; you may need to update
>>>>> RemoteBlast.pm as this sounds similar to an issue that popped up
>>>>> earlier in the spring.
>>>>>> chris
>>>>>
>>>> Christopher Fields
>>> Postdoctoral Researcher
>>> Lab of Dr. Robert Switzer
>>> Dept of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>>>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug  2 13:51:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 2 Aug 2007 12:51:27 -0500
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
	<764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com>
Message-ID: <921F31D6-3CA9-483A-8AFF-B3555E9768C4@uiuc.edu>

Neeti,

The genemap wasn't loaded in all cases; don't know what the reasoning  
for it was, but it is fixed in CVS now  
(Bio::Phenotype::OMIM::OMIMparser, specifically).  I would recommend  
that you install a full upgrade to at least bioperl 1.5.2 before  
using this; I can't guarantee it will work with bioperl 1.4.

chris

On Aug 2, 2007, at 8:00 AM, neeti somaiya wrote:

> Also,
> According to the following documentation, we can fetch data from the
> genemap file as well:
> http://search.cpan.org/~birney/bioperl-1.2.3/Bio/Phenotype/OMIM/OMIMparser.pm
>
> But when I try to do so in exactly the manner given at the above link,
> I get no data. There are OMIM ids which are present in both the
> omim.txt and genemap files, and for such cases data from both files
> should be obtained when I parse and fetch, but I am not getting it.
>
> For example, while running the attached script for OMIM id 100790, I
> get all the data from omim.txt, but the cytoposition, gene symbol etc.
> from genemap are not returned, though they are present in the genemap
> file.
>
> Please help me find what could be going wrong.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug  2 14:16:56 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 2 Aug 2007 13:16:56 -0500
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
	<764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com>
Message-ID: <9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu>

Neeti,

Keep this on the list please.  I am unable to reproduce this using  
your script with or without using the optional genemap file.  You  
really should upgrade bioperl to 1.5.2 and try the fix first; this is  
something that may have been fixed post-bioperl 1.4.

chris

On Aug 2, 2007, at 12:57 PM, neeti somaiya wrote:

> Waiting for your reply on the exception I had mentioned in my first  
> mail.
>
> Thanks.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From torsten.seemann at infotech.monash.edu.au  Thu Aug  2 21:03:36 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Fri, 3 Aug 2007 11:03:36 +1000
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <3579584634amadoz@uv.es>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
	<3579584634amadoz@uv.es>
Message-ID: <a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>

Alicia,

> Hi, thanks for your help and suggestions. I have tried the example code
> of Jay Hannah and it works perfectly. But what I need to save in fasta
> format is the whole sequence in the database that is similar to my query
> sequence.

Unfortunately the hit_string is only that part of the sequence in the
database that was similar enough to your query sequence. The BLAST
report does not have the whole hit sequence in it, only the locally
aligned part. SearchIO can only give you what it can get from the
BLAST report.

You will need to record the IDs of the database sequences you are
interested in, and write extra code to retrieve the WHOLE hit sequence
from your database.

--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University


From neetisomaiya at gmail.com  Fri Aug  3 01:46:32 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 3 Aug 2007 11:16:32 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
	<764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com>
	<9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu>
Message-ID: <764978cf0708022246v98abed6ue41233f6b27c5674@mail.gmail.com>

Hi,

Thanks a lot.
The exception no longer occurs after upgrading to bioperl-1.5.2,
but the genemap data is still a problem.

You had mentioned that I should take Bio::Phenotype::OMIM::OMIMparser,
specifically from cvs. Where exactly can I get it?

Thanks,
Neeti.

On 8/2/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> Neeti,
>
> Keep this on the list please.  I am unable to reproduce this using
> your script with or without using the optional genemap file.  You
> really should upgrade bioperl to 1.5.2 and try the fix first; this is
> something that may have been fixed post-bioperl 1.4.
>
> chris


-- 
-Neeti
Even my blood says, B positive


From jay at jays.net  Fri Aug  3 10:23:11 2007
From: jay at jays.net (Jay Hannah)
Date: Fri, 03 Aug 2007 09:23:11 -0500
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>	<3579584634amadoz@uv.es>
	<a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>
Message-ID: <46B33A4F.2010403@jays.net>

Torsten Seemann wrote:
>> Hi, thanks for your help and suggestions. I have tried the example code
>> of Jay Hannah and it works perfectly. But what I need to save in fasta
>> format is the whole sequence in the database that is similar to my query
>> sequence.
>>     
>
> Unfortunately the hit_string is only that part of the sequence in the
> database that was similar enough to your query sequence. The BLAST
> report does not have the whole hit sequence in it, only the locally
> aligned part. SearchIO can only give you what it can get from the
> BLAST report.
>
> You will need to record the IDs of the database sequences you are
> interested in, and write extra code to retrieve the WHOLE hit sequence
> from your database.
>   
This probably won't help, but my (extremely poorly documented) 
"SeqLab.net" project

   http://seqlab.net

is a framework that sits on top of BioPerl. The current cross_blast() 
stuff (http://seqlab.net/pods2html/tutorial.html) does this:

   GenBank -> FASTA -> formatdb -> "stand alone" NCBI BLAST -> reports

When the reports run they have simultaneous access to both the original 
Bio::Seq objects from the GenBank file and the Bio::SearchIO objects 
from the BLAST results, so it can kick out reports that understand the 
relationships between (and details of) the original sequences and HSPs 
simultaneously...

If you get stuck trying to do what Torsten suggests and have questions 
about SeqLab.net you could open a ticket with my group

   http://clab.ist.unomaha.edu/CLAB/index.php/RT

and I'll try to help.

Cheers,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From mbasu at mail.nih.gov  Fri Aug  3 14:55:57 2007
From: mbasu at mail.nih.gov (Malay)
Date: Fri, 03 Aug 2007 14:55:57 -0400
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <46B33A4F.2010403@jays.net>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>	<3579584634amadoz@uv.es>	<a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>
	<46B33A4F.2010403@jays.net>
Message-ID: <46B37A3D.4070606@mail.nih.gov>

Jay Hannah wrote:
> Torsten Seemann wrote:
>>> Hi, thanks for your help and suggestions. I have tried the example code
>>> of Jay Hannah and it works perfectly. But what I need to save in fasta
>>> format is the whole sequence in the database that is similar to my query
>>> sequence.
>>>     
>> Unfortunately the hit_string is only that part of the sequence in the
>> database that was similar enough to your query sequence. The BLAST
>> report does not have the whole hit sequence in it, only the locally
>> aligned part. SearchIO can only give you what it can get from the
>> BLAST report.
>>
>> You will need to record the IDs of the database sequences you are
>> interested in, and write extra code to retrieve the WHOLE hit sequence
>> from your database.

I am not sure whether it has already been suggested or not, but you can 
retrieve the full sequence from any blast database using "fastacmd", 
which is part of the NCBI toolbox. Parse the "description" string from 
the BLAST report and run:

fastacmd -d <database file> -s <description>

where, the argument of -s can be any unique string for the database.
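
The two steps above (pull the identifiers out of the BLAST report, then
call fastacmd once per identifier) could be sketched roughly as follows.
This is only an illustration in plain Perl, not BioPerl code: it assumes
a plain-text NCBI report in which every hit description line starts with
">", and the report fragment, the database name "mydb" and both
subroutine names are made up for the example. Which form of identifier
fastacmd will accept may depend on how the database was formatted.

```perl
use strict;
use warnings;

# Collect the first token of every ">" hit line in a plain-text BLAST report.
# Assumes NCBI text output, where each hit description starts with ">".
sub hit_ids_from_report {
    my ($report_text) = @_;
    my (%seen, @ids);
    for my $line (split /\n/, $report_text) {
        next unless $line =~ /^>\s*(\S+)/;
        push @ids, $1 unless $seen{$1}++;    # keep each ID once, in order
    }
    return @ids;
}

# Build one fastacmd argument list per ID (list form avoids shell quoting).
sub fastacmd_commands {
    my ($database, @ids) = @_;
    return map { [ 'fastacmd', '-d', $database, '-s', $_ ] } @ids;
}

# Hypothetical report fragment and database name, for illustration only.
my $report = join "\n",
    '>gi|111|ref|NP_000001.1| example protein one',
    '          Length = 120',
    '>gi|222|ref|NP_000002.1| example protein two',
    '>gi|111|ref|NP_000001.1| example protein one',
    '';

my @ids  = hit_ids_from_report($report);
my @cmds = fastacmd_commands('mydb', @ids);

# Print the commands instead of running them.
print join(' ', @$_), "\n" for @cmds;
```

To actually fetch the sequences, each command could be run with
system(@$cmd) with fastacmd on the PATH and its output redirected to a
FASTA file.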

-Malay


From cjfields at uiuc.edu  Mon Aug  6 13:49:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 6 Aug 2007 12:49:08 -0500
Subject: [Bioperl-l] Fwd: nonstop repeated output from Remote_blast with xml
References: <1FE846F1-CB20-41FD-929D-8D14E5695B59@uiuc.edu>
Message-ID: <B97BD1F9-05FE-4225-810F-5EA10AB2728B@uiuc.edu>

Wasn't paying attention! Forwarding this to the mail list in case  
anyone wanted the answer...

chris

Begin forwarded message:

> From: Chris Fields <cjfields at uiuc.edu>
> Date: August 6, 2007 12:10:37 PM CDT
> To: gyang at plantbio.uga.edu
> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
> with xml
>
> Guojun,
>
> Sorry about the long wait on this.  At this time RemoteBlast  
> doesn't automatically set the retrieval header to return XML when  
> setting the -reporttype parameter to 'xml' or 'blastxml'.  The  
> default is text output, so you are retrieving regular text BLAST  
> reports instead of XML, hence the reported XML parser failure (BTW,  
> you can see the plain text being returned in the debugging  
> output).  I'll look into a fix for that.
>
> In the meantime, you can do this manually by setting the following  
> key prior to submitting the BLAST run:
>
> $Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML';
>
> When I run your example with the above line added it works fine.   
> As an additional note, the CVS version of Bio::SearchIO::blastxml  
> now supports newer versions of XML::SAX::Expat; the problem there  
> was a bug in XML::SAX::Expat that killed parsing.
>
> Additional rant before I go back to work (you can skip this if  
> needed):  RemoteBlast is one of the most used modules in BioPerl,  
> but it is also the most problematic as NCBI keeps changing things  
> on their end (BLAST text output, prompts when returning RIDs,  
> etc).  It frankly isn't as well-maintained as we would like; this  
> is partly due to plans we have (but unfortunately haven't acted  
> upon) to merge RemoteBlast/StandAloneBlast so they have a similar  
> API and can be used for any BLAST program, including netblast.  If  
> someone wants to take this on at some point then they are more than  
> welcome!
>
> chris
>
> On Aug 3, 2007, at 10:08 AM, Guojun Yang wrote:
>
>> Thanks, Chris,
>> Attached are my script and the query file. I suspected that we  
>> need to add "remove RID" in the code. I tried removing the  
>> RID at the end of the parsing code, but it seemed to be removed  
>> even before the output was processed.   I installed  
>> XML::SAX::Expat, and the error became "XML::SAX::Expat is no longer  
>> supported...", so I installed ExpatXS, and the error message became:
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>  no element found at line 4126, column 1, byte 186628 at /usr/lib/ 
>> perl5/site_perl/5.8.3/Bio/SearchIO/blastxml.pm line 304
>>
>>
>> Would you please try the script with the query file and the  
>> following input parameters, to see what happens on your machine (I  
>> want to make sure there is no installation problem on my machine).  
>> The search subroutine is where BLAST is performed; I did not  
>> include a remove RID call there. Thanks again!
>>
>> master:/home/guojun # perl makcgi07.txt
>> Query file name:
>> kiddo.txt
>> Select a function: 1.member;2.RES; 3, long; 4.Anchor; 5.Associator.
>> 1
>> Type in the name of an organism, e.g. Oryza sativa.
>> Oryza sativa
>> Type in the organism to search for RES:
>> Your E_value:
>> 0.001
>> Size limit for ancestor element:
>> 4000
>> Flanking size for retrieved members:
>> 50
>> Tolerance for end mismatch:
>> 0
>>
>>
>>
>> Guojun
>>
>> <makcgi07.txt>
>> <kiddo.txt>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
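
The thread above touches on two separate points: the RID has to be
removed once its report has been retrieved (otherwise the same report
keeps coming back), and the FORMAT_TYPE header has to be set to get XML.
The polling loop can be sketched as below. To keep the sketch runnable
without BioPerl or a network connection it uses a small stand-in class,
MockFactory, which is entirely hypothetical; only the method names
each_rid, retrieve_blast and remove_rid are meant to mirror
Bio::Tools::Run::RemoteBlast, and the return convention (0 while
pending, a negative value on error, an object when ready) is assumed
from that module's documented synopsis.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Tools::Run::RemoteBlast so the loop can run
# offline; only the three method names are taken from the real module.
package MockFactory;

sub new {
    my ($class, @rids) = @_;
    my %polls = map { ($_ => 0) } @rids;    # poll counter per RID
    return bless { rids => \%polls }, $class;
}
sub each_rid   { my ($self) = @_; return sort keys %{ $self->{rids} } }
sub remove_rid { my ($self, $rid) = @_; delete $self->{rids}{$rid}; return }

# Pretend each report is ready on the second poll: 0 first, then a "report".
sub retrieve_blast {
    my ($self, $rid) = @_;
    return 0 if $self->{rids}{$rid}++ == 0;
    return { rid => $rid };
}

package main;

# With the real module, the line quoted in the message above would go here,
# before submitting the search:
#   $Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML';

my $factory = MockFactory->new('RID_A', 'RID_B');
my @retrieved;

while ( my @rids = $factory->each_rid ) {
    for my $rid (@rids) {
        my $rc = $factory->retrieve_blast($rid);
        if ( !ref $rc ) {
            # 0 means "not ready yet"; a negative value means the request failed.
            $factory->remove_rid($rid) if $rc < 0;
            next;    # a real script would sleep a few seconds before polling again
        }
        push @retrieved, $rc->{rid};
        # Without this call the RID stays queued and the same report
        # is fetched again on every pass of the outer loop.
        $factory->remove_rid($rid);
    }
}

print "retrieved: @retrieved\n";
```

Commenting out the final remove_rid call makes the outer loop run
forever, which matches the "nonstop repeated output" symptom described
at the start of the thread.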


From Alicia.Amadoz at uv.es  Tue Aug  7 04:20:12 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Tue, 7 Aug 2007 10:20:12 +0200 (CEST)
Subject: [Bioperl-l] error using standaloneblast through webserver, part II
Message-ID: <1387114447amadoz@uv.es>

Hi again, I'm trying to run a bioperl script in linux with
standaloneblast from a webserver, but I now have another error. It is the
following:

[blastall] WARNING: Unable to open outfile_allseq.nin
[blastall] WARNING: 101: Unable to open outfile_allseq.nin

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: blastall call crashed: 256 /usr/local/blast-2.2.16/bin/blastall -d
 "/outfile_allseq"  -e  10  -i 
/tmp//alicia_2007_07_20/result_search_alicia_12_03_40.fasta  -o 
/tmp//alicia_2007_08_07/101_result_Local_Blast_alicia_09_56_47.out  -p 
blastn

My perl code is the following:

my $blastdatadir = $ARGV[9]; # -> Here the value of the variable is ok

BEGIN { 
	$ENV{PATH} .= ':/usr/local/blast-2.2.16/bin'; # path where blastall bin is located
	$ENV{BLASTDIR} = '/usr/local/blast-2.2.16/bin'; # path where blastall bin is located
	$ENV{BLASTDATADIR} = $blastdatadir; # path where formatted local databases are located -> Here the value is empty
}

I have tried without BEGIN { } so that the $ENV var has a correct value for
$blastdatadir, but I get the same error. I have checked that formatdb was
run and all the files are correct.

Any idea or help to solve this problem? 

Thanks in advance. Regards,
Alicia
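
On the empty value inside BEGIN { }: a BEGIN block runs while the script
is still being compiled, so the assignment "my $blastdatadir = $ARGV[9];"
has not been executed yet when the block runs, even though it appears
earlier in the file. The small sketch below shows only that ordering; it
is not a fix for the blastall error itself, and the path and the @seen
array are made up for the illustration.

```perl
use strict;
use warnings;

our @seen;    # records what each phase observed

# Stand-in for "my $blastdatadir = $ARGV[9];"; the path is hypothetical.
my $blastdatadir = '/data/blastdb';

BEGIN {
    # Compile time: $blastdatadir is declared, but the assignment above
    # has not been executed yet, so it is still undefined here.
    push @seen, defined $blastdatadir ? 'BEGIN: set' : 'BEGIN: empty';
}

# Run time: the assignment has happened, so the environment can be set now.
$ENV{BLASTDATADIR} = $blastdatadir;
push @seen, "runtime: $ENV{BLASTDATADIR}";

print "$_\n" for @seen;
```

Setting %ENV at run time, or doing the assignment inside the BEGIN block
as well, avoids the empty value.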


From mheusel at gmail.com  Tue Aug  7 04:45:33 2007
From: mheusel at gmail.com (Martin Heusel)
Date: Tue, 7 Aug 2007 10:45:33 +0200
Subject: [Bioperl-l] error using standaloneblast through webserver,
	part II
In-Reply-To: <1387114447amadoz@uv.es>
References: <1387114447amadoz@uv.es>
Message-ID: <6127fc200708070145keb750acycce8a43edd0f724d@mail.gmail.com>

> MSG: blastall call crashed: 256 /usr/local/blast-2.2.16/bin/blastall -d
>  "/outfile_allseq"  -e  10  -i

I'm not familiar with all this, but it seems your script tries to
write in the systems root directory /

-d "/outfile_allseq"

that is normally not writable for normal users

is this the problem?

cu

Martin

-- 
+ openid: http://mhe.myopenid.com/
+ gpg   : http://user.cs.tu-berlin.de/~mhe/pub/martin.gpg
+ gpg fp: 4844 71B5 B4E4 3892 69CA  6EA5 6598 61BE 0021 94A2


From Alicia.Amadoz at uv.es  Tue Aug  7 07:08:12 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Tue, 7 Aug 2007 13:08:12 +0200 (CEST)
Subject: [Bioperl-l] error using standaloneblast through webserver,
	part II
In-Reply-To: <1387114447amadoz@uv.es>
References: <1387114447amadoz@uv.es>
Message-ID: <5825345446amadoz@uv.es>

Hi, I thought that setting $ENV{BLASTDATADIR} was enough for
StandAloneBlast to find the database. I have changed it, setting the
-database option of the params to path_to_database + name_of_database,
and it works OK.

Thanks for your help. Regards,
Alicia
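
A minimal sketch of the fix described above, passing the full database path
through -database instead of relying on an environment variable. The
directory '/data/blastdb' is a hypothetical location, and the database name
is only borrowed from the error message earlier in the thread; the
-program, -database and -e parameters are the ones already used in this
thread.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical locations; substitute the directory that formatdb wrote to.
my $db_dir  = '/data/blastdb';
my $db_name = 'outfile_allseq';

# Full path to the database, so no environment variable is needed to find it.
my $db_path = File::Spec->catfile($db_dir, $db_name);

my @params = (
    -program  => 'blastn',
    -database => $db_path,
    -e        => 10,
);

# Build the factory only if BioPerl can be loaded on this machine.
my $factory = eval {
    require Bio::Tools::Run::StandAloneBlast;
    Bio::Tools::Run::StandAloneBlast->new(@params);
};

if (defined $factory) {
    print "factory created for $db_path\n";
}
else {
    print "BioPerl not available; would use -database $db_path\n";
}
```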


From jason at bioperl.org  Wed Aug  8 15:16:07 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 8 Aug 2007 14:16:07 -0500
Subject: [Bioperl-l] Fwd: Question regarding Bio::GenBank module
References: <7a93dad10708081148w74dfede3sd05799a651ebcb80@mail.gmail.com>
Message-ID: <24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>

Young -
I'm forwarding to the list for more help.

Begin forwarded message:

> From: "Young Song" <youngcsong at gmail.com>
> Date: August 8, 2007 1:48:29 PM CDT
> To: jason at bioperl.org
> Subject: Question regarding Bio::GenBank module
>
> Hello,
>
>    I am currently located in Vancouver, Canada, and I have a question
> about the Bio::DB::GenBank module for BioPerl.  I read in the online
> documentation for the module
> (http://search.cpan.org/dist/bioperl/Bio/DB/GenBank.pm) that we are not
> supposed to spam NCBI with multiple requests, which led me to think
> about the script that I wrote.  I am trying to extract some information
> based on the FASTA protein files located in NCBI's database.  The script
> reads each '.faa' (FASTA protein) file, takes the 'gi' ID for each
> sequence, and extracts several pieces of information, which look like the
> following output (please note that there are a lot more gi's than I am
> showing you right now):
>
> 10954456
> accesstion number: NP_047185.1
> dbsource: GenBank: NC_001911.1
> NP_047185.1
> starting pos. at genomic seq: 1488
> ending pos. at genomic seq: 1991
> strand: +
> description: putative membrane-associated protein
> organism: Buchnera aphidicola
> MERIIEKAIYASRWLMFPVYVGLSFGFILLTLKFFQQIVFIIPDILAMSESGLVLVVLSLIDIALVGGLL 
> VMVMFLGYENFISKMDIQDNEKRLGWMGTMDVNSIKNKVASSIVAISSVHLLRLFMEAEKILDDKIMLCV 
> IIHLTFVLSAFGMAYIDKMSKKKHVLH
> ************************************************
> 10954457
> accesstion number: NP_047186.1
> dbsource: GenBank: NC_001911.1
> NP_047186.1
> starting pos. at genomic seq: 2158
> ending pos. at genomic seq: 2913
> strand: +
> description: putative replication-associated protein
> organism: Buchnera aphidicola
> MPRKNYIYNPKPVFNPPKNKRKISTFICYAMKKASEIDVARSNLNYTLLLIDPKTGNILPRFRRLNEHRA 
> CAMRAIVLAMLYYFDIHSNLVEASIEKLADECGLSTFSDSGNKSITRVSRLINDFLEPMGFVRCKKIKRK 
> FVSNYIPKKIFLTPMFFMLFNISQSKINRYLFKSKKMSQNLKITEKKIFISFSDIKVMSRLDEKSIRKKI 
> LNALINYYTASELTKIGPKGLKKRIDIEYNNLCKLFKKIKK
>
>
>
>   Because there are a lot of sequences I am dealing with here, I am a
> little bit worried that I may be causing harm to the NCBI server.  I just
> need to know if this is the right approach to take, or if there is another
> solution (I am a little bit confused about what you mean by "multiple
> requests" in the documentation).
> Your reply would be very much appreciated.  Thank you in advance.
>
>   Sincerely,
>
>      Young C. Song

--
Jason Stajich
jason at bioperl.org


From cjfields at uiuc.edu  Wed Aug  8 15:41:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 8 Aug 2007 14:41:34 -0500
Subject: [Bioperl-l] Fwd: Question regarding Bio::GenBank module
In-Reply-To: <24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>
References: <7a93dad10708081148w74dfede3sd05799a651ebcb80@mail.gmail.com>
	<24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>
Message-ID: <FD7D1694-604A-4C8B-AC47-B31F306EA5B0@uiuc.edu>

NCBI eUtils (which Bio::DB::GenBank uses to get sequence data) has a
list of user requirements:

http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements

The most important one is the 3-second delay between requests, but
the module already implements that policy, so there isn't a real issue
unless you deliberately mess with that setting.  NCBI has been known
to block IPs which don't follow that particular rule.  Also, if you
are planning to make hundreds of requests, you should consider running
the script during low-traffic times, as indicated in the above link.
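
For a script that loops over many IDs itself, the same spacing can be
sketched as below. This is only an illustration of the delay policy under
stated assumptions: fetch_politely and the stand-in fetcher are hypothetical
names for this example and are not part of BioPerl or of NCBI's tools.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Calls $fetch->($id) for every ID, leaving at least $delay seconds
# between the start of consecutive calls. $fetch is a caller-supplied
# code reference (hypothetical here), not a BioPerl function.
sub fetch_politely {
    my ($ids, $fetch, $delay) = @_;
    $delay = 3 unless defined $delay;     # default: the 3-second rule
    my @results;
    my $last_start;
    for my $id (@$ids) {
        if (defined $last_start) {
            my $wait = $delay - (time() - $last_start);
            sleep($wait) if $wait > 0;
        }
        $last_start = time();
        push @results, $fetch->($id);
    }
    return @results;
}

# Demonstration with a stand-in fetcher and a short delay.
my @gis = (10954456, 10954457);
my @out = fetch_politely(\@gis, sub { "record for gi $_[0]" }, 0.1);
print "$_\n" for @out;
```

In a real script the code reference would wrap whatever call retrieves one
record, and the delay would be left at its default of 3 seconds.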

chris

On Aug 8, 2007, at 2:16 PM, Jason Stajich wrote:

> Young -
> I'm forwarding to the list for more help.
>
> Begin forwarded message:
>
>> From: "Young Song" <youngcsong at gmail.com>
>> Date: August 8, 2007 1:48:29 PM CDT
>> To: jason at bioperl.org
>> Subject: Question regarding Bio::GenBank module
>>
>> Hello,
>>
>>    I am currently located in Vancouver, Canada, and I have a question
>> about the Bio::DB::GenBank module for BioPerl.  I read in the online
>> documentation for the module
>> (http://search.cpan.org/dist/bioperl/Bio/DB/GenBank.pm) that we are not
>> supposed to spam NCBI with multiple requests, which led me to think
>> about the script that I wrote.  I am trying to extract some information
>> based on the FASTA protein files located in NCBI's database.  The script
>> reads each '.faa' (FASTA protein) file, takes the 'gi' ID for each
>> sequence, and extracts several pieces of information, which look like the
>> following output (please note that there are a lot more gi's than I am
>> showing you right now):
>>
>> 10954456
>> accesstion number: NP_047185.1
>> dbsource: GenBank: NC_001911.1
>> NP_047185.1
>> starting pos. at genomic seq: 1488
>> ending pos. at genomic seq: 1991
>> strand: +
>> description: putative membrane-associated protein
>> organism: Buchnera aphidicola
>> MERIIEKAIYASRWLMFPVYVGLSFGFILLTLKFFQQIVFIIPDILAMSESGLVLVVLSLIDIALVGGLL
>> VMVMFLGYENFISKMDIQDNEKRLGWMGTMDVNSIKNKVASSIVAISSVHLLRLFMEAEKILDDKIMLCV
>> IIHLTFVLSAFGMAYIDKMSKKKHVLH
>> ************************************************
>> 10954457
>> accesstion number: NP_047186.1
>> dbsource: GenBank: NC_001911.1
>> NP_047186.1
>> starting pos. at genomic seq: 2158
>> ending pos. at genomic seq: 2913
>> strand: +
>> description: putative replication-associated protein
>> organism: Buchnera aphidicola
>> MPRKNYIYNPKPVFNPPKNKRKISTFICYAMKKASEIDVARSNLNYTLLLIDPKTGNILPRFRRLNEHRA
>> CAMRAIVLAMLYYFDIHSNLVEASIEKLADECGLSTFSDSGNKSITRVSRLINDFLEPMGFVRCKKIKRK
>> FVSNYIPKKIFLTPMFFMLFNISQSKINRYLFKSKKMSQNLKITEKKIFISFSDIKVMSRLDEKSIRKKI
>> LNALINYYTASELTKIGPKGLKKRIDIEYNNLCKLFKKIKK
>>
>>
>>
>>   Because there are a lot of sequences I am dealing with here, I am a
>> little bit worried that I may be causing harm to the NCBI server.  I just
>> need to know if this is the right approach to take, or if there is another
>> solution (I am a little bit confused about what you mean by "multiple
>> requests" in the documentation).
>> Your reply would be very much appreciated.  Thank you in advance.
>>
>>   Sincerely,
>>
>>      Young C. Song
>
> --
> Jason Stajich
> jason at bioperl.org
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From gyang at plantbio.uga.edu  Thu Aug  9 15:03:21 2007
From: gyang at plantbio.uga.edu (Guojun Yang)
Date: Thu, 09 Aug 2007 15:03:21 -0400
Subject: [Bioperl-l] standalone blastall call crashed, please help
In-Reply-To: 1FE846F1-CB20-41FD-929D-8D14E5695B59@uiuc.edu
Message-ID: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu>

Hi, Chris,  
Thanks a lot for your efforts. With your help, I am gaining more
confidence in fixing the CGI code. While the RemoteBlast problem is fixed
now, I am stuck on a local BLAST problem (see the error message and
subroutine below). The line starting with * is line 593 in the error
message. I set the permissions on all the BLAST folders and files, but it
did not help much. The same sequence and database work fine if I use
blastall from the command line. I used the Seq object reference $query as
the query, and the error message gives "-i /tmp/..."; does this look like
an input problem? The subroutine was working before early 2006 (on a
different machine), so I am wondering whether this is due to changes in
StandAloneBlast.pm.
Best, Guojun
   
I set the blast env variables:  
   
BEGIN {$ENV{BLASTDIR} = '/usr/blast-2.2.10/bin'; }
BEGIN {$ENV{BLASTDB}='/usr/blast-2.2.10/data';}
BEGIN {$ENV{BLASTMAT}='/usr/blast-2.2.10/data';}
$PROGRAMDIR = $ENV{'BLASTDIR'} || '';
......  
   
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: blastall call crashed: -1 /usr/blast-2.2.10/bin/blastall -d  "/usr/blast-2.2.10/data/swissprot"  -e  0.001  -i  /tmp/3cjvQyodxg  -o  /tmp/4qSSO16EZP  -p  blastx   
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.3/Bio/Root/Root.pm:359
STACK: Bio::Tools::Run::StandAloneBlast::_runblast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:813
STACK: Bio::Tools::Run::StandAloneBlast::_generic_local_blast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:760
STACK: Bio::Tools::Run::StandAloneBlast::blastall /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:570
STACK: main::ancestor makcgi07.txt:593
STACK: makcgi07.txt:208
  

sub ancestor {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  

my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"test");
print $query->seq();
my $len=$query->length();
my $long_name=$_[1];
my $long_start=$_[2];
my $long_end=$_[3];
@db=('swissprot');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
                                                        -database => "$db",
                                                        -e => 1e-3,
                                                        );
*    my $blast_report = $factory->blastall($query);
    while (my $result = $blast_report->next_result) {
            while( my $hit = $result->next_hit()) {
                $hit_name=$hit->name;
                $hit_name =~ /\S+[|](\S+)[.]\d+[|].*/;
                $name=$1;
                $desc = $hit->description();
                if ($desc =~ /.*{|\btransposon\b|\btransposase\b|}.*/i){
                     $AN=0;
                     $replica=0;
                     while ($ancestor_name[$AN]) {
                        $replica=1 if (($ancestor_name[$AN] eq $long_name) && ($hitname[$AN] eq $name));
                         $AN+=1;
                     }
                        if ($replica==0) {
                        push @ancestor_name, $long_name;
                        push @ancestor_start, $long_start;
                        push @ancestor_end, $long_end;
                        push @desc, $desc;
                        push @hitname,$name;
                        }
                }
               }
              }}
return @ancestor_name,@ancestor_start,@ancestor_end,@desc;
}
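
The number after "call crashed:" in these messages looks like Perl's $?
after running blastall (an assumption based on the throw line quoted later
in this thread). Per the perlfunc entry for system(), -1 means the program
could not be started or the wait failed, and otherwise the exit value is in
the high byte. A small sketch of that decoding follows; describe_status is a
hypothetical helper for this example, not part of StandAloneBlast.

```perl
use strict;
use warnings;

# Turns a system()/wait status into a readable description.
# $status is the value of $?; $errno_text is "$!" captured right after
# the call (only meaningful when $status is -1).
sub describe_status {
    my ($status, $errno_text) = @_;
    if ($status == -1) {
        return "failed to execute: $errno_text";
    }
    if ($status & 127) {
        return sprintf 'died with signal %d', $status & 127;
    }
    return sprintf 'exited with value %d', $status >> 8;
}

print describe_status(-1, 'No such file or directory'), "\n";
print describe_status(256, ''), "\n";
print describe_status(0, ''), "\n";
```

Read this way, the 256 in the earlier report would be an exit value of 1
from blastall itself, while -1 here would mean blastall was never
successfully run and waited for. One possibility worth checking in a CGI
setting, offered as an assumption rather than a diagnosis, is a calling
process that ignores SIGCHLD, which can make the wait fail even though the
command works from the shell.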


From harijay at gmail.com  Thu Aug  9 17:47:50 2007
From: harijay at gmail.com (hari jayaram)
Date: Thu, 9 Aug 2007 17:47:50 -0400
Subject: [Bioperl-l] newbie wants install help
Message-ID: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>

Hi, I am trying to install BioPerl as a non-root user, since I don't have
root access on the machine.

I was following the instructions given on the wiki at
http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix
I started from scratch using Perl v5.8.5 and used cpan to install
the BioPerl prerequisites bundle, Bundle::BioPerl, since I thought it
was needed. Everything worked just fine.
I could use cpan as a non-root user by following the instructions given at
http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html

But when I try to install BioPerl using the instructions for non-root users,
I get an error when Module::Build is built, because I am not root.
I get the same Module::Build error when I try to install without CPAN, using
the command-line script perl Build.PL with the --install_base option, as
given on the wiki.

Is there a way out?

Thanks for your help in advance
harijay
Brandeis University


Installing /usr/share/man/man3/Module::Build::Platform::VMS.3pm
Installing /usr/share/man/man3/Module::Build::Base.3pm
Installing /usr/share/man/man3/Module::Build::Authoring.3pm
Installing /usr/share/man/man3/Module::Build::Compat.3pm
mkdir /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi/auto/Module:
Permission denied at /usr/lib/perl5/5.8.5/ExtUtils/Install.pm line 207
Installing /usr/bin/config_data
make: *** [install] Error 255
  /usr/bin/make install  -- NOT OK
    You may have to su to root to install the package
Couldn't install Module::Build, giving up.
make: *** No targets specified and no makefile found.  Stop.
  /usr/bin/make  -- NOT OK
Running make test
  Can't test without successful make
Running make install
  make had returned bad status, install seems impossible


From bix at sendu.me.uk  Thu Aug  9 18:23:24 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 09 Aug 2007 23:23:24 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
Message-ID: <46BB93DC.9010608@sendu.me.uk>

hari jayaram wrote:
> Hi I am trying to install bioperl as a non root user since I dont have root
> access on the machine.
> 
> I was following the instructions as given on the wiki at
> http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix
> I started from scratch using perl version v5.8.5 and used cpan to install
> the bioperl module prerequisites bundle Bundle::BioPerl since I thought it
> was needed. Everything worked just fine
> I could use cpan as a non root user following instructions given at
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html
> 
> But when I try to install bioperl using the instructions for non-root I get
> an error when I build Module::Build because I am not root.
> Iget the same Module::Build error when I try to install without CPAN using
> command line script perl Build.PL --install_base option as given on the
> wiki.

Follow the cpan instructions you found to install as non-root:

Bundle::CPAN

Failing that, you require at least:
Module::Build

Failing that:
http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#INSTALLING_BIOPERL_MODULES_THE_HARD_WAY
(it's actually the easiest way, go figure)


From bix at sendu.me.uk  Fri Aug 10 03:41:29 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 10 Aug 2007 08:41:29 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>	
	<46BB93DC.9010608@sendu.me.uk>
	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
Message-ID: <46BC16A9.7090709@sendu.me.uk>

hari jayaram wrote:
> Hi Sendu ,

Hi, please post back to the list as well, so others can benefit.


> Well after going through a few attempts at installing Bundle::CPAN I 
> gave up.
> It always had weird timeout issues . ANd kept re-installing everything 
> on restarting the CPAN shell
> After a while I thought it did complete - since it retunred me to the shell
> 
> I tried the CPAN install of bioperl at that point
> 
> ANd bingo I got booted out at the exact same point when the Bioperl 
> install tried to re-install(?) Module:Build which failed as non root

Did you follow steps 7 and 8 of 
http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html ?

If you managed to install Bundle::CPAN, when you now run 'cpan' it
should start up and tell you its version number, which should be v1.9102
or higher. If it's lower, you didn't manage to install the latest CPAN,
or you haven't managed to tell Perl where your newly installed modules are.


> I guess for all future modules I will adopt the option 3 you detailed , 
> i.e just have the modules sitting somewhere and use them from there
> 
> But I am still interested in getting it done right via CPAN.


From n.haigh at sheffield.ac.uk  Fri Aug 10 06:14:06 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 10 Aug 2007 11:14:06 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <46BC16A9.7090709@sendu.me.uk>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>		<46BB93DC.9010608@sendu.me.uk>	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
	<46BC16A9.7090709@sendu.me.uk>
Message-ID: <46BC3A6E.80302@sheffield.ac.uk>

Sendu Bala wrote:
> hari jayaram wrote:
>> Hi Sendu ,
> 
> Hi, please post back to the list as well, so others can benefit.
> 
> 
>> Well after going through a few attempts at installing Bundle::CPAN I 
>> gave up.
>> It always had weird timeout issues . ANd kept re-installing everything 
>> on restarting the CPAN shell
>> After a while I thought it did complete - since it retunred me to the shell
>>
>> I tried the CPAN install of bioperl at that point
>>
>> ANd bingo I got booted out at the exact same point when the Bioperl 
>> install tried to re-install(?) Module:Build which failed as non root
> 
> Did you follow steps 7 and 8 of 
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html ?
> 
> If you managed to install Bundle::CPAN, when you now run 'cpan' it 
> should start up and tell you its version number, which should be v1.9102 
> or higher. If its lower, you didn't manage to install the latest CPAN, 
> or you haven't managed to tell Perl where your newly installed modules are.
> 
> 
>> I guess for all future modules I will adopt the option 3 you detailed , 
>> i.e just have the modules sitting somewhere and use them from there
>>
>> But I am still interested in getting it done right via CPAN.

It will probably also help if you post the commands you have run and
any output (truncated if it's really long); then we can follow what you
have tried and make some better suggestions.

Cheers
Nath


From mbasu at mail.nih.gov  Fri Aug 10 11:25:35 2007
From: mbasu at mail.nih.gov (Malay)
Date: Fri, 10 Aug 2007 11:25:35 -0400
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
Message-ID: <46BC836F.7010906@mail.nih.gov>

hari jayaram wrote:
> Hi I am trying to install bioperl as a non root user since I dont have root
> access on the machine.
> 
> I was following the instructions as given on the wiki at
> http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix
> I started from scratch using perl version v5.8.5 and used cpan to install
> the bioperl module prerequisites bundle Bundle::BioPerl since I thought it
> was needed. Everything worked just fine
> I could use cpan as a non root user following instructions given at
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html
> 
> But when I try to install bioperl using the instructions for non-root I get
> an error when I build Module::Build because I am not root.
> Iget the same Module::Build error when I try to install without CPAN using
> command line script perl Build.PL --install_base option as given on the
> wiki.
> 
> Is there a way out
> 
> Thanks for your help in advance
> harijay
> Brandeis University
> 

This is related to your situation and broadly applicable to all Perl users
without root access. I can tell from my own experience that the best way
to handle your situation, if you are a dedicated Perl developer, is to use
your own Perl. Just compile and install your own Perl in any directory of
your choice, put its "bin" directory at the front of your path, and off you
go. The advantages are severalfold. First, you get a very optimized, fast
Perl. The sysadmin might have just installed a run-of-the-mill binary Perl.
Second, you get the freedom to install the very latest updates of all the
modules. The sysadmins may be too busy to update Perl frequently. Third, a
very common problem with production machines is that they strictly follow
the Perl installation instructions and avoid threaded Perl, which clips
your wings, particularly when almost all machines contain multiple
processors.

The drawbacks are related to finding "/usr/bin/perl" in the shebang
line. If you follow the Perl way of installing any script, it will take
care of it. When you develop, use the more portable way:

#!/usr/bin/env perl
BEGIN { $^W = 1 } # switches on warnings at compile time (like -w)
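
As an alternative to setting $^W (not what is suggested above, just a
commonly used equivalent), the lexical "use warnings" pragma can be combined
with the same env-based shebang line. The sketch below collects warnings in
an array only so that they can be inspected; the variable names are made up
for the example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;   # lexical warnings, available since Perl 5.6

# Collect warnings instead of printing them, so they can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $count;                # deliberately left undefined
my $total = $count + 1;   # triggers a "Use of uninitialized value" warning

print "total = $total\n";
print "caught: $_" for @warnings;
```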

All the best,

Malay


-- 
Malay K Basu
www.malaybasu.net


From gyang at plantbio.uga.edu  Fri Aug 10 11:23:36 2007
From: gyang at plantbio.uga.edu (Guojun Yang)
Date: Fri, 10 Aug 2007 11:23:36 -0400
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
 from StandAloneBlast
In-Reply-To: 20070809190321.191d0d4a@dogwood.plantbio.uga.edu
Message-ID: <20070810152336.898c3979@dogwood.plantbio.uga.edu>

Hi, Chris,  
Interestingly, I found the message on bioperl-l from Matthew Laird (2005),
"Blastall & StandAloneBlast": "...the Odd thing is, Blast DOES run.  If
one comments out this line in StandAloneBlast.pm, the execution succeeds
perfectly fine". Mysteriously, when I uncommented the
" $self->throw("$executable call crashed: $? $! $commandstring\n") unless ($status==0) ;"
line, blastall runs. The only difference from what Matthew saw is that,
when I did not uncomment the line, blastall DID NOT run.
Thanks,  
Guojun  
       _____  

  From: Guojun Yang [mailto:gyang at plantbio.uga.edu]
To: Chris Fields [mailto:cjfields at uiuc.edu]
Cc: bioperl-l at lists.open-bio.org
Sent: Thu, 09 Aug 2007 15:03:21 -0400
Subject: standalone blastall call crashed, please help

  
Hi, Chris,  
Thanks a lot for your efforts. With your help, I am gaining more
confidence in fixing the CGI code. While the RemoteBlast problem is fixed
now, I am stuck on a local BLAST problem (see the error message and
subroutine below). The line starting with * is line 593 in the error
message. I set the permissions on all the BLAST folders and files, but it
did not help much. The same sequence and database work fine if I use
blastall from the command line. I used the Seq object reference $query as
the query, and the error message gives "-i /tmp/..."; does this look like
an input problem? The subroutine was working before early 2006 (on a
different machine), so I am wondering whether this is due to changes in
StandAloneBlast.pm.
Best, Guojun
   
I set the blast env variables:  
   
BEGIN {$ENV{BLASTDIR} = '/usr/blast-2.2.10/bin'; }
BEGIN {$ENV{BLASTDB}='/usr/blast-2.2.10/data';}
BEGIN {$ENV{BLASTMAT}='/usr/blast-2.2.10/data';}
$PROGRAMDIR = $ENV{'BLASTDIR'} || '';
......  
   
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: blastall call crashed: -1 /usr/blast-2.2.10/bin/blastall -d  "/usr/blast-2.2.10/data/swissprot"  -e  0.001  -i  /tmp/3cjvQyodxg  -o  /tmp/4qSSO16EZP  -p  blastx   
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.3/Bio/Root/Root.pm:359
STACK: Bio::Tools::Run::StandAloneBlast::_runblast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:813
STACK: Bio::Tools::Run::StandAloneBlast::_generic_local_blast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:760
STACK: Bio::Tools::Run::StandAloneBlast::blastall /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:570
STACK: main::ancestor makcgi07.txt:593
STACK: makcgi07.txt:208
  

sub ancestor {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  

my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"test");
print $query->seq();
my $len=$query->length();
my $long_name=$_[1];
my $long_start=$_[2];
my $long_end=$_[3];
@db=('swissprot');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
                                                        -database => "$db",
                                                        -e => 1e-3,
                                                        );
*    my $blast_report = $factory->blastall($query);
    while (my $result = $blast_report->next_result) {
            while( my $hit = $result->next_hit()) {
                $hit_name=$hit->name;
                $hit_name =~ /\S+[|](\S+)[.]\d+[|].*/;
                $name=$1;
                $desc = $hit->description();
                if ($desc =~ /.*{|\btransposon\b|\btransposase\b|}.*/i){
                     $AN=0;
                     $replica=0;
                     while ($ancestor_name[$AN]) {
                        $replica=1 if (($ancestor_name[$AN] eq $long_name) && ($hitname[$AN] eq $name));
                         $AN+=1;
                     }
                        if ($replica==0) {
                        push @ancestor_name, $long_name;
                        push @ancestor_start, $long_start;
                        push @ancestor_end, $long_end;
                        push @desc, $desc;
                        push @hitname,$name;
                        }
                }
               }
              }}
return @ancestor_name,@ancestor_start,@ancestor_end,@desc;
}


From cjfields at uiuc.edu  Fri Aug 10 12:17:38 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 10 Aug 2007 11:17:38 -0500
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
	from StandAloneBlast
In-Reply-To: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
References: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
Message-ID: <56186844-3CB9-4968-B16F-FD5EE72865A2@uiuc.edu>

This should be filed as a bug if possible; could you do that?

http://www.bioperl.org/wiki/Bugs

Suggestions have been made many times previously that  
StandAloneBlast, RemoteBlast, etc be combined to use a common API,  
incorporate other BLAST implementations (i.e. WU-BLAST, NCBI's  
netblast, etc), and maybe utilize other cross-platform compatible  
means of running programs and passing off reports to parsers.  In  
fact, Jason, Roger Hall, Torsten, and I discussed tentative plans for  
plugin-able BLAST wrappers:

http://www.bioperl.org/wiki/Module:Bio::Tools::Run::RemoteBlast

Though they have never been acted upon.  If I get time towards the  
end of fall and manage to finish up some other projects I may try  
taking this on, maybe using the wiki to track progress.

chris

On Aug 10, 2007, at 10:23 AM, Guojun Yang wrote:

> Hi, Chris,
> Interestingly, I found the message in bioperl-l from Matthew Laird  
> 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast DOES  
> run.  If one comments out this line in StandAloneBlast.pm, the  
> execution succeeds perfectly fine". It seemed to be mysterious when  
> I uncommented the " $self->throw("$executable call crashed: $? $!  
> $commandstring\n") unless ($status==0) ;" line, the blastall runs.  
> The only difference from what Matthew saw is that, when I did not  
> uncomment the line, blastall DID NOT run.
> Thanks,
> Guojun
>
> From: Guojun Yang [mailto:gyang at plantbio.uga.edu]
> To: Chris Fields [mailto:cjfields at uiuc.edu]
> Cc: bioperl-l at lists.open-bio.org
> Sent: Thu, 09 Aug 2007 15:03:21 -0400
> Subject: standalone blastall call crashed, please help
>
> Hi, Chris,
> Thanks a lot for your efforts. With your help, I am gaining more  
> confidence to fix the cgi code. While the remoteblast problem is  
> fixed now, I am caught in a local blast problem (see the error  
> message and subroutine). The line starting with * is line 593 in  
> the error message. I tried command line blastall, it works fine. I  
> set the permission to all the blast folders and files, it did not  
> help much. The same sequence and database works OK if I use command  
> line blastall. I used the seq object ref $query as query, the error  
> message gives "-i /tmp/...", does this look like an input problem?  
> The subroutine was working before early 2006 (on a different  
> machine), I am wondering whether this is due to changes in the  
> StandAloneBlast.pm?  Best, Guojun
>
> I set the blast env variables:
>
> BEGIN {$ENV{BLASTDIR} = '/usr/blast-2.2.10/bin'; }
> BEGIN {$ENV{BLASTDB}='/usr/blast-2.2.10/data';}
> BEGIN {$ENV{BLASTMAT}='/usr/blast-2.2.10/data';}
> $PROGRAMDIR = $ENV{'BLASTDIR'} || '';
> ......
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: blastall call crashed: -1 /usr/blast-2.2.10/bin/blastall -d  "/usr/blast-2.2.10/data/swissprot"  -e  0.001  -i  /tmp/3cjvQyodxg  -o  /tmp/4qSSO16EZP  -p  blastx
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.3/Bio/Root/Root.pm:359
> STACK: Bio::Tools::Run::StandAloneBlast::_runblast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:813
> STACK: Bio::Tools::Run::StandAloneBlast::_generic_local_blast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:760
> STACK: Bio::Tools::Run::StandAloneBlast::blastall /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:570
> STACK: main::ancestor makcgi07.txt:593
> STACK: makcgi07.txt:208
> sub ancestor {
>     use Bio::Tools::Run::StandAloneBlast;
>     use Bio::SearchIO::blast;
>
> my $query = Bio::Seq -> new ( -seq=>"$_[0]",
>                               -id=>"test");
> print $query->seq();
> my $len=$query->length();
> my $long_name=$_[1];
> my $long_start=$_[2];
> my $long_end=$_[3];
> @db=('swissprot');
> foreach my $db (@db) {
>     my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
>                                                         -database => "$db",
>                                                         -e => 1e-3,
>                                                         );
> *    my $blast_report = $factory->blastall($query);
>     while (my $result = $blast_report->next_result) {
>             while( my $hit = $result->next_hit()) {
>                 $hit_name=$hit->name;
>                 $hit_name =~ /\S+[|](\S+)[.]\d+[|].*/;
>                 $name=$1;
>                 $desc = $hit->description();
>                 if ($desc =~ /.*{|\btransposon\b|\btransposase\b|}.*/i){
>                      $AN=0;
>                      $replica=0;
>                      while ($ancestor_name[$AN]) {
>                         $replica=1 if (($ancestor_name[$AN] eq $long_name) && ($hitname[$AN] eq $name));
>                          $AN+=1;
>                      }
>                         if ($replica==0) {
>                         push @ancestor_name, $long_name;
>                         push @ancestor_start, $long_start;
>                         push @ancestor_end, $long_end;
>                         push @desc, $desc;
>                         push @hitname,$name;
>                         }
>                 }
>                }
>               }}
> return @ancestor_name,@ancestor_start,@ancestor_end,@desc;
> }

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From harijay at gmail.com  Fri Aug 10 13:09:32 2007
From: harijay at gmail.com (hari jayaram)
Date: Fri, 10 Aug 2007 13:09:32 -0400
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <46BC16A9.7090709@sendu.me.uk>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
	<46BB93DC.9010608@sendu.me.uk>
	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
	<46BC16A9.7090709@sendu.me.uk>
Message-ID: <aad3caa30708101009k4734fe45i1dcd29a5e20af834@mail.gmail.com>

Hey all ,
Thanks for your help. It's working really well now.

Turns out I had not set my PERL5LIB environment variable correctly and it
was not finding the installed modules (thanks Sendu)

So the steps I followed were:
1) Install CPAN as myself, as detailed at
http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html
The important part is the line that tells CPAN what prefix to use for all
module installs:
PREFIX=~/perl5lib/ LIB=~/perl5lib/lib INSTALLMAN1DIR=~/perl5lib/man1
INSTALLMAN3DIR=~/perl5lib/man3

2) Set PERL5LIB to /home/hari/perl5lib/lib (and not just /home/hari/perl5lib)
in the shell. I use csh, so I edited .cshrc:
setenv PERL5LIB /home/hari/perl5lib/lib
setenv MANPATH ${MANPATH}:/home/hari/perl5lib

3) Updated the system CPAN to the latest version. This worked very well once
perl5lib was set up, only it took a while and sometimes stalled with
messages like "done 31/34". But a CTRL-C got it going again.

4) Made sure I was using the new CPAN v1.9102

5) Installed Bioperl with command
install S/SE/SENDU/bioperl-1.5.2_102.tar.gz

AND I was good to go..
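
For anyone repeating these steps, a quick way to confirm that step 2 took
effect is to ask Perl itself. The following is only a minimal sketch: the
helper names dir_in_inc and locate_module are made up for illustration, and
it assumes a Unix-style, colon-separated PERL5LIB.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if $dir is one of the directories in @INC.
sub dir_in_inc {
    my ($dir) = @_;
    $dir =~ s{/+$}{};                      # ignore trailing slashes
    for my $entry (@INC) {
        next if ref $entry;                # skip code hooks in @INC
        (my $clean = $entry) =~ s{/+$}{};
        return 1 if $clean eq $dir;
    }
    return 0;
}

# Hypothetical helper: the path a module would be loaded from, or undef.
sub locate_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    for my $dir (@INC) {
        next if ref $dir;
        return "$dir/$file" if -e "$dir/$file";
    }
    return;
}

# Report on each directory listed in PERL5LIB (assumed colon-separated).
my $perl5lib = defined $ENV{PERL5LIB} ? $ENV{PERL5LIB} : '';
for my $dir (grep { length } split /:/, $perl5lib) {
    printf "%-40s %s\n", $dir, dir_in_inc($dir) ? 'in @INC' : 'NOT in @INC';
}

# Report where (if anywhere) Perl would find Bio::Perl.
my $path = locate_module('Bio::Perl');
print defined $path ? "Bio::Perl found at $path\n" : "Bio::Perl not found\n";
```

If a PERL5LIB directory is reported as not in @INC, the variable is probably
not exported in the shell that runs Perl.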

I am thinking I will screencast this process for everyone's benefit and put
it up on bioscreencast.com, if that would be useful for others.
Thanks to everyone on the group. Now the journey begins.

Hari Jayaram


On 8/10/07, Sendu Bala <bix at sendu.me.uk> wrote:
> hari jayaram wrote:
> > Hi Sendu ,
>
> Hi, please post back to the list as well, so others can benefit.
>
>
> > Well after going through a few attempts at installing Bundle::CPAN I
> > gave up.
> > It always had weird timeout issues. And it kept re-installing everything
> > on restarting the CPAN shell.
> > After a while I thought it did complete, since it returned me to the
> > shell.
> >
> > I tried the CPAN install of bioperl at that point.
> >
> > And bingo, I got booted out at the exact same point, when the Bioperl
> > install tried to re-install(?) Module::Build, which failed as non-root.
>
> Did you follow steps 7 and 8 of
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html ?
>
> If you managed to install Bundle::CPAN, when you now run 'cpan' it
> should start up and tell you its version number, which should be v1.9102
> or higher. If it's lower, you didn't manage to install the latest CPAN,
> or you haven't managed to tell Perl where your newly installed modules
> are.
>
>
> > I guess for all future modules I will adopt the option 3 you detailed ,
> > i.e just have the modules sitting somewhere and use them from there
> >
> > But I am still interested in getting it done right via CPAN.
>


From torsten.seemann at infotech.monash.edu.au  Fri Aug 10 17:48:56 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sat, 11 Aug 2007 07:48:56 +1000
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
	from StandAloneBlast
In-Reply-To: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
References: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu>
	<20070810152336.898c3979@dogwood.plantbio.uga.edu>
Message-ID: <a79f6a4b0708101448x421736c1m6f3f5ff6d851a68c@mail.gmail.com>

> Interestingly, I found the message in bioperl-l from Matthew Laird 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast DOES run.  If one comments out this line in StandAloneBlast.pm, the execution succeeds perfectly fine". It seemed to be mysterious when I uncommented the " $self->throw("$executable call crashed: $? $! $commandstring\n") unless ($status==0) ;" line, the blastall runs. The only difference from what Matthew saw is that, when I did not uncomment the line, blastall DID NOT run.

Yes, Matthew is one of the authors of PSORTB and I spent a bit of time
last year trying to fix this problem (unsuccessfully). The PSORTB docs
http://www.psort.org/downloads/index.html
explain how to get around this problem just as Guojun describes. I use
a custom BioPerl installation just for PSORTB!

 I was under the impression it was already filed as a bug, but my
searching indicates this is not so.

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University


From cjfields at uiuc.edu  Fri Aug 10 18:04:20 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 10 Aug 2007 17:04:20 -0500
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
	from StandAloneBlast
In-Reply-To: <a79f6a4b0708101448x421736c1m6f3f5ff6d851a68c@mail.gmail.com>
References: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu>
	<20070810152336.898c3979@dogwood.plantbio.uga.edu>
	<a79f6a4b0708101448x421736c1m6f3f5ff6d851a68c@mail.gmail.com>
Message-ID: <41A08079-6EEC-4B62-8104-C41E70C03083@uiuc.edu>


On Aug 10, 2007, at 4:48 PM, Torsten Seemann wrote:

>> Interestingly, I found the message in bioperl-l from Matthew Laird  
>> 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast  
>> DOES run.  If one comments out this line in StandAloneBlast.pm,  
>> the execution succeeds perfectly fine". It seemed to be mysterious  
>> when I uncommented the " $self->throw("$executable call crashed:  
>> $? $! $commandstring\n") unless ($status==0) ;" line, the blastall  
>> runs. The only difference from what Matthew saw is that, when I  
>> did not uncomment the line, blastall DID NOT run.
>
> Yes, Matthew is one of the authors of PSORTB and I spent a bit of time
> last year trying to fix this problem (unsuccessfully). The PSORTB docs
> http://www.psort.org/downloads/index.html
> explain how to get around this problem just as Guojun describes. I use
> a custom BioPerl installation just for PSORTB!
>
>  I was under the impression it was already filed as a bug, but my
> searching indicates this is not so.
>
> -- 
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Might be wise to go ahead and add it to bugzilla so we can track it,  
along with the workaround.
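
For context when writing up the bug: the line being discussed throws unless
the status returned by the blastall call is 0. As a general illustration of
how Perl reports a child's status (this is not the StandAloneBlast code, and
describe_status is a made-up helper), a sketch:

```perl
use strict;
use warnings;

# Hypothetical helper: decode a wait status as returned by system()
# (the same value is left in $?).
sub describe_status {
    my ($status) = @_;
    return 'failed to execute' if $status == -1;   # command never started
    my $signal = $status & 127;                     # low 7 bits: signal number
    return "killed by signal $signal" if $signal;
    return 'exit code ' . ($status >> 8);           # high byte: exit code
}

# Run a child that exits with code 3, using the current perl binary ($^X).
my $status = system($^X, '-e', 'exit 3');
print "child: ", describe_status($status), "\n";
```

A status of -1 means the command could not be started at all, which is a
different failure from blastall running and returning non-zero; including
the decoded value in the report might help narrow things down.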

chris


From neetisomaiya at gmail.com  Mon Aug 13 06:29:39 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 13 Aug 2007 15:59:39 +0530
Subject: [Bioperl-l] Homologene parser?
Message-ID: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>

Hi,

Does anyone know of any Homologene parser, if available?
Please let me know.

Thanks and Regards,
Neeti.


-- 
-Neeti
Even my blood says, B positive


From shameer at ncbs.res.in  Mon Aug 13 07:07:45 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 16:37:45 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and add
 direction to SeqFeature
In-Reply-To: <6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
Message-ID: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>

Dear All,

I am generating images based on transcription factor binding site data
using the Bio::Graphics module.
I created my images using the program: version-2
[http://stein.cshl.org/genome_informatics/BioGraphics/] (courtesy: L.
Stein). I am attaching one of the images with this mail.

I need to make 3 changes to this image:

1. To color the 'scale'
Color the scale in two different colors, i.e. from start 1.0k - color blue,
from 101 - till end of the scale green (I thoroughly checked the
Bio::Graphics documentation; I couldn't find an option to do this).

2. To sort the transcription factors based on the z_score

3. To give forward/reverse [> or <] direction for the black boxes

I would appreciate it if anyone can give me some clues/links to accomplish
this :).
Thanks in advance,
Shameer

-- 
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in
-------------- next part --------------
A non-text attachment was scrubbed...
Name: TF_top3.png
Type: image/png
Size: 2188 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070813/6a4423bd/attachment-0002.png>

From bix at sendu.me.uk  Mon Aug 13 09:11:50 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 14:11:50 +0100
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>	<1178028249.2644.13.camel@localhost.localdomain>	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
Message-ID: <46C05896.1010002@sendu.me.uk>

Shameer Khadar wrote:
> Dear All,
> 
> I am generating images based on Transcription Factor binding site data
> using bio::graphics module.
> I created my images using program : version-2 
> [http://stein.cshl.org/genome_informatics/BioGraphics/] (Courtsey : L.
> Stein ). I attaching one of the image with this mail.
> 
> I need to make 3 changes to this image
> 
> 1. to color the 'scale'
> Color the scale in two different colors ie, from start 1.0k - color blue
> from 101 - till end of the scale green (I thoroghly checked the
> Bio::Graphics document, I couldnt find an option to do this )

The scale is just a scale and shouldn't need colouring. You can do what 
you want by having a blue 'upstream' feature and a green 'gene' feature 
in the first row.


> 2. to sort the Transcription factors based on the z_score

I don't know Bio::Graphics well enough, but am interested in the answer...


> 3. to give forward/reverse [> or < ]direction for the black boxes

Presumably you just change the glyph type of your binding sites to 
something that shows direction, like 'processed_transcript'. Someone 
else may have a more appropriate suggestion.

However, do your binding sites really have a direction? That is, do you 
really know which strand your transcription factor bound to?


From cjfields at uiuc.edu  Mon Aug 13 10:39:11 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 09:39:11 -0500
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
	add direction to SeqFeature
In-Reply-To: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
Message-ID: <871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>


On Aug 13, 2007, at 6:07 AM, Shameer Khadar wrote:

> Dear All,
>
> I am generating images based on Transcription Factor binding site data
> using bio::graphics module.
> I created my images using program : version-2
> [http://stein.cshl.org/genome_informatics/BioGraphics/] (Courtsey : L.
> Stein ). I attaching one of the image with this mail.
>
> I need to make 3 changes to this image
>
> 1. to color the 'scale'
> Color the scale in two different colors ie, from start 1.0k - color  
> blue
> from 101 - till end of the scale green (I thoroghly checked the
> Bio::Graphics document, I couldnt find an option to do this )

Much of the documentation you need is available via 'perldoc  
Bio::Graphics::Panel' and the various Bio::Graphics::Glyph classes.   
The above may be possible using two seqfeatures instead of one or  
maybe a split location with a callback (not sure, haven't tried  
either, mileage may vary, batteries not included, warranty void if  
packaging is opened, etc).  Might be worth checking out the POD for  
the arrow glyph to see what's possible.

> 2. to sort the Transcription factors based on the z_score

In Bio::Graphics::Panel POD under 'Glyph Options', there is  
documentation for 'sort_order' which accepts callbacks.  According to  
the docs you would basically do something like the following (the  
prototype is required; note the score):

   -sort_order => sub ($$) {
     my ($glyph1,$glyph2) = @_;
     my $a = $glyph1->feature;
     my $b = $glyph2->feature;
     ( $b->score/log($b->length)
           <=>
       $a->score/log($a->length) )
           ||
     ( $a->start <=> $b->start )
   }

Again, haven't tried.
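
If the z_score is stored as the feature's score (an assumption; adjust the
accessor if it lives in a tag instead), the callback can be reduced to a
plain descending sort. The sketch below uses small stand-in classes
(MockFeature and MockGlyph are made up here) so the comparison can be tried
without Bio::Graphics installed:

```perl
use strict;
use warnings;

# Stand-in for a sequence feature; only the accessors the callback needs.
package MockFeature;
sub new          { my ($class, %args) = @_; return bless {%args}, $class }
sub score        { $_[0]{score} }
sub start        { $_[0]{start} }
sub display_name { $_[0]{name} }

# Stand-in for a glyph, which wraps a feature.
package MockGlyph;
sub new     { my ($class, $feature) = @_; return bless { feature => $feature }, $class }
sub feature { $_[0]{feature} }

package main;

# Comparison callback: highest score first, ties broken by start position.
# The ($$) prototype makes sort pass the two elements in @_.
my $by_score = sub ($$) {
    my ($glyph1, $glyph2) = @_;
    my $f1 = $glyph1->feature;
    my $f2 = $glyph2->feature;
    return ( $f2->score <=> $f1->score ) || ( $f1->start <=> $f2->start );
};

# Illustrative data only.
my @glyphs = map { MockGlyph->new( MockFeature->new(%$_) ) } (
    { name => 'TF_A', score => 2.1, start => 40 },
    { name => 'TF_B', score => 5.7, start => 10 },
    { name => 'TF_C', score => 2.1, start => 15 },
);

my @sorted = sort $by_score @glyphs;
print join( ', ', map { $_->feature->display_name } @sorted ), "\n";
```

The same anonymous sub could then be passed as the -sort_order value of the
track, provided the real features answer to score and start.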

> 3. to give forward/reverse [> or < ]direction for the black boxes

I think you first need to ensure the glyph will accept strandedness,  
though I think most do.  Then you would set either the 'strand_arrow'  
or 'stranded' option to 1 (they are synonyms).  Again, see  
Bio::Graphics::Panel POD under Glyph Options, specifically the  
parameter 'stranded' or 'strand_arrow'.

> I would appreaciate if any one can give me some clues/link to  
> accomplish
> this :).
> thanks in advance ,
> Shameer

No problem!

chris

> -- 
> Shameer Khadar
> Lab (# 25) The Computational Biology Group
> National Centre for Biological Sciences (TIFR)
> GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
> T - 91-080-23666001 EXT - 6251
> W - http://www.ncbs.res.in

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From shameer at ncbs.res.in  Mon Aug 13 10:47:35 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 20:17:35 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <46C05896.1010002@sendu.me.uk>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
Message-ID: <59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>

Dear Sendu,

Thanks for your reply.

>> I need to make 3 changes to this image
>>
>> 1. to color the 'scale'
>> Color the scale in two different colors ie, from start 1.0k - color blue
>> from 101 - till end of the scale green (I thoroghly checked the
>> Bio::Graphics document, I couldnt find an option to do this )
>
> The scale is just a scale and shouldn't need colouring. You can do what
> you want by having a blue 'upstream' feature and a green 'gene' feature
> in the first row.
Thanks for the point: 'The scale is just a scale...'.
But my idea is to divide the scale into three, to differentiate between
the 100bp upstream region, the UTR and the gene start site. From the
starting point of the scale till 0k is the 100bp upstream region. From 0k
till the end of the current scale is the UTR; from the end of the scale the
gene starts. Since this is a bit tough to distinguish, we thought of this
coloring option. Adding an extra track is an alternative option (I tried to
convince our experimental team by adding an extra track, but they want it
this way :(..)

>
>> 2. to sort the Transcription factors based on the z_score
> I don't know Bio::Graphics well enough, but am interested in the answer...
>
It should be possible, since a sort_order option is available. I tried it a
couple of times, but it is not working.

>
>> 3. to give forward/reverse [> or < ]direction for the black boxes
>
> Presumably you just change the glyph type of your binding sites to
> something that shows direction, like 'processed_transcript'. Someone
> else may have a more appropriate suggestion.
Thanks, I will look into it.

>
> However, do your binding sites really have a direction? That is, do you
> really know which strand your transcription factor bound to?
Yes, we collated this info from various experimental datasets.

-- 
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From bix at sendu.me.uk  Mon Aug 13 11:01:43 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 16:01:43 +0100
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
Message-ID: <46C07257.1000308@sendu.me.uk>

Shameer Khadar wrote:
>> However, do your binding sites really have a direction? That is, do you
>> really know which strand your transcription factor bound to?
 >
> Yes, these info we collated from various experimental datasets.

Well, those datasets I'd like to see... What I was getting at is the 
strand probably isn't known at the experimental level, but to describe 
the site a strand has to be arbitrarily picked so you can write the 
sequence of the site down as a single string. It's probably the case that 
the strand information you have is just the way it happened to be 
reported in the literature and has no biological meaning.


From shameer at ncbs.res.in  Mon Aug 13 11:16:33 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 20:46:33 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>
Message-ID: <42833.192.168.1.1.1187018193.squirrel@mail.ncbs.res.in>

Chris,

Thanks for your detailed reply.
I will read up on the docs and try different options using your code
snippets as a starting point. I will get back to the list with my results.

Thanks
-- 
Shameer

>
> On Aug 13, 2007, at 6:07 AM, Shameer Khadar wrote:
>
>> Dear All,
>>
>> I am generating images based on Transcription Factor binding site data
>> using bio::graphics module.
>> I created my images using program : version-2
>> [http://stein.cshl.org/genome_informatics/BioGraphics/] (Courtsey : L.
>> Stein ). I attaching one of the image with this mail.
>>
>> I need to make 3 changes to this image
>>
>> 1. to color the 'scale'
>> Color the scale in two different colors ie, from start 1.0k - color
>> blue
>> from 101 - till end of the scale green (I thoroghly checked the
>> Bio::Graphics document, I couldnt find an option to do this )
>
> Much of the documentation you need is available via 'perldoc
> Bio::Graphics::Panel' and the various Bio::Graphics::Glyph classes.
> The above may be possible using two seqfeatures instead of one or
> maybe a split location with a callback (not sure, haven't tried
> either, mileage may vary, batteries not included, warranty void if
> packaging is opened, etc).  Might be worth checking out the POD for
> the arrow glyph to see what's possible.
>
>> 2. to sort the Transcription factors based on the z_score
>
> In Bio::Graphics::Panel POD under 'Glyph Options', there is
> documentation for 'sort_order' which accepts callbacks.  According to
> the docs you would basically do something like the following (the
> prototype is required; note the score):
>
>    -sort_order => sub ($$) {
>      my ($glyph1,$glyph2) = @_;
>      my $a = $glyph1->feature;
>      my $b = $glyph2->feature;
>      ( $b->score/log($b->length)
>            <=>
>        $a->score/log($a->length) )
>            ||
>      ( $a->start <=> $b->start )
>    }
>
> Again, haven't tried.
>
>> 3. to give forward/reverse [> or < ]direction for the black boxes
>
> I think you first need to ensure the glyph will accept strandedness,
> though I think most do.  Then you would set either the 'strand_arrow'
> or 'stranded' option to 1 (they are synonyms).  Again, see
> Bio::Graphics::Panel POD under Glyph Options, specifically the
> parameter 'stranded' or 'strand_arrow'.
>


-- 
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From bix at sendu.me.uk  Mon Aug 13 11:47:10 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 16:47:10 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708101009k4734fe45i1dcd29a5e20af834@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>	
	<46BB93DC.9010608@sendu.me.uk>	
	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>	
	<46BC16A9.7090709@sendu.me.uk>
	<aad3caa30708101009k4734fe45i1dcd29a5e20af834@mail.gmail.com>
Message-ID: <46C07CFE.7020105@sendu.me.uk>

hari jayaram wrote:
> Hey all ,
> Thanks for your help. Its working real well now.
[snip]
> I am thinking I will screencast this process for everyones benefit and 
> put it up on bioscreencast.com <http://bioscreencast.com> . If that will 
> be useful for others.

I'm certain it will. That's a very interesting website. Thanks for 
taking the time, and I hope you find Bioperl useful.


From cjfields at uiuc.edu  Mon Aug 13 12:24:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 11:24:15 -0500
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
	add direction to SeqFeature
In-Reply-To: <46C07257.1000308@sendu.me.uk>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
Message-ID: <A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>


On Aug 13, 2007, at 10:01 AM, Sendu Bala wrote:

> Shameer Khadar wrote:
>>> However, do your binding sites really have a direction? That is,  
>>> do you
>>> really know which strand your transcription factor bound to?
>>
>> Yes, these info we collated from various experimental datasets.
>
> Well, those datasets I'd like to see... What I was getting at is the
> strand probably isn't known at the experimental level, but to describe
> the site a strand has to be arbitrarily picked so you can write the
> sequence of the site down as a single string. Its probably the case  
> that
> the strand information you have is just the way it happened to be
> reported in the literature and has no biological meaning.

It's subjective.  I can think of several cases where strandedness
does matter and has meaning, for instance if the motif is related to how
the gene is transcribed or post-transcriptionally regulated: elements
which indicate start of transcription (-10/-35 or any
sigma-factor-related promoter element in prokaryotes), end of
transcription (poly-A signal, transcription terminators), modulation of
translation (SECIS, IRES), or conserved DNA motifs which are transcribed
prior to regulation (bound by RNA-binding proteins, like the IRE).

chris


From amacgregor at ccg.murdoch.edu.au  Mon Aug 13 20:52:10 2007
From: amacgregor at ccg.murdoch.edu.au (Andrew Macgregor)
Date: Tue, 14 Aug 2007 08:52:10 +0800
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
Message-ID: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>

On 13/08/2007, at 6:29 PM, neeti somaiya wrote:

> Hi,
>
> Does anyone know of any Homologene parser, if available?
> Please let me know.
>
> Thanks and Regards,
> Neeti.

Hi Neeti,

Quite a long time ago now I wrote a Homologene parser and posted it
to the mailing list:

<http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>

I don't know if this still works but you could use it as a starting  
point. There may also be something newer out there too, I don't know.  
If you search the mailing list archives you'll get a few messages  
around the topic.

Cheers, Andrew.


Andrew Macgregor
Centre for Comparative Genomics, Murdoch University
Email: amacgregor at ccg.murdoch.edu.au
Tel: (08) 9360 2961


From cjfields at uiuc.edu  Mon Aug 13 23:21:54 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 22:21:54 -0500
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
Message-ID: <4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>

It looks like Heikki responded and thought a good place for it would  
be Bio::SeqIO, but it didn't go anywhere I suppose.  I see that a few  
other posts suggest it could be placed in Bio::Cluster as well which  
I'm not familiar with.  We could add it in if you were still  
interested, just need to find a good place for it; might be nice to  
have a Parse::RecDescent-based parser.

chris

On Aug 13, 2007, at 7:52 PM, Andrew Macgregor wrote:

> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>
>> Hi,
>>
>> Does anyone know of any Homologene parser, if available?
>> Please let me know.
>>
>> Thanks and Regards,
>> Neeti.
>
> Hi Neeti,
>
> Quite a long time ago now I wrote an Homologene parser and posted it
> to the mailing list:
>
> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
>
> I don't know if this still works but you could use it as a starting
> point. There may also be something newer out there too, I don't know.
> If you search the mailing list archives you'll get a few messages
> around the topic.
>
> Cheers, Andrew.
>
>
> Andrew Macgregor
> Centre for Comparative Genomics, Murdoch University
> Email: amacgregor at ccg.murdoch.edu.au
> Tel: (08) 9360 2961

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Tue Aug 14 03:46:19 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 14 Aug 2007 08:46:19 +0100
Subject: [Bioperl-l] Warnings/errors generated by Eclipse
Message-ID: <46C15DCB.80603@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I've just been setting up Eclipse with the EPIC plugin, and it's
generating some errors and warnings about bioperl-live that I'd like to
pass by you.

I think most of the errors are along the lines of:
"Can't find 'build_params' in _build in
/usr/local/share/perl/5.8.8/Module/Build/Base.pm line 1011"

This occurs with files like:
t/Biblio_biofetch.t
t/seqread_fail.t

I think it's to do with the parameters passed to test_begin() or it
could be my setup of Eclipse?

Other highlighted problems are some of the scripts in the examples dir.
Some require modules that reside in the bioperl-run package. Would it be
wise to move these to the bioperl-run examples dir?

There may also be some problems with XML files in t/data e.g.
t/data/interpro_ebi.xml
There appears to be a typo on line 2. However, I'm not sure this is
up-to-date? I can comment on the others later if required.

Cheers
Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGwV3KczuW2jkwy2gRApM/AJ9abWl02CAJqDK2sEXEUEg8nGRC4ACdHcAb
nZmh+1dmtc1W9mThkUVKitw=
=5eXZ
-----END PGP SIGNATURE-----


From amacgregor at ccg.murdoch.edu.au  Tue Aug 14 01:14:58 2007
From: amacgregor at ccg.murdoch.edu.au (Andrew Macgregor)
Date: Tue, 14 Aug 2007 13:14:58 +0800
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
Message-ID: <C762C291-D3D2-4CBC-B5EC-6B6E4935A004@ccg.murdoch.edu.au>

On 14/08/2007, at 11:21 AM, Chris Fields wrote:

> It looks like Heikki responded and thought a good place for it  
> would be Bio::SeqIO, but it didn't go anywhere I suppose.  I see  
> that a few other posts suggest it could be placed in Bio::Cluster  
> as well which I'm not familiar with.  We could add it in if you  
> were still interested, just need to find a good place for it; might  
> be nice to have a Parse::RecDescent-based parser.
>
> chris
>

Hi Chris,

I was also doing some parsing of UniGene at the time but found  
RecDescent was too slow and went back to regexes. That code found
its way into Bio::Cluster. Occasionally I see a message with someone
looking for a Homologene parser but not very often, so I'm not sure  
it is worth the effort of moving the code into bioperl.

Cheers, Andrew.


From neetisomaiya at gmail.com  Tue Aug 14 09:24:07 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Tue, 14 Aug 2007 18:54:07 +0530
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
Message-ID: <764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>

Hi Andrew,

I think the Homologene data files on the FTP site have changed from what
you had used.
They are now homologene.data and homologene.xml.
I tried using your parser, but because it was written for the file
hmlg.trip.ftp, it doesn't work anymore.

I came across a parser at
http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
I am looking at it to see if it works for me. Not sure if it will.
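
For reference, a rough sketch of reading homologene.data directly might look
like the following. It assumes the file is tab-delimited with six columns
(HID, taxonomy ID, gene ID, gene symbol, protein GI, protein accession),
which should be checked against the README on the FTP site;
parse_homologene is a made-up name and the two sample rows are only
illustrative.

```perl
use strict;
use warnings;

# Hypothetical parser: read tab-delimited rows from a filehandle and
# group them by HomoloGene ID (first column). The column layout is an
# assumption, not taken from the official documentation.
sub parse_homologene {
    my ($fh) = @_;
    my %groups;
    while ( my $line = <$fh> ) {
        $line =~ s/\r?\n\z//;              # strip Unix or DOS line endings
        next if $line =~ /^\s*$/;          # skip blank lines
        my ( $hid, $taxid, $gene_id, $symbol, $gi, $acc ) = split /\t/, $line;
        push @{ $groups{$hid} }, {
            taxid       => $taxid,
            gene_id     => $gene_id,
            symbol      => $symbol,
            protein_gi  => $gi,
            protein_acc => $acc,
        };
    }
    return \%groups;
}

# Illustrative rows only; read from an in-memory string instead of a file.
my $sample = join "\n",
    join( "\t", 3, 9606,  34,    'ACADM', 4557231, 'NP_000007.1' ),
    join( "\t", 3, 10090, 11364, 'Acadm', 6680618, 'NP_031408.1' ),
    '';
open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my $groups = parse_homologene($fh);
close $fh;

for my $hid ( sort { $a <=> $b } keys %$groups ) {
    my @members = map { "$_->{symbol} (taxid $_->{taxid})" } @{ $groups->{$hid} };
    print "HID $hid: ", join( ', ', @members ), "\n";
}
```

For the real file, the in-memory open would be replaced by opening
homologene.data itself.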

~Neeti.

On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
>
> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>
> > Hi,
> >
> > Does anyone know of any Homologene parser, if available?
> > Please let me know.
> >
> > Thanks and Regards,
> > Neeti.
>
> Hi Neeti,
>
> Quite a long time ago now I wrote an Homologene parser and posted it
> to the mailing list:
>
> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
>
> I don't know if this still works but you could use it as a starting
> point. There may also be something newer out there too, I don't know.
> If you search the mailing list archives you'll get a few messages
> around the topic.
>
> Cheers, Andrew.
>
>
> Andrew Macgregor
> Centre for Comparative Genomics, Murdoch University
> Email: amacgregor at ccg.murdoch.edu.au
> Tel: (08) 9360 2961
>
>
>
>


-- 
-Neeti
Even my blood says, B positive


From bix at sendu.me.uk  Tue Aug 14 10:57:29 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 14 Aug 2007 15:57:29 +0100
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
	columns?
Message-ID: <46C1C2D9.6050409@sendu.me.uk>

I'm looking at what looks like a pretty major bug in Bio::SimpleAlign, 
but before I commit the fix I wanted to check my sanity/understanding.

My understanding is that an alignment may be built from just sub-parts 
of a number of sequences. So you give each sequence in the alignment a 
start and stop so you can later map back the aligned region to the 
original sequence. So, for example, the following should all pass:

diff -r1.56 SimpleAlign.t
459a460,540
 >
 >
 > # is _remove_col really working correctly?
 > my $a = Bio::LocatableSeq->new(-id => 'a', -seq => 'atcgatcgatcgatcg', -start => 5, -end => 20);
 > my $b = Bio::LocatableSeq->new(-id => 'b', -seq => '-tcgatc-atcgatcg', -start => 30, -end => 43);
 > my $c = Bio::LocatableSeq->new(-id => 'c', -seq => 'atcgatcgatc-atc-', -start => 50, -end => 63);
 > my $d = Bio::LocatableSeq->new(-id => 'd', -seq => '--cgatcgatcgat--', -start => 80, -end => 91);
 > my $e = Bio::LocatableSeq->new(-id => 'e', -seq => '-t-gatcgatcga-c-', -start => 100, -end => 111);
 > $aln = Bio::SimpleAlign->new();
 > $aln->add_seq($a);
 > $aln->add_seq($b);
 > $aln->add_seq($c);
 >
 > my $gapless = $aln->remove_gaps();
 > foreach my $seq ($gapless->each_seq) {
 >       if ($seq->id eq 'a') {
 >               is $seq->start, 6;
 >               is $seq->end, 19;
 >               is $seq->seq, 'tcgatcatcatc';
 >       }
 >       elsif ($seq->id eq 'b') {
 >               is $seq->start, 30;
 >               is $seq->end, 42;
 >               is $seq->seq, 'tcgatcatcatc';
 >       }
 >       elsif ($seq->id eq 'c') {
 >               is $seq->start, 51;
 >               is $seq->end, 63;
 >               is $seq->seq, 'tcgatcatcatc';
 >       }
 > }
 >
 > $aln->add_seq($d);
 > $aln->add_seq($e);
 > $gapless = $aln->remove_gaps();
 > foreach my $seq ($gapless->each_seq) {
 >       if ($seq->id eq 'a') {
 >               is $seq->start, 8;
 >               is $seq->end, 17;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'b') {
 >               is $seq->start, 32;
 >               is $seq->end, 40;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'c') {
 >               is $seq->start, 53;
 >               is $seq->end, 61;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'd') {
 >               is $seq->start, 81;
 >               is $seq->end, 90;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'e') {
 >               is $seq->start, 101;
 >               is $seq->end, 110;
 >               is $seq->seq, 'gatcatca';
 >       }
 > }
 >
 > my $f = Bio::LocatableSeq->new(-id => 'f', -seq => 'a-cgatcgatcgat-g', -start => 30, -end => 43);
 > $aln = Bio::SimpleAlign->new();
 > $aln->add_seq($a);
 > $aln->add_seq($f);
 >
 > $gapless = $aln->remove_gaps();
 > foreach my $seq ($gapless->each_seq) {
 >       if ($seq->id eq 'a') {
 >               is $seq->start, 5;
 >               is $seq->end, 20;
 >               is $seq->seq, 'acgatcgatcgatg';
 >       }
 >       elsif ($seq->id eq 'f') {
 >               is $seq->start, 30;
 >               is $seq->end, 43;
 >               is $seq->seq, 'acgatcgatcgatg';
 >       }
 > }


But they don't. Once you remove certain columns the start and stop of 
the sequences in the alignment are no longer correct coordinates for the 
sub-sequence in the original sequence.
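
To make the expected numbers concrete, here is a minimal sketch in plain Perl (no BioPerl needed) of the mapping the tests assume: a column is kept only if no sequence has a gap in it, and the new start/end of each sequence are the original positions of its first and last kept residues. The names gapless_columns and coords_after_removal are illustrative only and are not part of Bio::SimpleAlign.

```perl
use strict;
use warnings;

# Columns (0-based) in which no aligned sequence has a gap character.
sub gapless_columns {
    my ($gap, @aligned) = @_;
    return grep {
        my $col = $_;
        !grep { substr($_, $col, 1) eq $gap } @aligned;
    } 0 .. length($aligned[0]) - 1;
}

# Given one aligned string, the 1-based position of its first residue in
# the original sequence, and the kept columns (ascending), return the
# new start, the new end and the remaining aligned string.
sub coords_after_removal {
    my ($aligned, $start, $gap, @keep) = @_;
    my ($pos, %pos_at) = ($start - 1);
    for my $col (0 .. length($aligned) - 1) {
        next if substr($aligned, $col, 1) eq $gap;
        $pos_at{$col} = ++$pos;    # original position of this residue
    }
    my @kept = grep { exists $pos_at{$_} } @keep;
    return unless @kept;           # nothing left of this sequence
    my $seq = join '', map { substr($aligned, $_, 1) } @kept;
    return ($pos_at{ $kept[0] }, $pos_at{ $kept[-1] }, $seq);
}

my @aln = (
    [ a => 'atcgatcgatcgatcg', 5  ],
    [ b => '-tcgatc-atcgatcg', 30 ],
    [ c => 'atcgatcgatc-atc-', 50 ],
);
my @keep = gapless_columns('-', map { $_->[1] } @aln);
for my $row (@aln) {
    my ($id, $aligned, $start) = @$row;
    my ($new_start, $new_end, $seq)
        = coords_after_removal($aligned, $start, '-', @keep);
    print "$id/$new_start-$new_end  $seq\n";
}
```

For the three-sequence case this prints a/6-19, b/30-42 and c/51-63, each with 'tcgatcatcatc', which are the values the first block of tests checks for.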

I propose the following patch to resolve this issue:

diff -r1.136 SimpleAlign.pm
1116c1116,1118
<
---
 >
 >     my $gap = $self->gap_char;
 >
1129,1137c1131,1147
<             my $spliced;
<             $spliced .= $start > 0 ? substr($sequence,0,$start) : '';
<             $spliced .= substr($sequence,$end+1,$seq->length-$end+1);
<             $sequence = $spliced;
<             if ($start == 1) {
<               $new_seq->start($end);
<             }
<             else {
<               $new_seq->start( $seq->start);
---
 >             my $orig = $sequence;
 >             my $head =  $start > 0 ? substr($sequence, 0, $start) : '';
 >             my $tail = ($end + 1) >= length($sequence) ? '' : substr($sequence, $end + 1);
 >             $sequence = $head.$tail;
 >             # start
 >             unless (defined $new_seq->start) {
 >                 if ($start == 0) {
 >                     my $start_adjust = () = substr($orig, 0, $end + 1) =~ /$gap/g;
 >                     $new_seq->start($seq->start + $end + 1 - $start_adjust);
 >                 }
 >                 else {
 >                     my $start_adjust = $orig =~ /$gap+/;
 >                     if ($start_adjust) {
 >                         $start_adjust = $+[0] - 1 < $start;
 >                     }
 >                     $new_seq->start($seq->start + $start_adjust);
 >                 }
1140,1141c1150,1152
<             if($end >= $seq->end){
<              $new_seq->end( $start);
---
 >             if (($end + 1) >= length($orig)) {
 >                 my $end_adjust = () = substr($orig, $start) =~ /$gap/g;
 >                 $new_seq->end($seq->end - (length($orig) - $start) + $end_adjust);
1144c1155
<              $new_seq->end($seq->end);
---
 >                 $new_seq->end($seq->end);
1148c1159
<                 push @new, $new_seq;
---
 >               push @new, $new_seq;
1207,1209c1218,1234
<       # sort the positions to remove columns at the end 1st
<       @$positions = sort { $b->[0] <=> $a->[0] } @$positions;
<       $aln = $self->_remove_col($aln,$positions);
---
 >       # sort the positions
 >       @$positions = sort { $a->[0] <=> $b->[0] } @$positions;
 >
 >     my @remove;
 >     my $length = 0;
 >     foreach my $pos (@{$positions}) {
 >         my ($start, $end) = @{$pos};
 >
 >         #have to offset the start and end for subsequent removes
 >         $start-=$length;
 >         $end  -=$length;
 >         $length += ($end-$start+1);
 >         push @remove, [$start,$end];
 >     }
 >
 >     #remove the segments
 >     $aln = $#remove >= 0 ? $self->_remove_col($aln,\@remove) : $self;

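The last hunk of the patch sorts the column ranges in ascending order and shifts each one left by the number of columns already removed. A minimal stand-alone sketch of that bookkeeping, applied to a plain string rather than an alignment, might look like this. The helper names offset_ranges and remove_ranges are hypothetical, and the sketch assumes the ranges are 0-based, inclusive and non-overlapping.

```perl
use strict;
use warnings;

# Sort [start, end] ranges ascending and shift each one left by the
# number of columns removed by the ranges before it.
sub offset_ranges {
    my @ranges = sort { $a->[0] <=> $b->[0] } @_;
    my ($removed, @adjusted) = (0);
    for my $range (@ranges) {
        my ($start, $end) = @$range;
        push @adjusted, [ $start - $removed, $end - $removed ];
        $removed += $end - $start + 1;
    }
    return @adjusted;
}

# Remove the ranges from a string one after another, front to back.
sub remove_ranges {
    my ($string, @ranges) = @_;
    for my $range (offset_ranges(@ranges)) {
        my ($start, $end) = @$range;
        substr($string, $start, $end - $start + 1, '');
    }
    return $string;
}

# Columns 0, 7, 11 and 15 are the gap columns of the a/b/c example.
print remove_ranges('atcgatcgatcgatcg', [15, 15], [0, 0], [11, 11], [7, 7]), "\n";
```

Working front to back with adjusted offsets gives the same string as removing from the end first; the difference is that each removal can then be seen in left-to-right order, which is what the start adjustment in the patch relies on.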

This breaks 2 tests in SimpleAlign.t, but as far as I can tell, those 
tests expect the wrong answer. Changed to expect the correct answer, 
SimpleAlign.t and all other tests in the test suite pass.

diff -r1.56 SimpleAlign.t
214,215c214,215
<       "P84139/1-33              NEGEHQIKLDELFEKLLRARLIFKNKDVLRRC\n".
<       "P814153/1-33             NEGMHQIKLDVLFEKLLRARLIFKNKDVLRRC\n".
---
 >       "P84139/2-33              NEGEHQIKLDELFEKLLRARLIFKNKDVLRRC\n".
 >       "P814153/2-33             NEGMHQIKLDVLFEKLLRARLIFKNKDVLRRC\n".
229c229
<       "gb|443893|124775/1-32    -RFRIKVPPAVEGARPALLIFKSRPELGC\n",
---
 >       "gb|443893|124775/2-32    -RFRIKVPPAVEGARPALLIFKSRPELGC\n",


Can someone triple-check my thinking and report back please?

Cheers,
Sendu.


From basu at pharm.sunysb.edu  Tue Aug 14 11:02:06 2007
From: basu at pharm.sunysb.edu (Siddhartha Basu)
Date: Tue, 14 Aug 2007 11:02:06 -0400
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
Message-ID: <46C1C3EE.4030006@pharm.sunysb.edu>

neeti somaiya wrote:
> Hi Andrew,
> 
> I think the homologene data files have changed now on the ftp, from what you
> had used.
> It is now homologene.data and homologene.xml.
> I tried using your parser, but because it was written on the file
> hmlg.trip.ftp, it doesnt work anymore.
> 
> I came across a parser
> http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
> .
> I am looking at it to see if it works for me. NOt sure if it will.
> 
> ~Neeti.

Hi Neeti,
I have recently written a parser for HomoloGene XML data, specific to
my purpose. I am not sure whether it will suit your purpose, but it could
be extended for general-purpose parsing, so I am putting it forward.
Here is how it works:

* It only parses a single HomoloGene entry, <HG-Entry>.....</HG-Entry>.
* It does SAX-based parsing (currently using XML::SAX::ExpatXS).
* It returns a graph object (using Perl's Graph module) in which each
node is a homologue entry with its corresponding Entrez Gene ID. Each
node also contains the following attributes:
	* RefSeq protein ID
	* Protein ID (pid)
	* NCBI taxon ID
* The edge attribute contains information about the ortholog (true/false)
relationship between two nodes.
* The rest of the tags are currently not extracted. However, parsing
them should not be very difficult.

Generally I get the HomoloGene XML stream from an 'efetch' through
Bio::DB::EUtilities, feed it to the parser, get back a 'Graph' object, and
then work on it.

So, to make it more generic and able to work on a local file:

* We need another class that reads the chunk between
<HG-Entry>.....</HG-Entry> and sends it to the parser.
* Add support for most of the tags.
* Massage the data into a BioPerl-compatible object.
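
The first item in the list above (reading one <HG-Entry> chunk at a time from a local file) can be sketched with core Perl alone by setting the input record separator to the closing tag. This is only an illustration under stated assumptions: next_hg_entry is a hypothetical helper, the surrounding <HG-EntrySet> and the <HG-Entry_hg-id> element in the sample data are assumed for the example, and each returned chunk would still have to be handed to a real XML parser.

```perl
use strict;
use warnings;

# Return the next complete <HG-Entry>...</HG-Entry> chunk from a
# filehandle, or nothing once the input is exhausted.
sub next_hg_entry {
    my ($fh) = @_;
    local $/ = '</HG-Entry>';    # read up to and including the closing tag
    while (defined(my $chunk = <$fh>)) {
        my $open = index $chunk, '<HG-Entry>';
        # skip leading or trailing text that holds no complete entry
        next if $open < 0 || $chunk !~ m{</HG-Entry>\z};
        return substr $chunk, $open;
    }
    return;
}

# Sample data; the wrapper and child element names are assumptions.
my $xml = join "\n",
    '<HG-EntrySet>',
    '<HG-Entry><HG-Entry_hg-id>3</HG-Entry_hg-id></HG-Entry>',
    '<HG-Entry><HG-Entry_hg-id>5</HG-Entry_hg-id></HG-Entry>',
    '</HG-EntrySet>', '';

open my $fh, '<', \$xml or die "cannot open in-memory file: $!";
while (my $entry = next_hg_entry($fh)) {
    # in real use, pass $entry to the SAX parser here
    print length($entry), " characters in entry\n";
}
close $fh;
```

Reading by record separator keeps memory use to one entry at a time, which matters for a file the size of homologene.xml; the trade-off is that it relies on the closing tag appearing literally in the text.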

The first two I could work out, and for the last one I have to figure
out which BioPerl object would be suitable (like Bio::Cluster or
Bio::Network::Node/Edge).

Let me know if it sounds interesting and I will send you the code.

-siddhartha


From cjfields at uiuc.edu  Tue Aug 14 12:33:31 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 14 Aug 2007 11:33:31 -0500
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
	columns?
In-Reply-To: <46C1C2D9.6050409@sendu.me.uk>
References: <46C1C2D9.6050409@sendu.me.uk>
Message-ID: <B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>

Could you attach the scripts and patches to a bug report for tracking  
so anyone interested can double-check?  Having them in an email is  
problematic as the text in some clients wraps.

 From what I'm seeing I think we're in general agreement, though I'll  
reason through it to see if I'm following correctly.  The data in the  
SimpleAlign example you give is this:

a/5-20            atcgatcgatcgatcg
b/30-43           -tcgatc-atcgatcg
c/50-63           atcgatcgatc-atc-
                   ****** *** ***

Removing the gaps gives:

a/5-20            tcgatcatcatc
b/30-43           tcgatcatcatc
c/50-63           tcgatcatcatc
                  ************

The start/end is wrong, as you state.  Adjusting to map simple
start/ends to the original sequence won't work as we're removing gaps
and residues in the LocatableSeqs along with them (ends and internal
residues).  I guess if we want to map back to the original sequence
accurately we would have to use split locations (not currently
implemented with LocatableSeq) or maybe a cigar-like syntax against
consensus (ugh); otherwise we wouldn't know where to map the relevant
internal gaps (now missing from the alignment) w/o running a local
alignment against the original sequence:

a/6-11;13-15;17-19  tcgatcatcatc
b/30-38;40-42       tcgatcatcatc
c/51-56;58-63       tcgatcatcatc
                    ************

That could get really hairy for long alignments.  We could also  
return multiple SimpleAligns which map correctly (ugh), but what we  
really want (and the API specifies) is a new single SimpleAlign.
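
A rough sketch of how such split locations could be derived, in plain Perl: walk the aligned string, track the original-sequence position of every residue, and merge the positions of kept columns into contiguous segments. The function name split_location is hypothetical and nothing like it exists in LocatableSeq; the kept columns are the gapless columns of the a/b/c example.

```perl
use strict;
use warnings;

# Build a "start-end;start-end" string from the original-sequence
# positions of the residues that sit in kept (0-based) columns.
sub split_location {
    my ($aligned, $start, $gap, @keep) = @_;
    my %keep = map { $_ => 1 } @keep;
    my ($pos, @segments) = ($start - 1);
    for my $col (0 .. length($aligned) - 1) {
        next if substr($aligned, $col, 1) eq $gap;
        $pos++;                               # position of this residue
        next unless $keep{$col};
        if (@segments && $segments[-1][1] == $pos - 1) {
            $segments[-1][1] = $pos;          # extend the current segment
        }
        else {
            push @segments, [ $pos, $pos ];   # start a new segment
        }
    }
    return join ';', map { "$_->[0]-$_->[1]" } @segments;
}

my @keep = (1 .. 6, 8 .. 10, 12 .. 14);
print 'b/', split_location('-tcgatc-atcgatcg', 30, '-', @keep), "\n";
print 'c/', split_location('atcgatcgatc-atc-', 50, '-', @keep), "\n";
```

For b and c this prints b/30-38;40-42 and c/51-56;58-63. The number of segments grows with every removed column that falls inside a sequence, which is why this gets unwieldy for long alignments.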

It may come down to simply stating it 'voids the warranty' (so to
speak) when modifications are made to alignments which remove/insert
residues from LocatableSeqs via remove_gaps/remove_columns or
similar, and then either leaving things as is with relevant warnings or
readjusting start/end appropriately when LocatableSeq residues change.

gapless_a/1-12    tcgatcatcatc
gapless_b/1-12    tcgatcatcatc
gapless_c/1-12    tcgatcatcatc
                  ************

Not sure which is the best approach but anything would be better than  
giving an unexpectedly incorrect answer.

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Aug 14 13:13:30 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 14 Aug 2007 18:13:30 +0100
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
 columns?
In-Reply-To: <B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>
References: <46C1C2D9.6050409@sendu.me.uk>
	<B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>
Message-ID: <46C1E2BA.8060606@sendu.me.uk>

Chris Fields wrote:
> Could you attach the scripts and patches to a bug report for tracking
> so anyone interested can double-check?  Having them in an email is 
> problematic as the text in some clients wraps.

http://bugzilla.open-bio.org/show_bug.cgi?id=2344


> From what I'm seeing I think we're in general agreement, though I'll
>  reason through it to see if I'm following correctly.  The data in
> the SimpleAlign example you give is this:
> 
> a/5-20            atcgatcgatcgatcg
> b/30-43           -tcgatc-atcgatcg
> c/50-63           atcgatcgatc-atc-
>                    ****** *** ***
> 
> Removing the gaps gives:
> 
> a/5-20            tcgatcatcatc
> b/30-43           tcgatcatcatc
> c/50-63           tcgatcatcatc
>                   ************
> 
> The start/end is wrong, as you state.

Yes. For extra clarity, my thinking is that the correct answer is:

a/6-19            tcgatcatcatc
b/30-42           tcgatcatcatc
c/51-63           tcgatcatcatc
                  ************


> Adjusting to map simple start/ends to the original sequence won't
> work as we're removing gaps and residues in the LocatableSeqs along
> with it (ends and internal residues).  I guess if we want to map back
> to the original sequence accurately [snip]

What you say in the rest of your discussion is valid and deserves some 
thought/discussion, but for now just getting the start and end correct, 
ignoring any issues with internal residues, seems like a no-brainer.

For my own purposes that is all I need; having removed gaps I only need 
the start and end so I can take that region from each sequence and do a 
new alignment (for example).
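
As an illustration of that use case only, a corrected start/end pair is enough to pull the region back out of the original, ungapped sequence. The sketch below uses a plain string and a hypothetical helper named region; the flanking 'nnnn' residues are made up so that sequence 'a' occupies positions 5-20. With BioPerl sequence objects the subseq method should cover the same ground.

```perl
use strict;
use warnings;

# Return residues $start..$end (1-based, inclusive) of an ungapped
# sequence string, dying on an impossible range.
sub region {
    my ($sequence, $start, $end) = @_;
    die "bad range $start-$end\n"
        if $start < 1 || $end < $start || $end > length($sequence);
    return substr($sequence, $start - 1, $end - $start + 1);
}

# Hypothetical original sequence: 'a' from the example sits at 5-20.
my $original = 'nnnn' . 'atcgatcgatcgatcg' . 'nnnn';
print region($original, 6, 19), "\n";
```

Note that the extracted region still contains the internal residues that were dropped from the alignment, which is fine when the goal is to realign the region from scratch.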


BTW. Either my patch isn't quite perfect or there's another related bug 
I'm still tracking down. I'll commit when I've solved that, unless 
someone points out any mistakes in my thinking.


From cjfields at uiuc.edu  Tue Aug 14 13:19:59 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 14 Aug 2007 12:19:59 -0500
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
	columns?
In-Reply-To: <46C1E2BA.8060606@sendu.me.uk>
References: <46C1C2D9.6050409@sendu.me.uk>
	<B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>
	<46C1E2BA.8060606@sendu.me.uk>
Message-ID: <EE515FDC-2223-4D03-B819-3EA909539A61@uiuc.edu>


On Aug 14, 2007, at 12:13 PM, Sendu Bala wrote:
...

>
> Yes. For extra clarity, my thinking is that the correct answer is:
>
> a/6-19            tcgatcatcatc
> b/30-42           tcgatcatcatc
> c/51-63           tcgatcatcatc
>  ...
> What you say in the rest of your discussion is valid and deserves  
> some thought/discussion, but for now just getting the start and end  
> correct, ignoring any issues with internal residues, seems like a  
> no-brainer.
>
> For my own purposes that is all I need; having removed gaps I only  
> need the start and end so I can take that region from each sequence  
> and do a new alignment (for example).

It might be worth addressing the split location issue in the bug  
report before it gets lost in the ether.  Or maybe start a new one as  
an enhancement request.

> BTW. Either my patch isn't quite perfect or there's another related  
> bug I'm still tracking down. I'll commit when I've solved that,  
> unless someone points out any mistakes in my thinking.

Sounds fine by me.

chris


From gyang at plantbio.uga.edu  Tue Aug 14 15:01:07 2007
From: gyang at plantbio.uga.edu (Guojun Yang)
Date: Tue, 14 Aug 2007 15:01:07 -0400
Subject: [Bioperl-l] the most weird thing  I've seen, help please
In-Reply-To: 41A08079-6EEC-4B62-8104-C41E70C03083@uiuc.edu
Message-ID: <20070814190107.4834b14b@dogwood.plantbio.uga.edu>

Hi, all,  
I have two subroutines in my code. One runs a remote BLAST and the other a local BLAST. That works well.  
When I changed the remote BLAST to a local BLAST, I always get the following error. I downloaded the preformatted nt database from NCBI, and it works OK for both subroutines when I run blastall -p blastn.... from the command line. I changed the db name to 'nt' and 'nt.00', but the same error message was returned. The error says "program name was not given an argument", but I clearly gave it there. Can anybody help me? The code for the two subroutines is very similar:  
   
sub search {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  
my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"query");
my $len=$query->length();
@db=('nt.nal');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new( -program =>"blastn",
                                                         -database =>"$db",
                                                         -e =>"$_[1]");
    my $rc = $factory->blastall($query);  
......  
   
   
sub ancestor {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  
my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"test");
my $len=$query->length();
my $long_name=$_[1];
my $long_start=$_[2];
my $long_end=$_[3];
@db=('TNDB');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
                                                        -database => "$db",
                                                        -e => 1e-3,
                                                        );
    my $blast_report = $factory->blastall($query);

  
Thanks a lot!  
Guojun Yang  
Department of Plant Biology  
University of Georgia


From zhaodj at ioz.ac.cn  Wed Aug 15 04:05:36 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Wed, 15 Aug 2007 16:05:36 +0800 (CST)
Subject: [Bioperl-l] the most weird thing  I've seen, help please
In-Reply-To: <20070814190107.4834b14b@dogwood.plantbio.uga.edu>
References: <20070814190107.4834b14b@dogwood.plantbio.uga.edu>
Message-ID: <52820.159.226.67.49.1187165136.squirrel@mail.ioz.ac.cn>

Hi Guojun Yang,

I tested your code after modifying part of it, but I did not
encounter the error. The modified code follows (see below and the
attachment). It runs without any error on my Windows XP machine and
generates a file named lclblastResult.txt.

In the code I use the NCBI ecoli.nt database instead. Some
parameters are changed, but that does not affect its function.

I think the error may come from another part of your code, so more
details are needed.

-------code starts-------
#sub search {
use Bio::Seq;
use Bio::Tools::Run::StandAloneBlast;
use Bio::SearchIO::blast;

#my $query = Bio::Seq -> new ( -seq=>"$_[0]",
#                              -id=>"query");
# fixed test sequence instead of the subroutine argument
my $query = Bio::Seq->new(-seq => "ctgtattctgggatgca");
my $len = $query->length();

#@db=('nt.nal');
#foreach my $db (@db) {
my $factory = Bio::Tools::Run::StandAloneBlast->new(
    -program  => "blastn",
    -database => 'D:/blast/bin/ecoli.nt',
    -e        => 1,
    -o        => 'lclblastResult.txt',   # report is written to this file
);
my $rc = $factory->blastall($query);
-----code ends--------


On Wed, Aug 15, 2007 03:01, Guojun Yang wrote:
> Hi, all,
> I have two subroutines in my code. One is remoteblast and the other
> local blast. It works well.
> When I decided to change the remoteblast to local blast, I always
> get the following error. I downloaded nt database from NCBI as
> preformatted, but it works ok for both subroutines when I use
> command line blastall -p blastn.... I changed the db name to 'nt',
> 'nt.00', the same error message was returned. The error says:
> "program name was not given an argument", but I apparently gave it
> there.  Can anybody help me? The code for the two subroutines is
> very similar:
>
> sub search {
>     use Bio::Tools::Run::StandAloneBlast;
>     use Bio::SearchIO::blast;
> my $query = Bio::Seq -> new ( -seq=>"$_[0]",
>                               -id=>"query");
> my $len=$query->length();
> @db=('nt.nal');
> foreach my $db (@db) {
>     my $factory = Bio::Tools::Run::StandAloneBlast->new( -program  =>"blastn",
>                                                          -database =>"$db",
>                                                          -e        =>"$_[1]");
>     my $rc = $factory->blastall($query);
> ......
>
>
> sub ancestor {
>     use Bio::Tools::Run::StandAloneBlast;
>     use Bio::SearchIO::blast;
> my $query = Bio::Seq -> new ( -seq=>"$_[0]",
>                               -id=>"test");
> my $len=$query->length();
> my $long_name=$_[1];
> my $long_start=$_[2];
> my $long_end=$_[3];
> @db=('TNDB');
> foreach my $db (@db) {
>     my $factory = Bio::Tools::Run::StandAloneBlast->new(-program  => "blastx",
>                                                         -database => "$db",
>                                                         -e        => 1e-3,
>                                                         );
>     my $blast_report = $factory->blastall($query);
>
>
> Thanks a lot!
> Guojun Yang
> Department of Plant Biology
> University of Georgia
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
De-Jian Zhao
Institute of Zoology,Chinese Academy of Sciences
+86-10-64807217
zhaodj at ioz.ac.cn


-------------- next part --------------
A non-text attachment was scrubbed...
Name: lclblast.pl
Type: application/octet-stream
Size: 644 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070815/f40b2950/attachment-0002.obj>

From tania.oh at brasenose.oxford.ac.uk  Wed Aug 15 12:05:15 2007
From: tania.oh at brasenose.oxford.ac.uk (Tania Oh)
Date: Wed, 15 Aug 2007 17:05:15 +0100
Subject: [Bioperl-l] exonerate parser in bioperl-live fails when protein2dna
	comparison is performed
Message-ID: <AA5E6FAF-A635-4F6C-99CF-82F6589C677B@bnc.ox.ac.uk>

Dear All,

I was trying to use the Bio::SearchIO::Alignment::Exonerate module to
run and parse my exonerate output, but I've noticed that the parser,
which is actually Bio::SearchIO::exonerate, works if the model used in
exonerate is --model est2genome. When I ran exonerate with
--model protein2dna, the parser was unable to parse the HSPs.


Below is a simple piece of code I used for testing the output from exonerate:

use strict;
use Bio::SearchIO;

my $searchio = Bio::SearchIO->new(-file   => 'test_data/exonerate.output.dontwork',
                                  -format => 'exonerate');

while( my $r = $searchio->next_result ) {
    while( my $hit = $r->next_hit ) {
        while( my $hsp = $hit->next_hsp ) {
            print $hsp->start. "\t". $hsp->end. "\n";
        }
    }

    print $r->query_name, "\n";
}

-------------- next part --------------
A non-text attachment was scrubbed...
Name: exonerate.output.works
Type: application/octet-stream
Size: 6056 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070815/e4e43d75/attachment-0004.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: exonerate.output.dontwork
Type: application/octet-stream
Size: 3283 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070815/e4e43d75/attachment-0005.obj>


There are 2 files attached to show the examples of using either the  
est2genome or protein2dna model:
1. exonerate.output.works  - produced from the command line:
exonerate -q exonerate_cdna.fa -t exonerate_genomic.fa --model  
est2genome --bestn 1 > exonerate.output.works

2. exonerate.output.dontwork - produced from the command line:
exonerate -q test_aa.fa -t test_cds.fa --model protein2dna >  
exonerate.output.dontwork


Line 239 in Bio::SearchIO::exonerate (cut and pasted below)

elsif(  s/^vulgar:\s+(\S+)\s+         # query sequence id
                  (\d+)\s+(\d+)\s+([\-\+])\s+   # query start-end-strand
                  (\S+)\s+                      # target sequence id
                  (\d+)\s+(\d+)\s+([\-\+])\s+   # target start-end-strand
                  (\d+)\s+                      # score
                  //ox ) {

parses the vulgar line of an --model est2genome exonerate output  
well. An example of the (complex) vulgar line which I've truncated  
for readability is:
vulgar: MUSSPSYN 3 1279 + 4.143962167-143965267 28 3074 + 6137 M 8 8  
G 0 1 M 231 231 5 0 2 I 0 253 3 0

whereas the vulgar line I've obtained from a --model protein2dna  
exonerate output is much simpler and the parser fails to pick it up:
vulgar: SJCHGC00851 0 204 . SJCHGC00851 2 614 + 1059 M 204 612
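
One visible difference between the two lines is the query strand field: the
est2genome line has "+", while the protein2dna line has ".", which the
[\-\+] character class in the pattern above cannot match. Here is a minimal
standalone sketch of a header match that also accepts "."; the sub name
parse_vulgar_header and the hash keys are made up for illustration and are
not part of Bio::SearchIO::exonerate. It only covers the leading fields, so
whether the rest of the parser copes with protein coordinates is untested.

```perl
use strict;
use warnings;

# Parse the fixed leading fields of an exonerate "vulgar:" line.
# Assumption: the only change from the quoted pattern is that both
# strand fields also accept '.', as seen in the protein2dna output.
sub parse_vulgar_header {
    my ($line) = @_;
    return unless $line =~ m{^vulgar:\s+(\S+)\s+          # query sequence id
                             (\d+)\s+(\d+)\s+([-+.])\s+   # query start, end, strand
                             (\S+)\s+                     # target sequence id
                             (\d+)\s+(\d+)\s+([-+.])\s+   # target start, end, strand
                             (\d+)\s+                     # score
                            }x;
    return {
        query_id      => $1, query_start  => $2,
        query_end     => $3, query_strand => $4,
        target_id     => $5, target_start => $6,
        target_end    => $7, target_strand => $8,
        score         => $9,
    };
}

# The protein2dna line from this message.
my $hdr = parse_vulgar_header(
    'vulgar: SJCHGC00851 0 204 . SJCHGC00851 2 614 + 1059 M 204 612');
print "query strand '$hdr->{query_strand}', score $hdr->{score}\n" if $hdr;
```

The est2genome line still matches the relaxed pattern, so a change along
these lines should not affect output that already parses.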

Has anyone encountered this situation before? I've not changed the
parser, as exonerate is widely used for its est2genome model, and
thought I'd run it past the list to see if there is a workaround.

many thanks in advance,
tania


From johnsonmar at mail.nih.gov  Wed Aug 15 12:47:10 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 12:47:10 -0400
Subject: [Bioperl-l] Need assistance with make error
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>

I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
Enterprise Linux 4, and the other running RHEL3.  I'm getting the
following 'make Error 255' when running make test.  I'm not sure what
this error indicates, and whether I should continue with a force
install?  Could you please advise.

 
Failed Test        Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/BioFetch_DB.t                  27    1   3.70%  8
t/EMBL_DB.t                      15    3  20.00%  6 13-14
t/Ontology.t          9  2304    50  100 200.00%  1-50
t/TreeIO.t                       41    1   2.44%  42
t/Variation_IO.t                 25    3  12.00%  15 20 25
t/simpleGOparser.t    9  2304    98  196 200.00%  1-98
120 subtests skipped.
Failed 6/179 test scripts, 96.65% okay. 154/8268 subtests failed, 98.14% okay.

make: *** [test_dynamic] Error 255
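
On reading the Stat and Wstat columns for t/Ontology.t and
t/simpleGOparser.t: assuming Wstat is the raw wait status that Perl exposes
as $? and Stat is the exit value taken from it, the two numbers describe
the same event, a test script that exited with a non-zero code rather than
one killed by a signal. A small sketch of the decoding, using the bit
layout documented for $?; the sub name decode_wait_status is made up for
illustration:

```perl
use strict;
use warnings;

# Split a raw wait status (the Wstat column) into its parts, using the
# same bit layout Perl documents for $? after system() or wait().
sub decode_wait_status {
    my ($wstat) = @_;
    return {
        exit_code   => $wstat >> 8,              # high byte: the script's exit value
        signal      => $wstat & 127,             # low 7 bits: terminating signal, if any
        core_dumped => ($wstat & 128) ? 1 : 0,   # bit 7: core dump flag
    };
}

# Wstat value reported for t/Ontology.t and t/simpleGOparser.t.
my $status = decode_wait_status(2304);
printf "exit=%d signal=%d core=%d\n",
    @{$status}{qw(exit_code signal core_dumped)};
# prints "exit=9 signal=0 core=0"
```

So 2304 is just 9 shifted left by eight bits, matching the 9 in the Stat
column. Running one of those scripts by hand (for example
perl -Mblib t/Ontology.t) should show the actual error message.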

 
Thanks,

 
Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 
From arareko at campus.iztacala.unam.mx  Wed Aug 15 13:45:39 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 15 Aug 2007 12:45:39 -0500
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
Message-ID: <46C33BC3.9000409@campus.iztacala.unam.mx>

Which version of BioPerl are you trying to install?

Johnson, Mary (NIH/NCI) [C] wrote:
> I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
> Enterprise Linux 4, and the other running RHEL3.  I'm getting the
> following 'make Error 255' when running make test.  I'm not sure what
> this error indicates, and whether I should continue with a force
> install?  Could you please advise.

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From mbasu at mail.nih.gov  Wed Aug 15 13:55:50 2007
From: mbasu at mail.nih.gov (Malay)
Date: Wed, 15 Aug 2007 13:55:50 -0400
Subject: [Bioperl-l] Developer docs
Message-ID: <46C33E26.2050004@mail.nih.gov>

Hello All:

I apologize for not searching thoroughly, but I'd appreciate it if someone
could point me to a location where I can find any BioPerl coding conventions
that I need to follow for code contributions to BioPerl.

-Malay

-- 
Malay K Basu
www.malaybasu.net


From arareko at campus.iztacala.unam.mx  Wed Aug 15 14:39:29 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 15 Aug 2007 13:39:29 -0500
Subject: [Bioperl-l] Developer docs
In-Reply-To: <46C33E26.2050004@mail.nih.gov>
References: <46C33E26.2050004@mail.nih.gov>
Message-ID: <46C34861.8090400@campus.iztacala.unam.mx>

You may want to bookmark this one:

http://bioperl.org/wiki/Developer_Information#BioPerl_Code

Mauricio.

Malay wrote:
> Hello All:
> 
> I apologize for not searching thoroughly, but I'd appreciate it if someone
> could point me to a location where I can find any BioPerl coding conventions
> that I need to follow for code contributions to BioPerl.
> 
> -Malay
> 

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From johnsonmar at mail.nih.gov  Wed Aug 15 15:01:23 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 15:01:23 -0400
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <46C33BC3.9000409@campus.iztacala.unam.mx>
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805713@NIHCESMLBX11.nih.gov>

This is version 1.4.

Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 


From cjfields at uiuc.edu  Wed Aug 15 16:25:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 15 Aug 2007 15:25:30 -0500
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805713@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805713@NIHCESMLBX11.nih.gov>
Message-ID: <DA0EFC65-4A35-48FA-9280-447654BAFF7F@uiuc.edu>

You'll definitely want to update to the latest (v 1.5.2).  We hope to  
get a new stable release out sometime soon and possibly move to a  
more regular release cycle.

chris

On Aug 15, 2007, at 2:01 PM, Johnson, Mary (NIH/NCI) [C] wrote:

> This is version 1.4.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnsonmar at mail.nih.gov  Wed Aug 15 16:32:43 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 16:32:43 -0400
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <DA0EFC65-4A35-48FA-9280-447654BAFF7F@uiuc.edu>
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>

I saw the 1.5.2 version, but it stated that this was a developer release and that 1.4 was the latest stable version, so I went with 1.4.  I'll give 1.5.2 a try.

Thanks,


Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 


From cjfields at uiuc.edu  Wed Aug 15 16:40:32 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 15 Aug 2007 15:40:32 -0500
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
Message-ID: <E16950D3-9F60-4862-9325-57CA26107649@uiuc.edu>

The term 'stable' is relative in this case; tons of bug fixes were
incorporated in the 1.5.2 release.  There are a few dev-specific
issues we'll need to resolve prior to a new release; once those are
out of the way we'll try to get a new 'stable' out.

chris

On Aug 15, 2007, at 3:32 PM, Johnson, Mary (NIH/NCI) [C] wrote:

> I saw the 1.5.2 version, but it stated that this was a developer  
> release and that 1.4 was the latest stable version, so I went with  
> 1.4.  I'll give 1.5.2 a try.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Kevin.M.Brown at asu.edu  Wed Aug 15 16:54:04 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 15 Aug 2007 13:54:04 -0700
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
References: <DA0EFC65-4A35-48FA-9280-447654BAFF7F@uiuc.edu>
	<EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
Message-ID: <1A4207F8295607498283FE9E93B775B40386D612@EX02.asurite.ad.asu.edu>

It technically is a developer release, but given the age of the 1.4 release it is the better choice because of fixes for things like web BLASTs, along with other improvements. I've found the results that come out of the various objects I've had to use in my current projects to be reliable.

> I saw the 1.5.2 version, but it stated that this was a 
> developer release and that 1.4 was the latest stable version, 
> so I went with 1.4.  I'll give 1.5.2 a try.


From bix at sendu.me.uk  Wed Aug 15 16:50:02 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 15 Aug 2007 21:50:02 +0100
Subject: [Bioperl-l] Developer docs
In-Reply-To: <46C34861.8090400@campus.iztacala.unam.mx>
References: <46C33E26.2050004@mail.nih.gov>
	<46C34861.8090400@campus.iztacala.unam.mx>
Message-ID: <46C366FA.40609@sendu.me.uk>

Mauricio Herrera Cuadra wrote:
> You may want to bookmark this one:
> 
> http://bioperl.org/wiki/Developer_Information#BioPerl_Code

Yup. The important one is
http://bioperl.org/wiki/Bioperl_Best_Practices, which I've just updated
with the latest info on writing test scripts.


From bix at sendu.me.uk  Wed Aug 15 16:54:45 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 15 Aug 2007 21:54:45 +0100
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
Message-ID: <46C36815.5010908@sendu.me.uk>

Johnson, Mary (NIH/NCI) [C] wrote:
> I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
> Enterprise Linux 4, and the other running RHEL3.  I'm getting the
> following 'make Error 255' when running make test.  I'm not sure what
> this error indicates, and whether I should continue with a force
> install?  Could you please advise.

Unless you know you really must install Bioperl 1.4, install 1.5.2 instead.

http://www.bioperl.org/wiki/Release_1.5.2

If you use the Build.PL installation, at the very least you certainly 
won't get a make error.

http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#PRELIMINARY_PREPARATION


From cjfields at uiuc.edu  Wed Aug 15 17:16:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 15 Aug 2007 16:16:27 -0500
Subject: [Bioperl-l] exonerate parser in bioperl-live fails when
	protein2dna comparison is performed
In-Reply-To: <AA5E6FAF-A635-4F6C-99CF-82F6589C677B@bnc.ox.ac.uk>
References: <AA5E6FAF-A635-4F6C-99CF-82F6589C677B@bnc.ox.ac.uk>
Message-ID: <F853DDF2-3165-4F88-A087-744D60682104@uiuc.edu>

I can confirm this with bioperl-live.  The Bio::SearchIO::exonerate docs
indicate that protein2genome and est2genome model output is supported but
don't specifically indicate that it can parse any other output.
You can add an enhancement request to Bugzilla indicating this
deficiency or, if you are inclined, add the functionality yourself
and donate the code.

chris

On Aug 15, 2007, at 11:05 AM, Tania Oh wrote:

> Dear All,
>
> I was trying to use the Bio::SearchIO::Alignment::Exonerate module  
> to run and parse my exonerate output. But I've noticed that the  
> parser which is actually Bio::SearchIO::Exonerate works if the  
> model used in Exonerate is --model est2genome. I used exonerate  
> with the model --model protein2dna and the parser was unable to  
> parse the hsps.
>
>
> Below is a simple of code I used for testing the output from  
> exonerate:
>
> use Bio::SearchIO;
> use strict;
> <exonerate.output.works>
> my $searchio = Bio::SearchIO->new(-file => 'test_data/ 
> exonerate.output.dontwork
> <exonerate.output.dontwork>
> ',
>                                    -format => 'exonerate');
>
>   while( my $r = $searchio->next_result ) {
>           while(my $hit = $r->next_hit){
>                   while(my $hsp = $hit->next_hsp){
>                           print $hsp->start. "\t". $hsp->end. "\n";
>                   }
>           }
>
>     print $r->query_name, "\n";
>   }
>
>
> There are 2 files attached to show the examples of using either the  
> est2genome or protein2dna model:
> 1. exonerate.output.works  - produced from the command line:
> exonerate -q exonerate_cdna.fa -t exonerate_genomic.fa --model  
> est2genome --bestn 1 > exonerate.output.works
>
> 2. exonerate.output.dontwork - produced from the command line:
> exonerate -q test_aa.fa -t test_cds.fa --model protein2dna >  
> exonerate.output.dontwork
>
>
> Line 239 in Bio::SearchIO::exonerate (cut and pasted below)
>
> elsif(  s/^vulgar:\s+(\S+)\s+         # query sequence id
>                  (\d+)\s+(\d+)\s+([\-\+])\s+   # query start-end-strand
>                  (\S+)\s+                      # target sequence id
>                  (\d+)\s+(\d+)\s+([\-\+])\s+   # target start-end-strand
>                  (\d+)\s+                      # score
>                  //ox ) {
>
> parses the vulgar line of an --model est2genome exonerate output  
> well. An example of the (complex) vulgar line which I've truncated  
> for readability is:
> vulgar: MUSSPSYN 3 1279 + 4.143962167-143965267 28 3074 + 6137 M 8 8 G 0 1 M 231 231 5 0 2 I 0 253 3 0
>
> whereas the vulgar line I've obtained from a --model protein2dna  
> exonerate output is much simpler and the parser fails to pick it up:
> vulgar: SJCHGC00851 0 204 . SJCHGC00851 2 614 + 1059 M 204 612
>
> Has anyone encountered this situation before? I've not changed the
> parser, as exonerate is widely used for its est2genome model, and
> thought I'd run it past the list to see if there is a workaround.
>
> many thanks in advance,
> tania
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnsonmar at mail.nih.gov  Wed Aug 15 17:45:36 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 17:45:36 -0400
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <E16950D3-9F60-4862-9325-57CA26107649@uiuc.edu>
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805716@NIHCESMLBX11.nih.gov>

Version 1.5.2 worked fine!  Thanks to all of you for your quick response.  I wish all of our vendors were that quick in getting back to me:)


Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 
-----Original Message-----
From: Chris Fields [mailto:cjfields at uiuc.edu] 
Sent: Wednesday, August 15, 2007 4:41 PM
To: Johnson, Mary (NIH/NCI) [C]
Cc: Mauricio Herrera Cuadra; bioperl-l at bioperl.org
Subject: Re: [Bioperl-l] Need assistance with make error

The term 'stable' is relative in this case; tons of bug fixes were  
incorporated in the 1.5.2 release.  There are a few dev-specific  
issues we'll need to resolve prior to a new release; once those are  
out of the way we'll try to get a new 'stable' out.

chris

On Aug 15, 2007, at 3:32 PM, Johnson, Mary (NIH/NCI) [C] wrote:

> I saw the 1.5.2 version, but it stated that this was a developer  
> release and that 1.4 was the latest stable version, so I went with  
> 1.4.  I'll give 1.5.2 a try.
>
> Thanks,
>
>
> Mary Johnson
>
> Sr. Network Engineer
>
> National Cancer Institute Center for Bioinformatics
> Contractor, TerpSys
> http://www.terpsys.com/
>
>
>
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu]
> Sent: Wednesday, August 15, 2007 4:26 PM
> To: Johnson, Mary (NIH/NCI) [C]
> Cc: Mauricio Herrera Cuadra; bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] Need assistance with make error
>
> You'll definitely want to update to the latest (v 1.5.2).  We hope to
> get a new stable release out sometime soon and possibly move to a
> more regular release cycle.
>
> chris
>
> On Aug 15, 2007, at 2:01 PM, Johnson, Mary (NIH/NCI) [C] wrote:
>
>> This is version 1.4.
>>
>> Mary Johnson
>>
>> Sr. Network Engineer
>>
>> National Cancer Institute Center for Bioinformatics
>> Contractor, TerpSys
>> http://www.terpsys.com/
>>
>>
>>
>> -----Original Message-----
>> From: Mauricio Herrera Cuadra  
>> [mailto:arareko at campus.iztacala.unam.mx]
>> Sent: Wednesday, August 15, 2007 1:46 PM
>> To: Johnson, Mary (NIH/NCI) [C]
>> Cc: bioperl-l at bioperl.org
>> Subject: Re: [Bioperl-l] Need assistance with make error
>>
>> Which version of bioperl are you trying to install?
>>
>> Johnson, Mary (NIH/NCI) [C] wrote:
>>> I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
>>> Enterprise Linux 4, and the other running RHEL3.  I'm getting the
>>> following 'make Error 255' when running make test.  I'm not sure  
>>> what
>>> this error indicates, and whether I should continue with a force
>>> install?  Could you please advise.
>>>
>>>
>>>
>>>
>>>
>>> Failed Test        Stat Wstat Total Fail  Failed  List of Failed
>>> -----------------------------------------------------------------
>>> t/BioFetch_DB.t                  27    1   3.70%  8
>>> t/EMBL_DB.t                      15    3  20.00%  6 13-14
>>> t/Ontology.t          9  2304    50  100 200.00%  1-50
>>> t/TreeIO.t                       41    1   2.44%  42
>>> t/Variation_IO.t                 25    3  12.00%  15 20 25
>>> t/simpleGOparser.t    9  2304    98  196 200.00%  1-98
>>> 120 subtests skipped.
>>> Failed 6/179 test scripts, 96.65% okay. 154/8268 subtests failed,
>>> 98.14% okay.
>>> make: *** [test_dynamic] Error 255
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Mary Johnson
>>>
>>> Sr. Network Engineer
>>>
>>> National Cancer Institute Center for Bioinformatics
>>> Contractor, TerpSys
>>> http://www.terpsys.com/ <http://www.terpsys.com/>
>>>
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>> -- 
>> MAURICIO HERRERA CUADRA
>> arareko at campus.iztacala.unam.mx
>> Laboratorio de Genética
>> Unidad de Morfofisiología y Función
>> Facultad de Estudios Superiores Iztacala, UNAM
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From neetisomaiya at gmail.com  Thu Aug 16 00:22:18 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 16 Aug 2007 09:52:18 +0530
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <46C1D557.7090101@pharm.stonybrook.edu>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
	<46C1D557.7090101@pharm.stonybrook.edu>
Message-ID: <764978cf0708152122oba56e13qef83544cdde7e795@mail.gmail.com>

Hi Siddhartha,

Thanks a lot for your mail.
It would be great if you could send me your parser, I will see how I can
modify it for my purpose.

Thanks and Regards,
Neeti.

On 8/14/07, Siddhartha Basu <basu at pharm.stonybrook.edu> wrote:
>
> neeti somaiya wrote:
> > Hi Andrew,
> >
> > I think the homologene data files on the ftp site have changed from what
> > you had used.
> > They are now homologene.data and homologene.xml.
> > I tried using your parser, but because it was written for the file
> > hmlg.trip.ftp, it doesn't work anymore.
> >
> > I came across a parser:
> > http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
> > I am looking at it to see if it works for me. Not sure if it will.
> >
> > ~Neeti.
>
> Hi Neeti,
> I have recently written a parser for 'homologene' xml data specific to
> my purpose. I am not sure whether it will suit your purpose, but it could
> be extended for general-purpose parsing, so I am putting it forward.
> Here is how it works:
>
> * It only parses a single homologene entry <HG-Entry>.....</HG-Entry>.
> * It does SAX-based parsing (currently uses XML::SAX::ExpatXS).
> * It returns a graph object (using Perl's Graph module) where each node
>   is a homologue entry with its corresponding entrez gene id. Each node
>   also contains the following attributes:
>         * Refseq protein id.
>         * Protein id (pid).
>         * ncbi taxon id.
> * The edge attribute contains information about the ortholog (true/false)
>   relationship between two nodes.
> * The rest of the tags are currently not extracted. However, parsing
>   them should not be very difficult.
>
> Generally I get the homologene xml stream from an 'efetch' through
> Bio::DB::EUtilities, feed it to the parser, get back a 'Graph' object and
> then work on it.
>
> So, to make it more generic and work on a local file:
>
> * We need another class that reads the chunk between
>   <HG-Entry>.....</HG-Entry> and sends it to the parser.
> * Add support for most of the tags.
> * Massage the data into a bioperl-compatible object.
>
> The first two I could work out; for the last one I have to figure
> out which bioperl object would be suitable (like Bio::Cluster or
> Bio::Network::Node/Edge).
>
> Let me know if it sounds interesting and I will send you the code.
>
> -siddhartha
>
>
> >
> > On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
> >> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
> >>
> >>> Hi,
> >>>
> >>> Does anyone know of any Homologene parser, if available?
> >>> Please let me know.
> >>>
> >>> Thanks and Regards,
> >>> Neeti.
> >> Hi Neeti,
> >>
> >> Quite a long time ago now I wrote an Homologene parser and posted it
> >> to the mailing list:
> >>
> >> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
> >>
> >> I don't know if this still works but you could use it as a starting
> >> point. There may also be something newer out there too, I don't know.
> >> If you search the mailing list archives you'll get a few messages
> >> around the topic.
> >>
> >> Cheers, Andrew.
> >>
> >>
> >> Andrew Macgregor
> >> Centre for Comparative Genomics, Murdoch University
> >> Email: amacgregor at ccg.murdoch.edu.au
> >> Tel: (08) 9360 2961
> >>
> >>
> >>
> >>
> >
> >
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
-Neeti
Even my blood says, B positive


From neetisomaiya at gmail.com  Thu Aug 16 01:56:21 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 16 Aug 2007 11:26:21 +0530
Subject: [Bioperl-l] PDB Parser
Message-ID: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>

Hi,

After a lot of searching I found this link from which PDB files can be
downloaded:
ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/
Is there any other link where one can download all pdb data from?

I tried using Bio::Structure::IO::pdb with some code like this:
use Bio::Structure::IO;

    my $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
                                      -format => 'pdb');

    while ( my $struc = $in->next_structure() ) {
       print "Structure ", $struc->id,"\n";
    }

It works well. But I am not able to find documentation of other methods
which will give me various specific details available in a pdb file, right
from title, keywords, references to structure details, atoms, coordinates
etc. There must be different methods to fetch and parse each of this data
from a pdb file, right? Where can I find the details? Any example code of
the same would also be of great use.

Thanks and Regards,
Neeti Somaiya.

-- 
-Neeti
Even my blood says, B positive


From hrh at sanger.ac.uk  Thu Aug 16 04:48:16 2007
From: hrh at sanger.ac.uk (Hans Rudolf Hotz)
Date: Thu, 16 Aug 2007 09:48:16 +0100 (BST)
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
Message-ID: <Pine.LNX.4.64.0708160942310.14241@deskpro50.dynamic.sanger.ac.uk>


On Thu, 16 Aug 2007, neeti somaiya wrote:

> Hi,
>
> After a lot of search I could find this link from where PDB files can be
> downloaded :
> ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/
> Is there any other link where one can download all pdb data from?

try: ftp://pdb.protein.osaka-u.ac.jp/v3/pub/pdb/   or
      ftp://ftp.ebi.ac.uk/pub/databases/rcsb/pdb-remediated/

It is not BioPerl, but James Tisdall's book "Beginning Perl for
Bioinformatics" (O'Reilly) has a nice introduction to parsing PDB files.
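
In the same plain-Perl spirit, here is a minimal sketch of pulling the fixed-width fields out of ATOM/HETATM lines. The column positions follow the ATOM record layout of the PDB format; the sub name parse_atom_record and the hash keys are only illustrative, and the code is not taken from the book.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one ATOM/HETATM line into a hash ref; returns nothing for other records.
sub parse_atom_record {
    my ($line) = @_;
    return unless $line =~ /^(?:ATOM  |HETATM)/;
    $line =~ s/[\r\n]+\z//;
    $line = sprintf '%-54s', $line;    # pad short lines so substr stays in range

    my %atom = (
        record   => substr( $line,  0, 6 ),    # columns  1-6
        serial   => substr( $line,  6, 5 ),    # columns  7-11
        name     => substr( $line, 12, 4 ),    # columns 13-16
        res_name => substr( $line, 17, 3 ),    # columns 18-20
        chain    => substr( $line, 21, 1 ),    # column  22
        res_seq  => substr( $line, 22, 4 ),    # columns 23-26
        x        => substr( $line, 30, 8 ),    # columns 31-38
        y        => substr( $line, 38, 8 ),    # columns 39-46
        z        => substr( $line, 46, 8 ),    # columns 47-54
    );
    s/^\s+|\s+$//g for values %atom;               # trim padding
    $atom{$_} += 0 for qw(serial res_seq x y z);   # numify
    return \%atom;
}

# Usage: perl atoms.pl pdb100d.ent
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    while ( my $line = <$fh> ) {
        my $atom = parse_atom_record($line) or next;
        printf "%s %s%d %s %.3f %.3f %.3f\n",
            @{$atom}{qw(chain res_name res_seq name x y z)};
    }
    close $fh;
}
```

Records such as TITLE, KEYWDS or JRNL are laid out differently and would need their own handling.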


Regards, Hans


>
> I tried using Bio::Structure::IO::pdb with some code like :-
> use Bio::Structure::IO;
>
>    $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
>                                   -format => 'pdb');
>
>    while ( my $struc = $in->next_structure() ) {
>       print "Structure ", $struc->id,"\n";
>    }
>
> It works well. But I am not able to find documentation of other methods
> which will give me various specific details available in a pdb file, right
> from title, keywords, references to structure details, atoms, coordinates
> etc. There must be different methods to fetch and parse each of this data
> from a pdb file, right? Where can I find the details? Any example code of
> the same would also be of great use.
>
> Thanks and Regards,
> Neeti Somaiya.
>
> -- 
> -Neeti
> Even my blood says, B positive
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>


-- 
The Wellcome Trust Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE.


From neetisomaiya at gmail.com  Thu Aug 16 05:30:42 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 16 Aug 2007 15:00:42 +0530
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <C762C291-D3D2-4CBC-B5EC-6B6E4935A004@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
	<C762C291-D3D2-4CBC-B5EC-6B6E4935A004@ccg.murdoch.edu.au>
Message-ID: <764978cf0708160230o4ade944er8c8529199f3a0262@mail.gmail.com>

Hi,

For now I am using the homologene parser available here:
http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
for parsing the homologene.data file. But the README at the ftp site says
HOMOLOGENE.XML has much more data; I have yet to see how to parse that one.
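
One possible first step for the XML file, along the lines of the earlier suggestion in this thread to read each <HG-Entry>.....</HG-Entry> chunk and hand it to a parser, is sketched below. It is only a sketch: it assumes entries are not nested and that the closing tag appears literally as </HG-Entry>. The sub name hg_entry_iterator is made up, and each chunk would still need a real XML parser (a SAX one, for example) to get at the fields.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a closure that yields one <HG-Entry>...</HG-Entry> chunk per call
# and undef once the filehandle is exhausted.
sub hg_entry_iterator {
    my ($fh) = @_;
    return sub {
        local $/ = '</HG-Entry>';    # read up to each closing tag
        while ( defined( my $chunk = <$fh> ) ) {
            # Drop whatever precedes the opening tag (XML header, wrapper
            # elements); the trailing remainder of the file has no match.
            next unless $chunk =~ /(<HG-Entry\b.*<\/HG-Entry>)/s;
            return $1;
        }
        return;
    };
}

# Usage: perl hg_chunks.pl homologene.xml
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $next_entry = hg_entry_iterator($fh);
    my $count = 0;
    while ( defined( my $entry = $next_entry->() ) ) {
        $count++;    # hand $entry to the XML parser of your choice here
    }
    close $fh;
    print "Read $count HG-Entry chunks\n";
}
```

Reading one entry at a time keeps memory use flat, which matters if the XML file is large.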

~Neeti.


On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
>
> On 14/08/2007, at 11:21 AM, Chris Fields wrote:
>
> > It looks like Heikki responded and thought a good place for it
> > would be Bio::SeqIO, but it didn't go anywhere I suppose.  I see
> > that a few other posts suggest it could be placed in Bio::Cluster
> > as well which I'm not familiar with.  We could add it in if you
> > were still interested, just need to find a good place for it; might
> > be nice to have a Parse::RecDescent-based parser.
> >
> > chris
> >
>
> Hi Chris,
>
> I was also doing some parsing of UniGene at the time but found
> RecDescent was too slow and went back to regexes. That code found
> its way into Bio::Cluster. Occasionally I see a message from someone
> looking for a Homologene parser, but not very often, so I'm not sure
> it is worth the effort of moving the code into bioperl.
>
> Cheers, Andrew.
>


-- 
-Neeti
Even my blood says, B positive


From bix at sendu.me.uk  Thu Aug 16 05:59:08 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 16 Aug 2007 10:59:08 +0100
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
Message-ID: <46C41FEC.2000206@sendu.me.uk>

neeti somaiya wrote:
> I tried using Bio::Structure::IO::pdb with some code like :-
> use Bio::Structure::IO;
> 
>     $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
>                                    -format => 'pdb');
> 
>     while ( my $struc = $in->next_structure() ) {
>        print "Structure ", $struc->id,"\n";
>     }
> 
> It works well. But I am not able to find documentation of other methods
> which will give me various specific details available in a pdb file, right
> from title, keywords, references to structure details, atoms, coordinates
> etc. There must be different methods to fetch and parse each of this data
> from a pdb file, right? Where can I find the details?

$struc is a Bio::Structure::Entry, so look at the docs for that:
http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html

You'll probably want to look at the docs for the other Structure modules 
as well:
http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
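
As a rough sketch of how those pieces fit together, going by the Bio::Structure::IO and Bio::Structure::Entry docs: the entry hands out models, chains, residues and atoms, and header-type records sit in an annotation collection. The sub name dump_structure is hypothetical, the annotation keys depend on what the pdb parser stored, and the code has not been run against pdb100d.ent.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Print annotation keys and per-atom coordinates for every structure in a file.
sub dump_structure {
    my ($file) = @_;

    # Load BioPerl at run time so a missing install gives a clear message.
    eval { require Bio::Structure::IO; 1 }
        or die "Bio::Structure::IO (BioPerl) is required: $@";

    my $in = Bio::Structure::IO->new( -file => $file, -format => 'pdb' );
    while ( my $struc = $in->next_structure ) {
        print "Structure ", $struc->id, "\n";

        # Title, keywords, references etc. are kept as annotations.
        my $ac = $struc->annotation;
        for my $key ( $ac->get_all_annotation_keys ) {
            print "  annotation key: $key\n";
        }

        # Walk model -> chain -> residue -> atom.
        for my $model ( $struc->get_models ) {
            for my $chain ( $struc->get_chains($model) ) {
                for my $res ( $struc->get_residues($chain) ) {
                    for my $atom ( $struc->get_atoms($res) ) {
                        printf "  %s %s %s %8.3f %8.3f %8.3f\n",
                            $chain->id, $res->id, $atom->id, $atom->xyz;
                    }
                }
            }
        }
    }
    return;
}

# Usage: perl dump_pdb.pl pdb100d.ent
dump_structure( $ARGV[0] ) if @ARGV;
```

The docs for Bio::Structure::Atom list further per-atom accessors beyond id and xyz.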


I agree, the documentation in this area could be improved. 
Bio::Structure::StructureI could actually contain something, and 
Bio::Structure should actually exist or not be referenced in the docs.


From ewijaya at gmail.com  Thu Aug 16 00:18:57 2007
From: ewijaya at gmail.com (Edward Wijaya)
Date: Thu, 16 Aug 2007 12:18:57 +0800
Subject: [Bioperl-l] How to create contrasting colors in every singe track -
	Bio::Graphics
Message-ID: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>

Dear experts,

I am trying to draw a figure that shows binding-site hits from various
programs (see the attached example).

Now, I have a problem creating a contrasting colour for each of
the programs (MEME, AlignACE, etc).  I want to avoid "graded segments",
so that I can have more contrasting colours, e.g. red, blue, yellow, etc.

Can anybody suggest how we can achieve that?

My full source code can be found here: http://dpaste.com/16985/
The portion of the script is this:

__BEGIN__
    my %prog_color = (
        "Actual"   => 800000,
        "ALIGNACE" => 230000,
        "BP"       => 80000,
        "MDSCAN"   => 5000,
        "MITRA"    => 10000,
        "MTSAMP"   => 200000,
        "SPACE"    => 40000,
        "NONE"     => 0,
    );

    foreach my $seqid ( sort {$a <=> $b }keys %nlist ) {
        my $track = $panel->add_track(
            -glyph     => 'graded_segments',
            -key       => "SEQ " . $seqid,
            -connector => "dashed",
            -label     => 1,
            -fontcolor => 'red',
            -bgcolor   => 'blue',
            -bump      => +1,
            -height    => 8,
            -min_score => 0,
            -max_score => 500000
        );
# rest of the script
__END__

Regards,
Edward
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hits.png
Type: image/png
Size: 2509 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070816/31057225/attachment-0002.png>

From pratchusha.kamireddy at aamu.edu  Wed Aug 15 23:45:22 2007
From: pratchusha.kamireddy at aamu.edu (pratchusha kamireddy)
Date: Wed, 15 Aug 2007 22:45:22 -0500 (CDT)
Subject: [Bioperl-l] Request for Activeperl software
Message-ID: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>

Hello
  I am Pratchusha Kamireddy, doing a master's at Alabama A&M University. I am
working under Dr. Kantety in the Plant and Soil Science Department. I am a
beginner learning Perl programming. I need the ActivePerl software to run
Perl programs. Can you help me with this: where can I download the software,
how can I install it, and how can I use it? I am eagerly waiting for your
reply. Please help me in this regard.
   Thanking you
   Pratchusha Kamireddy


From spiros at lokku.com  Thu Aug 16 09:32:05 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Thu, 16 Aug 2007 14:32:05 +0100
Subject: [Bioperl-l] Request for Activeperl software
In-Reply-To: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
References: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
Message-ID: <bba689ec0708160632w315b00d5na3bf55d97ac03728@mail.gmail.com>

Hi,

You can download ActivePerl from ActiveState's website at

http://www.activestate.com/Products/ActivePerl/

Get a book: http://www.oreilly.com/catalog/lperl3/

Visit:

http://perl-begin.org/
http://learn.perl.org/

Usenet:

http://www.nntp.perl.org/group/perl.beginners/

Spiros

On 8/16/07, pratchusha kamireddy <pratchusha.kamireddy at aamu.edu> wrote:
> Hello
>   I am Pratchusha Kamireddy doing masters in Alabama A&M University. I am working under Dr.Kantety in Plant and Soil Science Department.I am the beginner to learn perl programming. I need Activeperl software to run the perl programs. Can you help me in this regard like: where can I dowmload this software, how can i Install this and how can i use this. I am eagerlu waiting for your reply.Please help me in this regard.
>    Thanking you
>    Pratchusha Kamireddy
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From razi.khaja at gmail.com  Thu Aug 16 09:37:09 2007
From: razi.khaja at gmail.com (Razi Khaja)
Date: Thu, 16 Aug 2007 09:37:09 -0400
Subject: [Bioperl-l] How to create contrasting colors in every singe
	track - Bio::Graphics
In-Reply-To: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
References: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
Message-ID: <62e9dabc0708160637o36380ecbv69fe479d0a26989d@mail.gmail.com>

You would probably want to consider a "Graph-Coloring" algorithm in
order to optimally pick contrasting colors for the features being
displayed.  This might be overkill for what you're trying to accomplish
and may not be possible (depending on how many features you have in
your dataset, i.e. how big your graph is).

In any case, some resources are:
http://en.wikipedia.org/wiki/Graph_coloring
http://web.cs.ualberta.ca/~joe/Coloring/

If your problem is simpler, see the modifications to your program I've
made below:
Razi Khaja

On 8/16/07, Edward Wijaya <ewijaya at gmail.com> wrote:
> Dear experts,
>
> I am trying to draw a figures that shows binding sites hits for various
> program (see attached) for example.
>
> Now, I have a problem in creating contrasting colour for each of
> the Programs (MEME, AlignACE, etc).  I want to avoid "graded segments",
> so that I can have more contrasting color, e.g: red, blue, yellow, etc.
>
> Can anybody suggest how can we achieve that?
>
> My full source code can be found here: http://dpaste.com/16985/
> The portion of the script is this:
>
> __BEGIN__
>     my %prog_color = (
>         "Actual"   => 800000,
>         "ALIGNACE" => 230000,
>         "BP"       => 80000,
>         "MDSCAN"   => 5000,
>         "MITRA"    => 10000,
>         "MTSAMP"   => 200000,
>         "SPACE"    => 40000,
>         "NONE"     => 0,
>     );
>
       my %color = ( 'MEME' => 'red', 'ALIGNACE' => 'blue' );

>     foreach my $seqid ( sort {$a <=> $b }keys %nlist ) {
           my( @field ) = split( /\s+/, $nlist{$seqid} );
           my $prog_name = $field[3];

>         my $track = $panel->add_track(
>             -glyph     => 'graded_segments',
>             -key       => "SEQ " . $seqid,
>             -connector => "dashed",
>             -label     => 1,
>             -fontcolor => 'red',
               -bgcolor   => $color{ $prog_name },
>             -bump      => +1,
>             -height    => 8,
>             -min_score => 0,
>             -max_score => 500000
>         );
> # rest of the script
> __END__
>
> Regards,
> Edward
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>


From bix at sendu.me.uk  Thu Aug 16 09:49:52 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 16 Aug 2007 14:49:52 +0100
Subject: [Bioperl-l] Request for Activeperl software
In-Reply-To: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
References: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
Message-ID: <46C45600.4040906@sendu.me.uk>

pratchusha kamireddy wrote:
> I am Pratchusha Kamireddy doing masters in Alabama A&M University. I
> am working under Dr.Kantety in Plant and Soil Science Department.I am
> the beginner to learn perl programming. I need Activeperl software to
> run the perl programs. Can you help me in this regard like: where can
> I dowmload this software, how can i Install this and how can i use
> this. I am eagerlu waiting for your reply.Please help me in this
> regard.

Firstly, Google is your friend:
http://www.google.co.uk/search?q=activeperl

The first hit is the correct one:

http://www.activestate.com/Products/activeperl/


I suppose your next question will be how to install Bioperl (if not, 
you're in the wrong place):

http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows
(which also tells you where to get ActivePerl from)


From cjfields at uiuc.edu  Thu Aug 16 10:11:22 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 09:11:22 -0500
Subject: [Bioperl-l] How to create contrasting colors in every singe
	track - Bio::Graphics
In-Reply-To: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
References: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
Message-ID: <F3E88224-4AA2-451B-97FE-5DED15015FA2@uiuc.edu>


On Aug 15, 2007, at 11:18 PM, Edward Wijaya wrote:

> Dear experts,
>
> I am trying to draw a figures that shows binding sites hits for  
> various
> program (see attached) for example.
>
> Now, I have a problem in creating contrasting colour for each of
> the Programs (MEME, AlignACE, etc).  I want to avoid "graded  
> segments",
> so that I can have more contrasting color, e.g: red, blue, yellow,  
> etc.
>
> Can anybody suggest how can we achieve that?
>
> My full source code can be found here: http://dpaste.com/16985/
> The portion of the script is this:
>
> __BEGIN__
>     my %prog_color = (
>         "Actual"   => 800000,
>         "ALIGNACE" => 230000,
>         "BP"       => 80000,
>         "MDSCAN"   => 5000,
>         "MITRA"    => 10000,
>         "MTSAMP"   => 200000,
>         "SPACE"    => 40000,
>         "NONE"     => 0,
>     );
>
>     foreach my $seqid ( sort {$a <=> $b }keys %nlist ) {
>         my $track = $panel->add_track(
>             -glyph     => 'graded_segments',
>             -key       => "SEQ " . $seqid,
>             -connector => "dashed",
>             -label     => 1,
>             -fontcolor => 'red',
>             -bgcolor   => 'blue',
>             -bump      => +1,
>             -height    => 8,
>             -min_score => 0,
>             -max_score => 500000
>         );
> # rest of the script
> __END__
>
> Regards,
> Edward

I think you have two options:

1) Split the seqfeatures into different tracks based on the source  
(AlignACE, MP, etc), then give each its own graded segment color.  I  
like this personally as it doesn't glob various results together onto  
one track and (at least to me) is easier to maintain.  It also allows  
one more flexibility in using varying scoring schemes.
2) Use a callback for bgcolor which changes the color explicitly  
based on the source/score.

The GenBank/EMBL section of the Bio::Graphics HOWTO reveals how to  
add different tracks, and there are several scattered examples on how  
to use callbacks.

http://www.bioperl.org/wiki/HOWTO:Graphics
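
A minimal sketch of option 2, under some loud assumptions: each feature's source_tag holds the program name, the colour names below are arbitrary picks, and the sub name color_by_program is made up. The add_track call is shown only as a comment, since the rest of the script isn't reproduced here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One fixed, easily distinguished colour per program (example choices).
my %color_for = (
    ACTUAL   => 'black',
    ALIGNACE => 'blue',
    BP       => 'green',
    MDSCAN   => 'orange',
    MEME     => 'red',
    MITRA    => 'purple',
    MTSAMP   => 'yellow',
    SPACE    => 'cyan',
);

# Bio::Graphics passes the feature as the first argument to option callbacks.
# Assumes the feature's source_tag is the program name.
sub color_by_program {
    my ($feature) = @_;
    my $program = uc( $feature->source_tag || '' );
    return exists $color_for{$program} ? $color_for{$program} : 'gray';
}

# Hypothetical use inside the existing loop (not run here):
#   my $track = $panel->add_track(
#       -glyph   => 'generic',             # flat colour, no score shading
#       -bgcolor => \&color_by_program,
#       -key     => "SEQ " . $seqid,
#       -bump    => +1,
#       -height  => 8,
#   );
```

With a flat glyph the colour depends only on the program, so -min_score and -max_score are no longer needed for the colouring.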

chris


From cjfields at uiuc.edu  Thu Aug 16 10:12:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 09:12:30 -0500
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C41FEC.2000206@sendu.me.uk>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
Message-ID: <5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>


On Aug 16, 2007, at 4:59 AM, Sendu Bala wrote:

> neeti somaiya wrote:
>> I tried using Bio::Structure::IO::pdb with some code like :-
>> use Bio::Structure::IO;
>>
>>     $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
>>                                    -format => 'pdb');
>>
>>     while ( my $struc = $in->next_structure() ) {
>>        print "Structure ", $struc->id,"\n";
>>     }
>>
>> It works well. But I am not able to find documentation of other  
>> methods
>> which will give me various specific details available in a pdb  
>> file, right
>> from title, keywords, references to structure details, atoms,  
>> coordinates
>> etc. There must be different methods to fetch and parse each of  
>> this data
>> from a pdb file, right? Where can I find the details?
>
> $struc is a Bio::Structure::Entry, so look at the docs for that:
> http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html
>
> You'll probably want to look at the docs for the other Structure  
> modules
> as well:
> http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
>
>
> I agree, the documentation in this area could be improved.
> Bio::Structure::StructureI could actually contain something, and
> Bio::Structure should actually exist or not be referenced in the docs.

There was a discussion a while back on refactoring the code within  
Bio::Structure to better deal with HETATM and other stuff.  As far as  
I'm concerned it's open for anyone who wants to tinker with it.

chris


From cjfields at uiuc.edu  Thu Aug 16 10:37:31 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 09:37:31 -0500
Subject: [Bioperl-l] Announcement: infernal/erpin/rnamotif parsers
Message-ID: <7CE60504-FA1A-4AFF-A02E-036B8E37C3F9@uiuc.edu>

To anyone using the aforementioned parsers:

I don't plan on continuing development of the Bio::Tools-related  
Infernal, RNAMotif, and ERPIN parsers at this time unless there is  
substantial interest in doing so.  Instead, I plan on focusing my  
efforts on the Bio::SearchIO-based parsers as I feel they are much  
better at representing the data present in the output.  In my opinion  
having two sets of parsers that accomplish essentially the same task  
is redundant and non-productive.  Again, if there is considerable  
interest in keeping them I suggest responding to this message,  
otherwise I would consider them deprecated and removed completely by  
rel 1.7 (maybe sooner).

Infernal: It's very likely that a new stable version (v. 1.0) of  
Infernal will be released in the near future.  I may upgrade the  
Bio::SearchIO-based parser in the meantime to parse the latest  
Infernal output (v 0.81), but I don't plan on supporting pre-1.0  
releases once the final version is out.  Infernal has been in  
developer release for some time now and the program output has  
changed dramatically over time; however, the format is expected to  
solidify once a stable release is made, which makes supporting the  
parser much easier over time.

Questions?  Gripes?

chris


From awitney at sgul.ac.uk  Thu Aug 16 10:07:02 2007
From: awitney at sgul.ac.uk (Adam Witney)
Date: Thu, 16 Aug 2007 15:07:02 +0100
Subject: [Bioperl-l] Request for Activeperl software
In-Reply-To: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
Message-ID: <C2EA1896.17575%awitney@sgul.ac.uk>


This would be the best place to start

http://www.activeperl.org/

Or more specifically for the language:

http://www.activeperl.org/store/activeperl/download/

(Which will require you to register with them)

adam


On 16/8/07 04:45, "pratchusha kamireddy" <pratchusha.kamireddy at aamu.edu>
wrote:

> Hello
>   I am Pratchusha Kamireddy doing masters in Alabama A&M University. I am
> working under Dr.Kantety in Plant and Soil Science Department.I am the
> beginner to learn perl programming. I need Activeperl software to run the perl
> programs. Can you help me in this regard like: where can I dowmload this
> software, how can i Install this and how can i use this. I am eagerlu waiting
> for your reply.Please help me in this regard.
>    Thanking you
>    Pratchusha Kamireddy
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From muratem at eng.uah.edu  Thu Aug 16 15:10:34 2007
From: muratem at eng.uah.edu (muratem at eng.uah.edu)
Date: Thu, 16 Aug 2007 14:10:34 -0500 (CDT)
Subject: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
Message-ID: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>

Hello

This might not be the correct list for this particular problem, but
hopefully someone can help. I am trying to install ...staden::read on a
Mac OS X 10.4. I tried installing cpan but it wouldn't work so I went to
the manual methods. Perl is on the system and appears to be installed
correctly for a Mac. Bioperl 1.5.2 was installed via fink and appears to
be OK also. I'm trying to install the Bio::SeqIO::staden::read module. I
downloaded the bioperl-ext-1.5.1 tarball from bioperl.org, did the usual
perl Makefile.PL and make and get:

newyork:/usr/local/bioperl-ext-1.5.1 root# make
Makefile:1148: *** multiple target patterns.  Stop.

A snippet from the Makefile...

   1148 pm_to_blib: $(TO_INST_PM)
   1149         $(NOECHO) $(PERLRUN) -MExtUtils::Install -e
'pm_to_blib({@ARGV}, '\''$(INST_LIB)/auto'\'', '\''$(PM_FILTER)'\'')'\
   1150           Bio/Ext/Align/libs/hscore.h
$(INST_LIB)/Bio/Ext/Align/libs/hscore.h \
   1151           Bio/Ext/Align/libs/probability.c
$(INST_LIB)/Bio/Ext/Align/libs/probability.c \
   1152           Bio/Ext/Align/libs/linesubs.h
$(INST_LIB)/Bio/Ext/Align/libs/linesubs.h \
   1153           Bio/Ext/Align/test.pl $(INST_LIB)/Bio/Ext/Align/test.pl \
   1154           Bio/Ext/Align/libs/wiseoverlay.h
$(INST_LIB)/Bio/Ext/Align/libs/wiseoverlay.h \
   1155           Bio/Ext/Align/libs/proteinsw.h
$(INST_LIB)/Bio/Ext/Align/libs/proteinsw.h \
   1156           Bio/Ext/Align/libs/wisebase.h
$(INST_LIB)/Bio/Ext/Align/libs/wisebase.h \
   1157           Bio/Ext/Align/libs/seqaligndisplay.h
$(INST_LIB)/Bio/Ext/Align/libs/seqaligndisplay.h \
   1158           Bio/Ext/Align/libs/dyna.h
$(INST_LIB)/Bio/Ext/Align/libs/dyna.h \

The README says you don't have to build the whole package, so I descended
to the staden directory and did a Make and didn't get any problems
reported. But when I did a make test I get:

newyork:/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden root# make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
"test_harness(0, '../blib/lib', '../blib/arch')" test.pl
test....Had problems bootstrapping Inline module 'Bio::SeqIO::staden::read'

Can't load
'/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/SeqIO/staden/read/read.bundle'
for module Bio::SeqIO::staden::read:
dlopen(/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/SeqIO/staden/read/read.bundle,
2): Symbol not found: _curl_easy_init
  Referenced from:
/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/SeqIO/staden/read/read.bundle
  Expected in: dynamic lookup
 at /Library/Perl/5.8.6/Inline.pm line 500


 at test.pl line 0
INIT failed--call queue aborted, <DATA> line 1.
test....dubious
        Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 1-94
        Failed 94/94 tests, 0.00% okay
Failed Test Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
test.pl      255 65280    94  188 200.00%  1-94
Failed 1/1 test scripts, 0.00% okay. 94/94 subtests failed, 0.00% okay.
make: *** [test_dynamic] Error 2

The missing symbol is apparently from libcurl. I have both libcurl.2.dylib
and libcurl.3.dylib with copies in multiple locations including /usr/lib,
/usr/local/lib and the usual Mac directories. I used the Mac otool to look
at the externals in read.bundle and it references libz.1.dylib and
libSystem.B.dylib. Could this be a case where there should have been a
link to libcurl and wasn't?

I've searched the list and see only the Inline versioning problem (which I
had and fixed). Has anybody seen this problem before or built the module
on a Mac? How did you do it? Is this a question for the Staden list on
sourceforge?

Thanks

Mike


From cjfields at uiuc.edu  Thu Aug 16 15:55:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 14:55:05 -0500
Subject: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
In-Reply-To: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>
References: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>
Message-ID: <9BBC30AD-9AFE-4D52-88E4-656D9EB8924E@uiuc.edu>


On Aug 16, 2007, at 2:10 PM, muratem at eng.uah.edu wrote:

> Hello
>
> This might not be the correct list for this particular problem, but
> hopefully someone can help. I am trying to install ...staden::read  
> on a
> Mac OS X 10.4. I tried installing cpan but it wouldn't work so I  
> went to
> the manual methods. Perl is on the system and appears to be installed
> correctly for a Mac. Bioperl 1.5.2 was installed via fink and  
> appears to
> be OK also. I'm trying to install the Bio::SeqIO::staden::read  
> module. I
> downloaded the bioperl-ext-1.5.1 tarball from bioperl.org, did the  
> usual
> perl Makefile.PL and make and get:
>
> newyork:/usr/local/bioperl-ext-1.5.1 root# make
> Makefile:1148: *** multiple target patterns.  Stop.
>
> A snippet from the Makefile...
>
>    1148 pm_to_blib: $(TO_INST_PM)
>    1149         $(NOECHO) $(PERLRUN) -MExtUtils::Install -e
> 'pm_to_blib({@ARGV}, '\''$(INST_LIB)/auto'\'', '\''$(PM_FILTER)'\'')'\
>    1150           Bio/Ext/Align/libs/hscore.h
> $(INST_LIB)/Bio/Ext/Align/libs/hscore.h \
>    1151           Bio/Ext/Align/libs/probability.c
> $(INST_LIB)/Bio/Ext/Align/libs/probability.c \
>    1152           Bio/Ext/Align/libs/linesubs.h
> $(INST_LIB)/Bio/Ext/Align/libs/linesubs.h \
>    1153           Bio/Ext/Align/test.pl $(INST_LIB)/Bio/Ext/Align/ 
> test.pl \
>    1154           Bio/Ext/Align/libs/wiseoverlay.h
> $(INST_LIB)/Bio/Ext/Align/libs/wiseoverlay.h \
>    1155           Bio/Ext/Align/libs/proteinsw.h
> $(INST_LIB)/Bio/Ext/Align/libs/proteinsw.h \
>    1156           Bio/Ext/Align/libs/wisebase.h
> $(INST_LIB)/Bio/Ext/Align/libs/wisebase.h \
>    1157           Bio/Ext/Align/libs/seqaligndisplay.h
> $(INST_LIB)/Bio/Ext/Align/libs/seqaligndisplay.h \
>    1158           Bio/Ext/Align/libs/dyna.h
> $(INST_LIB)/Bio/Ext/Align/libs/dyna.h \
>
> The README says you don't have to build the whole package, so I  
> descended
> to the staden directory and did a Make and didn't get any problems
> reported. But when I did a make test I get:
>
> newyork:/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden root# make test
> PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
> "test_harness(0, '../blib/lib', '../blib/arch')" test.pl
> test....Had problems bootstrapping Inline module  
> 'Bio::SeqIO::staden::read'
>
> Can't load
> '/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/ 
> Bio/SeqIO/staden/read/read.bundle'
> for module Bio::SeqIO::staden::read:
> dlopen(/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/ 
> auto/Bio/SeqIO/staden/read/read.bundle,
> 2): Symbol not found: _curl_easy_init
>   Referenced from:
> /usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/ 
> SeqIO/staden/read/read.bundle
>   Expected in: dynamic lookup
>  at /Library/Perl/5.8.6/Inline.pm line 500
>
>
>  at test.pl line 0
> INIT failed--call queue aborted, <DATA> line 1.
> test....dubious
>         Test returned status 255 (wstat 65280, 0xff00)
> DIED. FAILED tests 1-94
>         Failed 94/94 tests, 0.00% okay
> Failed Test Stat Wstat Total Fail  Failed  List of Failed
> ---------------------------------------------------------------------- 
> ---------
> test.pl      255 65280    94  188 200.00%  1-94
> Failed 1/1 test scripts, 0.00% okay. 94/94 subtests failed, 0.00%  
> okay.
> make: *** [test_dynamic] Error 2
>
> The missing symbol is apparently from libcurl. I have both libcurl. 
> 2.dylib
> and libcurl.3.dylib with copies in multiple locations including / 
> usr/lib,
> /usr/local/lib and the usual Mac directories. I used the Mac otool  
> to look
> at the externals in read.bundle and it references libz.1.dylib and
> libSystem.B.dylib. Could this be a case where there should have been a
> link to libcurl and wasn't?
>
> I've searched the list and see only the Inline versioning problem  
> (which I
> had and fixed). Has anybody seen this problem before or built the  
> module
> on a Mac? How did you do it? Is this a question for the Staden list on
> sourceforge?
>
> Thanks
>
> Mike

Haven't seen the problem you list.  I have installed it on Mac OS X  
(intel) w/o problems so I know it works; at least all tests passed  
though I remember Inline complaining for some reason.

You should try using bioperl-ext from CVS (it is really 1.5.1 but  
with updated docs and maybe a change or two).  The process is a  
little tricky but is documented in the README in the package.  You'll  
need the old io_lib (1.8.12 or earlier) from Staden if memory serves.

chris


From zhaodj at ioz.ac.cn  Thu Aug 16 22:13:16 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Fri, 17 Aug 2007 10:13:16 +0800 (CST)
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
Message-ID: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>

Dear list members,

I have a question about the methods of bioperl objects: how and
where can we get the complete list of methods of a bioperl object?

Take Bio::Tools::Run::RemoteBlast for example. In the synopsis of
this module, some sample code is given. The following five statements
are excerpted from the synopsis.
(1) my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
(2) while ( my @rids = $factory->each_rid ) {
(3) $factory->remove_rid($rid);
(4) my $rc = $factory->retrieve_blast($rid);
(5) my $r = $factory->submit_blast($input);

The five statements use five methods of the RemoteBlast object, i.e.
(1) new, (2) each_rid, (3) remove_rid, (4) retrieve_blast, and
(5) submit_blast. However, I find that only some of them (4 and 5) are
listed in the appendix, while the others (1, 2 and 3) are absent. Are
there more methods that are not explicitly documented? I don't know.
This leads to only a partial understanding and use of the module.
Therefore I have come here to ask how to get the full list of methods
of a bioperl object.

Thanks!
-- 
De-Jian Zhao
Institute of Zoology,Chinese Academy of Sciences
+86-10-64807217
zhaodj at ioz.ac.cn


From neetisomaiya at gmail.com  Fri Aug 17 02:23:08 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 17 Aug 2007 11:53:08 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
Message-ID: <764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>

Hi,

My main concern is just the pdb id and title. PDB id I am able to fetch
easily, but is there a method which can give me the title of the PDB
structure?

Like for example from the following :-

HEADER    DNA/RNA                                 05-DEC-94   100D
TITLE     CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
TITLE    2 R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO
TITLE    3 PHOSPHATE ONLY AND MINOR GROOVE TERTIARY BASE-PAIRING
COMPND    MOL_ID: 1;
COMPND   2 MOLECULE: DNA/RNA (5'-R(*CP*)-D(*CP*GP*GP*CP*GP*CP*CP*GP*)-
COMPND   3 R(*G)-3');
COMPND   4 CHAIN: A, B;
.
.
.
.

I just want "CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO PHOSPHATE ONLY AND
MINOR GROOVE TERTIARY BASE-PAIRING".

Thanks,
Neeti.

On 8/16/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
>
> On Aug 16, 2007, at 4:59 AM, Sendu Bala wrote:
>
> > neeti somaiya wrote:
> >> I tried using Bio::Structure::IO::pdb with some code like :-
> >> use Bio::Structure::IO;
> >>
> >>     $in  = Bio::Structure::IO->new(-file => " pdb100d.ent",
> >>                                    -format => 'pdb');
> >>
> >>     while ( my $struc = $in->next_structure() ) {
> >>        print "Structure ", $struc->id,"\n";
> >>     }
> >>
> >> It works well. But I am not able to find documentation of other
> >> methods
> >> which will give me various specific details available in a pdb
> >> file, right
> >> from title, keywords, references to structure details, atoms,
> >> coordinates
> >> etc. There must be different methods to fetch and parse each of
> >> this data
> >> from a pdb file, right? Where can I find the details?
> >
> > $struc is a Bio::Structure::Entry, so look at the docs for that:
> > http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html
> >
> > You'll probably want to look at the docs for the other Structure
> > modules
> > as well:
> > http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
> >
> >
> > I agree, the documentation in this area could be improved.
> > Bio::Structure::StructureI could actually contain something, and
> > Bio::Structure should actually exist or not be referenced in the docs.
>
> There was a discussion a while back on refactoring the code within
> Bio::Structure to better deal with HETATM and other stuff.  As far as
> I'm concerned it's open for anyone wanting to tinker with it.
>
> chris
>


-- 
-Neeti
Even my blood says, B positive


From alexl at users.sourceforge.net  Fri Aug 17 03:22:16 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Fri, 17 Aug 2007 00:22:16 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
Message-ID: <cg3ayi39sn.fsf@allele2.localdomain>

Hi all,

I'd like to clarify the license of bioperl.  Currently the LICENSE file
only includes the text of the Artistic license.  But the wiki
http://www.bioperl.org/wiki/FAQ#What_are_the_license_terms_for_BioPerl.3F
says:

 BioPerl is licensed under the same terms as Perl itself which is the
 Perl Artistic License (see
 http://www.perl.com/pub/a/language/misc/Artistic.html or
 http://www.opensource.org/licenses/artistic-license.html

and most of the modules in the source say:

 "You may distribute this module under the same terms as perl itself"

But the current distribution of Perl is actually dually-licensed under
the GPL or Artistic licenses (so the wiki is technically out of sync
with the "same terms as Perl itself"), see:

 http://dev.perl.org/licenses/

I assume that the intent of the bioperl authors is to license with the
same terms as Perl's *current* license (which would mean bioperl is
really effectively dually-licensed under the GPL or Artistic license).
If so, it would be good if the LICENSE text and the wiki were updated
to reflect this.

Also some of the source modules say "under the same terms as perl
itself", but then only mention the Artistic license.

This has important ramifications for distribution: I maintain the
Fedora package for bioperl and I have currently listed the license of
bioperl as "GPL or Artistic".  But if bioperl were distributed under
the Artistic license only then I would have to pull the package from
the distribution, because the Artistic 1.0 (original)-only license is
deprecated (but "GPL or Artistic" is OK):

http://fedoraproject.org/wiki/Licensing#head-d8cc605dd386091c8b6be97b8a43fb6a5d624ae1

Thanks!

Alex


From alexl at users.sourceforge.net  Fri Aug 17 03:42:07 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Fri, 17 Aug 2007 00:42:07 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <cg3ayi39sn.fsf@allele2.localdomain> (Alex Lancaster's message of
	"Fri\, 17 Aug 2007 00\:22\:16 -0700")
References: <cg3ayi39sn.fsf@allele2.localdomain>
Message-ID: <nrsl6i1ub4.fsf@allele2.localdomain>

>>>>> "AL" == Alex Lancaster  writes:

[...]

AL> I assume that the intent of the bioperl authors is to license with
AL> the same terms as Perl's *current* license (which would mean
AL> bioperl is really effectively dually-licensed under the GPL or
AL> Artistic license).  If so, it would be good if the LICENSE text
AL> and the wiki were updated to reflect this.

Also note that since Perl's license is a dual-license "GPL or
Artistic" then people aren't required to submit their modifications
back to the bioperl distribution because they can choose to follow the
Artistic (rather than the GPL) license which doesn't require
modifications to be submitted back.  This means the point:

 "If you fix bugs, please let us know about them. This is not the GPL
 license so you are not required to submit the code fixes, but in the
 spirit of making a better product we hope you'll contribute back to
 the community any insight or code improvements."

listed here:

 http://www.bioperl.org/wiki/Licensing_BioPerl

would still stand, because you can choose the Artistic license, but
you could modify the clause to say:

 "If you fix bugs, please let us know about them. Because Bioperl is
 dual-licensed under the GPL or Artistic licenses, you can choose the
 Artistic license, which means that you are not required to submit the
 code fixes, but in the spirit of making a better product we hope
 you'll contribute back to the community any insight or code
 improvements."


From n.haigh at sheffield.ac.uk  Fri Aug 17 06:27:43 2007
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 17 Aug 2007 11:27:43 +0100
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
In-Reply-To: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
References: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
Message-ID: <46C5781F.60301@sheffield.ac.uk>

De-Jian,ZHAO wrote:
> Dear list members,
>
> I have a question about the methods of bioperl objects: how and
> where can we get the complete list of methods of a bioperl object?
>
> Take Bio::Tools::Run::RemoteBlast for example. In the synopsis of
> this module, some sample code is given. The following five statements
> are excerpted from the synopsis.
> (1) my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> (2) while ( my @rids = $factory->each_rid ) {
> (3) $factory->remove_rid($rid);
> (4) my $rc = $factory->retrieve_blast($rid);
> (5) my $r = $factory->submit_blast($input);
>
> The five statements use five methods of the RemoteBlast object, i.e.
> (1) new, (2) each_rid, (3) remove_rid, (4) retrieve_blast, and
> (5) submit_blast. However, I find that only some of them (4 and 5) are
> listed in the appendix, while the others (1, 2 and 3) are absent. Are
> there more methods that are not explicitly documented? I don't know.
> This leads to only a partial understanding and use of the module.
> Therefore I have come here to ask how to get the full list of methods
> of a bioperl object.
>
> Thanks!
>   


You should check out the Deobfuscator at:
http://bioperl.org/cgi-bin/deob_interface.cgi

Search for and choose the object of interest, e.g. Bio::Tools::Run::RemoteBlast

You will be provided a list of methods available to that object,
including all the methods up the inheritance hierarchy. Unfortunately,
some bioperl modules are documented more thoroughly than others.

Nath
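
As a complement to the Deobfuscator, the methods a class can see can also be listed from within Perl by walking the symbol table and @ISA. What follows is only a rough sketch: list_methods and the My::Base / My::Child toy classes are hypothetical names used for illustration, not part of BioPerl, and the walk is a simple depth-first, left-to-right search that will also report imported functions as if they were methods.

```perl
use strict;
use warnings;

# Toy classes standing in for a module and its parent (hypothetical names).
package My::Base;
sub new      { my $class = shift; return bless {}, $class }
sub each_rid { return () }

package My::Child;
our @ISA = ('My::Base');
sub submit_blast { return 1 }

package main;

# Return a hash of method name => defining class for $class and its ancestors.
sub list_methods {
    my ( $class, $seen ) = @_;
    $seen ||= {};
    return () if $seen->{$class}++;    # guard against visiting a class twice

    no strict 'refs';                  # symbolic access to the symbol table
    my %methods;
    for my $name ( keys %{"${class}::"} ) {
        $methods{$name} = $class if defined &{"${class}::${name}"};
    }
    for my $parent ( @{"${class}::ISA"} ) {
        my %inherited = list_methods( $parent, $seen );
        for my $name ( keys %inherited ) {
            # a method already found lower in the hierarchy takes precedence
            $methods{$name} = $inherited{$name} unless exists $methods{$name};
        }
    }
    return %methods;
}

my %methods = list_methods('My::Child');
printf "%-14s defined in %s\n", $_, $methods{$_} for sort keys %methods;
```

For a real module, the class would need to be loaded first (for example with require) before passing its name to list_methods. Methods generated at run time, such as those handled by AUTOLOAD, would not show up in this listing.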


From neetisomaiya at gmail.com  Fri Aug 17 06:42:09 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 17 Aug 2007 16:12:09 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
	<764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
Message-ID: <764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>

Hi,

I have done it currently as follows :

while ( my $struc = $in->next_structure() )
{
    my $title;

    my $pdb_id = $struc->id;
    print "Structure ", $pdb_id, "\n";

    my $ac = $struc->annotation();

    foreach my $key ( $ac->get_all_annotation_keys() )
    {
        if ( $key eq "title" )
        {
            my @values = $ac->get_Annotations($key);
            foreach my $value (@values)
            {
                $title = $value->as_text;
                chomp($title);
                if ( $title =~ /Value\: (.*)/ )
                {
                    $title = $1;
                }
                $title =~ s/\s+/ /g;

                print "Title ", $title, "\n";
                last;
            }
            last;
        }
    }
}

Is this ok?

On 8/17/07, neeti somaiya <neetisomaiya at gmail.com> wrote:
>
> Hi,
>
> My main concern is just the pdb id and title. PDB id I am able to fetch
> easily, but is there a method which can give me the title of the PDB
> structure?
>
> Like for example from the following :-
>
> HEADER    DNA/RNA                                 05-DEC-94   100D
> TITLE     CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
> TITLE    2 R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO
> TITLE    3 PHOSPHATE ONLY AND MINOR GROOVE TERTIARY BASE-PAIRING
> COMPND    MOL_ID: 1;
> COMPND   2 MOLECULE: DNA/RNA (5'-R(*CP*)-D(*CP*GP*GP*CP*GP*CP*CP*GP*)-
> COMPND   3 R(*G)-3');
> COMPND   4 CHAIN: A, B;
> .
> .
> .
> .
>
> I just want "CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
> R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO PHOSPHATE ONLY AND
> MINOR GROOVE TERTIARY BASE-PAIRING".
>
> Thanks,
> Neeti.
>
> On 8/16/07, Chris Fields <cjfields at uiuc.edu> wrote:
> >
> >
> > On Aug 16, 2007, at 4:59 AM, Sendu Bala wrote:
> >
> > > neeti somaiya wrote:
> > >> I tried using Bio::Structure::IO::pdb with some code like :-
> > >> use Bio::Structure::IO;
> > >>
> > >>     $in  = Bio::Structure::IO->new(-file => " pdb100d.ent",
> > >>                                    -format => 'pdb');
> > >>
> > >>     while ( my $struc = $in->next_structure() ) {
> > >>        print "Structure ", $struc->id,"\n";
> > >>     }
> > >>
> > >> It works well. But I am not able to find documentation of other
> > >> methods
> > >> which will give me various specific details available in a pdb
> > >> file, right
> > >> from title, keywords, references to structure details, atoms,
> > >> coordinates
> > >> etc. There must be different methods to fetch and parse each of
> > >> this data
> > >> from a pdb file, right? Where can I find the details?
> > >
> > > $struc is a Bio::Structure::Entry, so look at the docs for that:
> > > http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html
> > >
> > > You'll probably want to look at the docs for the other Structure
> > > modules
> > > as well:
> > > http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
> > >
> > >
> > > I agree, the documentation in this area could be improved.
> > > Bio::Structure::StructureI could actually contain something, and
> > > Bio::Structure should actually exist or not be referenced in the docs.
> >
> >
> > There was a discussion a while back on refactoring the code within
> > Bio::Structure to better deal with HETATM and other stuff.  As far as
> > I'm concerned it's open for anyone wanting to tinker with it.
> >
> > chris
> >
>
>
>
> --
> -Neeti
> Even my blood says, B positive
>


-- 
-Neeti
Even my blood says, B positive


From bix at sendu.me.uk  Fri Aug 17 09:35:01 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 17 Aug 2007 14:35:01 +0100
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	
	<46C41FEC.2000206@sendu.me.uk>	
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>	
	<764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
	<764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>
Message-ID: <46C5A405.2070005@sendu.me.uk>

neeti somaiya wrote:
> Hi,
> 
> I have done it currently as follows :
[snip]
> Is this ok?

If it works, of course. There seems to be some redundant code there, 
however. I'm guessing this would be better (assuming your code worked in 
the first place):

while (my $struc = $in->next_structure()) {
     my $pdb_id = $struc->id;
     print "Structure ", $pdb_id,"\n";

     my $ac = $struc->annotation();
     my ($title) = $ac->get_Annotations('title');
     $title = $title->as_text;
     chomp($title);
     if ($title =~ /Value\: (.*)/) {
         $title = $1;
     }
     $title =~ s/\s+/ /g;

     print "Title ",$title,"\n";
}
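
The clean-up steps in the loop above (dropping the "Value: " label and normalising whitespace) can be tried on their own without BioPerl. This is only a sketch: clean_title is a made-up helper, and the input string is a hypothetical imitation of the "Value: ..." text that the regex in the thread expects. One difference from the code above: the capture with (.*) stops at the first newline, whereas removing the prefix keeps any later lines. Whether as_text ever returns embedded newlines is not established in this thread.

```perl
use strict;
use warnings;

# Strip a leading "Value: " label (if present) and normalise whitespace.
sub clean_title {
    my ($text) = @_;
    return unless defined $text;    # nothing to clean
    chomp $text;
    $text =~ s/^Value:\s*//;        # drop the label instead of capturing after it
    $text =~ s/\s+/ /g;             # collapse runs of whitespace, including newlines
    $text =~ s/^ | $//g;            # trim a leading or trailing space
    return $text;
}

# Hypothetical input imitating the labelled text seen in the thread.
print clean_title("Value: CRYSTAL STRUCTURE OF  THE HIGHLY\nDISTORTED CHIMERIC DECAMER\n"), "\n";
```

The early return also covers the case where no title annotation was found, which the loop above would otherwise hit as a method call on an undefined value.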


From muratem at eng.uah.edu  Fri Aug 17 10:03:22 2007
From: muratem at eng.uah.edu (Mike Muratet)
Date: Fri, 17 Aug 2007 09:03:22 -0500 (CDT)
Subject: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
In-Reply-To: <9BBC30AD-9AFE-4D52-88E4-656D9EB8924E@uiuc.edu>
References: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>
	<9BBC30AD-9AFE-4D52-88E4-656D9EB8924E@uiuc.edu>
Message-ID: <Pine.GSO.4.60.0708170902570.23859@eng.uah.edu>


On Thu, 16 Aug 2007, Chris Fields wrote:

> Date: Thu, 16 Aug 2007 14:55:05 -0500
> From: Chris Fields <cjfields at uiuc.edu>
> To: muratem at eng.uah.edu
> Cc: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
> 
>
> On Aug 16, 2007, at 2:10 PM, muratem at eng.uah.edu wrote:
>
>> Hello
>> 
>> This might not be the correct list for this particular problem, but
>> hopefully someone can help. I am trying to install ...staden::read on a
>> Mac OS X 10.4. I tried installing cpan but it wouldn't work so I went to
>> the manual methods. Perl is on the system and appears to be installed
>> correctly for a Mac. Bioperl 1.5.2 was installed via fink and appears to
>> be OK also. I'm trying to install the Bio::SeqIO::staden::read module. I
>> downloaded the bioperl-ext-1.5.1 tarball from bioperl.org, did the usual
>> perl Makefile.PL and make and get:
>> 
>> newyork:/usr/local/bioperl-ext-1.5.1 root# make
>> Makefile:1148: *** multiple target patterns.  Stop.
>> 
>> A snippet from the Makefile...
>> 
>>    1148 pm_to_blib: $(TO_INST_PM)
>>    1149         $(NOECHO) $(PERLRUN) -MExtUtils::Install -e
>> 'pm_to_blib({@ARGV}, '\''$(INST_LIB)/auto'\'', '\''$(PM_FILTER)'\'')'\
>>    1150           Bio/Ext/Align/libs/hscore.h
>> $(INST_LIB)/Bio/Ext/Align/libs/hscore.h \
>>    1151           Bio/Ext/Align/libs/probability.c
>> $(INST_LIB)/Bio/Ext/Align/libs/probability.c \
>>    1152           Bio/Ext/Align/libs/linesubs.h
>> $(INST_LIB)/Bio/Ext/Align/libs/linesubs.h \
>>    1153           Bio/Ext/Align/test.pl $(INST_LIB)/Bio/Ext/Align/test.pl 
>> \
>>    1154           Bio/Ext/Align/libs/wiseoverlay.h
>> $(INST_LIB)/Bio/Ext/Align/libs/wiseoverlay.h \
>>    1155           Bio/Ext/Align/libs/proteinsw.h
>> $(INST_LIB)/Bio/Ext/Align/libs/proteinsw.h \
>>    1156           Bio/Ext/Align/libs/wisebase.h
>> $(INST_LIB)/Bio/Ext/Align/libs/wisebase.h \
>>    1157           Bio/Ext/Align/libs/seqaligndisplay.h
>> $(INST_LIB)/Bio/Ext/Align/libs/seqaligndisplay.h \
>>    1158           Bio/Ext/Align/libs/dyna.h
>> $(INST_LIB)/Bio/Ext/Align/libs/dyna.h \
>> 
>> The README says you don't have to build the whole package, so I descended
>> to the staden directory and did a Make and didn't get any problems
>> reported. But when I did a make test I get:
>> 
>> newyork:/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden root# make test
>> PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
>> "test_harness(0, '../blib/lib', '../blib/arch')" test.pl
>> test....Had problems bootstrapping Inline module 
>> 'Bio::SeqIO::staden::read'
>> 
>> Can't load
>> '/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/ 
>> Bio/SeqIO/staden/read/read.bundle'
>> for module Bio::SeqIO::staden::read:
>> dlopen(/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/ 
>> auto/Bio/SeqIO/staden/read/read.bundle,
>> 2): Symbol not found: _curl_easy_init
>>   Referenced from:
>> /usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/ 
>> SeqIO/staden/read/read.bundle
>>   Expected in: dynamic lookup
>>  at /Library/Perl/5.8.6/Inline.pm line 500
>> 
>> 
>>  at test.pl line 0
>> INIT failed--call queue aborted, <DATA> line 1.
>> test....dubious
>>         Test returned status 255 (wstat 65280, 0xff00)
>> DIED. FAILED tests 1-94
>>         Failed 94/94 tests, 0.00% okay
>> Failed Test Stat Wstat Total Fail  Failed  List of Failed
>> ---------------------------------------------------------------------- 
>> ---------
>> test.pl      255 65280    94  188 200.00%  1-94
>> Failed 1/1 test scripts, 0.00% okay. 94/94 subtests failed, 0.00% okay.
>> make: *** [test_dynamic] Error 2
>> 
>> The missing symbol is apparently from libcurl. I have both libcurl.2.dylib
>> and libcurl.3.dylib with copies in multiple locations including /usr/lib,
>> /usr/local/lib and the usual Mac directories. I used the Mac otool to look
>> at the externals in read.bundle and it references libz.1.dylib and
>> libSystem.B.dylib. Could this be a case where there should have been a
>> link to libcurl and wasn't?
>> 
>> I've searched the list and see only the Inline versioning problem (which I
>> had and fixed). Has anybody seen this problem before or built the module
>> on a Mac? How did you do it? Is this a question for the Staden list on
>> sourceforge?
>> 
>> Thanks
>> 
>> Mike
>
> Haven't seen the problem you list.  I have installed it on Mac OS X (intel) 
> w/o problems so I know it works; at least all tests passed though I remember 
> Inline complaining for some reason.
>
> You should try using bioperl-ext from CVS (it is really 1.5.1 but with 
> updated docs and maybe a change or two).  The process is a little tricky but 
> is documented in the README in the package.  You'll need the old io_lib 
> (1.8.12 or earlier) from Staden if memory serves.
>
> chris
>

Thanks, I'll give that a try.

Mike


From alexl at users.sourceforge.net  Fri Aug 17 11:23:33 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Fri, 17 Aug 2007 08:23:33 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	(Kevin Brown's message of "Fri\, 17 Aug 2007 08\:11\:40 -0700")
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
Message-ID: <n9ir7e18y2.fsf@allele2.localdomain>

>>>>> "KB" == Kevin Brown  writes:

[...]

>> Also note that since Perl's license is a dual-license "GPL or
>> Artistic" then people aren't required to submit their modifications
>> back to the bioperl distribution because they can choose to follow
>> the Artistic (rather than the GPL) license which doesn't require
>> modifications to be submitted back.  This means the point:

KB> You aren't required to submit patches even under the GPL.  If I
KB> make changes and don't distribute them then I have no requirement
KB> to reveal my changes to the bioperl source code.  Also the GPL
KB> does not require that the code be made freely available to all,
KB> just that users of GPL'd software can request the source from the
KB> vendor/distributor and should not find lots of little hoops to
KB> jump through to get it.  You can even charge to get access if that
KB> charge is to cover the cost of the expense to get it (such as the
KB> cost of a cd + mail delivery charge).

Sure, I was just pointing out that you can avoid even these things if
you choose the Artistic license.  I have no problem with the GPL, but
some people do.  The other possibility (if the current Perl "GPL or
Artistic" is not a possibility) is simply upgrading to the "Artistic
2.0" license adopted by the Perl Foundation for Perl 6 and later (I
think?):

http://www.perlfoundation.org/artistic_license_2_0

it's a GPL-compatible free software license.

Alex


From Kevin.M.Brown at asu.edu  Fri Aug 17 11:11:40 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 17 Aug 2007 08:11:40 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <nrsl6i1ub4.fsf@allele2.localdomain>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
Message-ID: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>

> AL> I assume that the intent of the bioperl authors is to 
> license with 
> AL> the same terms as Perl's *current* license (which would 
> mean bioperl 
> AL> is really effectively dually-licensed under the GPL or Artistic 
> AL> license).  If so, it would be good if the LICENSE text 
> and the wiki 
> AL> were updated to reflect this.
> 
> Also note that since Perl's license is a dual-license "GPL or 
> Artistic" then people aren't required to submit their 
> modifications back to the bioperl distribution because they 
> can choose to follow the Artistic (rather than the GPL) 
> license which doesn't require modifications to be submitted 
> back.  This means the point:

You aren't required to submit patches even under the GPL.  If I make
changes and don't distribute them then I have no requirement to reveal
my changes to the bioperl source code.  Also the GPL does not require
that the code be made freely available to all, just that users of GPL'd
software can request the source from the vendor/distributor and should
not find lots of little hoops to jump through to get it.  You can even
charge to get access if that charge is to cover the cost of the expense
to get it (such as the cost of a cd + mail delivery charge).


From cjfields at uiuc.edu  Fri Aug 17 12:07:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 17 Aug 2007 11:07:47 -0500
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <n9ir7e18y2.fsf@allele2.localdomain>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
Message-ID: <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>


On Aug 17, 2007, at 10:23 AM, Alex Lancaster wrote:

>>>>>> "KB" == Kevin Brown  writes:
>
> [...]
>
>>> Also note that since Perl's license is a dual-license "GPL or
>>> Artistic" then people aren't required to submit their modifications
>>> back to the bioperl distribution because they can choose to follow
>>> the Artistic (rather than the GPL) license which doesn't require
>>> modifications to be submitted back.  This means the point:
>
> KB> You aren't required to submit patches even under the GPL.  If I
> KB> make changes and don't distribute them then I have no requirement
> KB> to reveal my changes to the bioperl source code.  Also the GPL
> KB> does not require that the code be made freely available to all,
> KB> just that users of GPL'd software can request the source from the
> KB> vendor/distributor and should not find lots of little hoops to
> KB> jump through to get it.  You can even charge to get access if that
> KB> charge is to cover the cost of the expense to get it (such as the
> KB> cost of a cd + mail delivery charge).
>
> Sure, I was just pointing out that you can avoid even these things if
> you choose the Artistic license.  I have no problem with the GPL, but
> some people do.  The other possibility (if the current Perl "GPL or
> Artistic" is not a possibility) is simply upgrading to the "Artistic
> 2.0" license adopted by the Perl Foundation for Perl 6 and later (I
> think?):
>
> http://www.perlfoundation.org/artistic_license_2_0
>
> it's a GPL-compatible free software license.
>
> Alex

Switching to Artistic 2.0 is probably the best way to go.  We'll need  
a more involved discussion but I don't think there'll be too many  
objections.  You mention GPL-compatibility; is that for v2 and v3?

chris


From gonzaled at tcd.ie  Fri Aug 17 13:03:35 2007
From: gonzaled at tcd.ie (David Gonzalez)
Date: Fri, 17 Aug 2007 18:03:35 +0100
Subject: [Bioperl-l] Bio::SeqIO::swiss species parsing bug?
Message-ID: <46C5D4E7.6000605@tcd.ie>

	Hi,

	I had a problem with a swissprot file in which the genus and species
were being left undefined, and I believe it could be a bug in the
swiss.pm module.


	When I tried to parse the file with Bio::SeqIO, I got the following
error messages:

Use of uninitialized value in pattern match (m//) at
/sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 965, <GEN0> line 12.
Use of uninitialized value in string eq at
/sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 967, <GEN0> line 12.

	The fields I wanted from the file (gene_id, etc.) were fine, however,
so the file was being parsed.

	I checked the output with Data::Dumper and I found the following in the
species entry; the species is left undefined, and the common name is absent.

    'species' => bless( {
                          '_ncbi_taxid' => 'Not',
                          '_classification' => [
                                                 undef,
                                                 undef,
                                                 'Aedes',
                                                 'Culicini',
                                                 'Culicinae',
                                                 'Culicidae',
                                                 'Culicoidea',
                                                 'Nematocera',
                                                 'Diptera',
                                                 'Endopterygota',
                                                 'Neoptera',
                                                 'Pterygota',
                                                 'Insecta',
                                                 'Hexapoda',
                                                 'Arthropoda',
                                                 'Metazoa',
                                                 'Eukaryota'
                                               ]
                        }, 'Bio::Species' ),

	The species line in the file is formatted according to the swissprot
specifications and includes a common name

OS   Aedes aegypti (yellow fever mosquito)
OC   Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; Neoptera;
OC   Endopterygota; Diptera; Nematocera; Culicoidea; Culicidae; Culicinae;
OC   Culicini; Aedes.
OX   NCBI_TaxID=Not defined;

	I think the problem is in line 905 of the swiss.pm file:

902	if(/^OS\s+(\S.+)/ && (! defined($binomial))) {
903	    $osline .= " " if $osline;
904	    $osline .= $1;
905	    if($osline =~ s/(,|, and|\.)$//) {
906		($binomial, $descr) = $osline =~ /(\S[^\(]+)(.*)/;
907             ($ns_name) = $binomial;
908             $ns_name =~ s/\s+$//; #####


	The problem seems to be that the OS line has no trailing punctuation,
so the substitution on line 905 returns false and the code that sets
$binomial is skipped. I think the swissprot format does not require the
line to end in '.', although it normally does. When I simply removed the
requirement for the substitution to succeed, the output of Data::Dumper
looked normal:

	....
	'_common_name' => 'yellow fever mosquito',
        '_ncbi_taxid' => 'Not',
        '_classification' => [
                              'aegypti',
                              'Aedes',
                              'Culicini',
	....
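
For illustration, a minimal stand-alone sketch of the behaviour described
above. It is not the actual swiss.pm code: the helper names
ends_with_punct() and parse_os() are made up for this example, and only
the two regular expressions are taken from lines 905-906 as quoted. The
point is that the trailing-punctuation strip can be attempted without
making the rest of the parsing conditional on it.

```perl
use strict;
use warnings;

# Hypothetical helper: does the OS text end in one of the separators that
# line 905 of the quoted swiss.pm tries to strip?
sub ends_with_punct {
    my ($osline) = @_;
    return $osline =~ /(,|, and|\.)$/ ? 1 : 0;
}

# Hypothetical helper: strip trailing punctuation if present, then split
# the text into binomial and common name whether or not anything was
# stripped.
sub parse_os {
    my ($osline) = @_;
    $osline =~ s/(,|, and|\.)$//;    # optional, no longer a precondition
    my ($binomial, $descr) = $osline =~ /(\S[^\(]+)(.*)/;
    return unless defined $binomial;
    $binomial =~ s/\s+$//;           # drop the space before '('
    my ($common) = $descr =~ /\(([^)]+)\)/;
    return ($binomial, $common);
}

my $os = 'Aedes aegypti (yellow fever mosquito)';
print "trailing punctuation: ", ends_with_punct($os), "\n";
my ($binomial, $common) = parse_os($os);
print "binomial: $binomial\n";
print "common name: $common\n";
```

Whether the released parser should treat an unterminated OS line this way
is a separate question; the sketch only shows that the two regexes give
the expected genus, species and common name once the condition is relaxed.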

	I am using the fink installed bioperl:
	bioperl-pm586   1.4-5   Perl module for biology

	I don't know if this has  been reported/solved in the newer versions of
bioperl.

	David

-- 
David Gonzalez Knowles
Smurfit Institute of Genetics
Trinity College
Dublin


From cjfields at uiuc.edu  Fri Aug 17 13:20:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 17 Aug 2007 12:20:21 -0500
Subject: [Bioperl-l] Bio::SeqIO::swiss species parsing bug?
In-Reply-To: <46C5D4E7.6000605@tcd.ie>
References: <46C5D4E7.6000605@tcd.ie>
Message-ID: <04912FDE-2AA4-414C-9CE4-A0BA5E9C89C9@uiuc.edu>


On Aug 17, 2007, at 12:03 PM, David Gonzalez wrote:

> 	Hi,
>
> 	I had a problem with a swissprot file in which the genus and species
> were being left undefined, and I believe it could be a bug in the
> swiss.pm module.
>
>
> 	When I tried to parse the file with Bio::SeqIO, I got the following
> error messages:
>
> Use of uninitialized value in pattern match (m//) at
> /sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 965, <GEN0> line 12.
> Use of uninitialized value in string eq at
> /sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 967, <GEN0> line 12.
> ...
> 	I am using the fink installed bioperl:
> 	bioperl-pm586   1.4-5   Perl module for biology
>
> 	I don't know if this has  been reported/solved in the newer  
> versions of
> bioperl.
>
> 	David
>
> -- 
> David Gonzalez Knowles
> Smurfit Institute of Genetics
> Trinity College
> Dublin

That looks like bioperl 1.4, which is several years old.  You should  
update to the latest official release (1.5.2), then see if the  
problem persists.

chris


From alexl at users.sourceforge.net  Sat Aug 18 07:33:34 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Sat, 18 Aug 2007 04:33:34 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu> (Chris Fields's
	message of "Fri\, 17 Aug 2007 11\:07\:47 -0500")
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
	<3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>
Message-ID: <8td4xlyt4h.fsf@allele2.localdomain>

>>>>> "CF" == Chris Fields  writes:

[...]

>> Sure, I was just pointing out that you can avoid even these things
>> if you choose the Artistic license.  I have no problem with the
>> GPL, but some people do.  The other possibility (if the current
>> Perl "GPL or Artistic" is not a possibility) is simply upgrading to
>> the "Artistic 2.0" license adopted by the Perl Foundation for Perl
>> 6 and later (I think?):

>> http://www.perlfoundation.org/artistic_license_2_0

>> it's a GPL-compatible free software license.

CF> Switching to Artistic 2.0 is probably the best way to go.  We'll
CF> need a more involved discussion but I don't think there'll be too
CF> many objections.  You mention GPL-compatibility; is that for v2
CF> and v3?

IANAL, but looking at:

http://www.perlfoundation.org/artistic_2_0_notes

http://www.gnu.org/licenses/license-list.html (scroll down to
"Artistic 2.0")

it looks like you can choose any GPL license (i.e. v1 to v3).

I was really more concerned with clarifying what the bioperl license
was *right now*, because "the same license as Perl" implies the
so-called "disjunctive" "GPL or Artistic license":

http://www.gnu.org/licenses/license-list.html#PerlLicense

which is what I've marked the Fedora package as (since it listed "the
same license as Perl" in most of the source files), which is fine for
Fedora.

Fedora may possibly (still under discussion I believe) require removal
of any package that is licensed under the original (1.0) Artistic
alone and it would be a real shame if that required bioperl being
pulled from the repo.  I imagine the intent of the bioperl
contributors is that it should be under the same terms as Perl,
whatever that happens to be (which just happens to be GPL or Artistic,
which is fine).  A clarification to that effect would be useful.

Cheers,
Alex


From zhaodj at ioz.ac.cn  Sat Aug 18 11:06:41 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Sat, 18 Aug 2007 23:06:41 +0800 (CST)
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
In-Reply-To: <46C5781F.60301@sheffield.ac.uk>
References: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
	<46C5781F.60301@sheffield.ac.uk>
Message-ID: <52869.159.226.67.49.1187449601.squirrel@mail.ioz.ac.cn>

Thank you, Nathan.
The Deobfuscator is very helpful.

On Fri, Aug 17, 2007 18:27, Nathan Haigh wrote:
> De-Jian,ZHAO wrote:
>> Dear list members,
>>
>> I have a question about the methods of bioperl objects: how and
>> where can we get the full list of methods of a bioperl object?
>>
>> Take Bio::Tools::Run::RemoteBlast for example. In the synopsis of
>> this object, some sample code is given. The following five
>> statements are excerpted from the synopsis.
>> (1)my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>> (2)while ( my @rids = $factory->each_rid ) {
>> (3)$factory->remove_rid($rid);
>> (4)my $rc = $factory->retrieve_blast($rid);
>> (5)my $r = $factory->submit_blast($input);
>>
>> The five statements use five methods of the RemoteBlast object, i.e.
>> (1) new, (2) each_rid, (3) remove_rid, (4) retrieve_blast, and
>> (5) submit_blast. However, I find that only some of them (4, 5) are
>> listed in the appendix while the others (1, 2, 3) are absent. Are
>> there more methods that are not explicitly declared? I don't know.
>> This leads to only a partial understanding and use of the module.
>> Therefore I came here to ask how to get the full list of methods of
>> a bioperl object.
>>
>> Thanks!
>>
>
>
> You should check out the Deobfuscator at:
> http://bioperl.org/cgi-bin/deob_interface.cgi
>
> Search and choose the object of choice. e.g.
> Bio::Tools::Run::RemoteBlast
>
> You will be provided a list of methods available to that object,
> including all the methods up the inheritance hierarchy.
> Unfortunately,
> some bioperl modules are documented more thoroughly than others.
>
> Nath
>


From hlapp at gmx.net  Sat Aug 18 12:13:28 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 18 Aug 2007 12:13:28 -0400
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <8td4xlyt4h.fsf@allele2.localdomain>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
	<3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>
	<8td4xlyt4h.fsf@allele2.localdomain>
Message-ID: <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net>


On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote:

> I imagine the intent of the bioperl
> contributors is that it should be under the same terms as Perl,
> whatever that happens to be (which just happens to be GPL or Artistic,
> which is fine).

I fully agree.

>   A clarification to that effect would be useful.

Agreed, too. Would you mind changing that language on the wiki, since  
you seem to have a fairly good grasp on the issue?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Aug 18 12:42:04 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 18 Aug 2007 11:42:04 -0500
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
	<3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>
	<8td4xlyt4h.fsf@allele2.localdomain>
	<8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net>
Message-ID: <D3B67BC2-CB56-420F-B4E3-E0A57FEA7E80@uiuc.edu>


On Aug 18, 2007, at 11:13 AM, Hilmar Lapp wrote:

>
> On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote:
>
>> I imagine the intent of the bioperl
>> contributors is that it should be under the same terms as Perl,
>> whatever that happens to be (which just happens to be GPL or  
>> Artistic,
>> which is fine).
>
> I fully agree.
>
>>   A clarification to that effect would be useful.
>
> Agreed, too. Would you mind changing that language on the wiki, since
> you seem to have a fairly good grasp on the issue?
>
> 	-hilmar

Looks like the modules mostly state 'You may distribute this module  
under the same terms as perl itself', but there are likely a few  
which need to be changed.  Might be worth running a quick code audit  
to see what's present.
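
A rough sketch of what such an audit could look like, using only core
Perl. The helper name modules_missing_license() and the exact phrase it
searches for are assumptions made for this example; modules that word
their license differently would also be reported and would need checking
by hand.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical audit helper: return the .pm files under $dir that do NOT
# contain the usual "same terms as perl itself" wording.
sub modules_missing_license {
    my ($dir) = @_;
    my @missing;
    find(sub {
        return unless -f $_ && /\.pm$/;
        open my $fh, '<', $_ or return;
        local $/;                        # slurp the whole file
        my $text = <$fh>;
        $text = '' unless defined $text;
        $text =~ s/\s+/ /g;              # the phrase may wrap across lines
        push @missing, $File::Find::name
            unless $text =~ /same terms as perl itself/i;
    }, $dir);
    @missing = sort @missing;
    return @missing;
}

# Usage: perl audit.pl /path/to/bioperl-live
if (@ARGV) {
    print "$_\n" for modules_missing_license($ARGV[0]);
}
```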

chris


From avilella at gmail.com  Sat Aug 18 16:38:10 2007
From: avilella at gmail.com (Albert Vilella)
Date: Sat, 18 Aug 2007 21:38:10 +0100
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
In-Reply-To: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
References: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
Message-ID: <358f4d650708181338s5a5caadbscfa85786327f4304@mail.gmail.com>

I particularly like to code and debug at the same time. When you are using
the perl debugger, you can run:

<DB> m $object

and it will list all the methods available for that object.
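
Outside the debugger, something similar can be approximated in plain Perl
by walking the symbol tables of the class and its parents. The sketch
below is only an illustration: list_methods() is a made-up helper, and
My::Base and My::Child are toy classes standing in for a BioPerl module
and its parent. It needs Perl 5.10 or later for the core mro module, and
it will also report imported functions and miss methods resolved through
AUTOLOAD.

```perl
use strict;
use warnings;
use mro;    # core since Perl 5.10; provides mro::get_linear_isa

# Hypothetical helper: map each sub name visible on a class or object to
# the package that defines it, following the inheritance hierarchy.
sub list_methods {
    my ($thing) = @_;
    my $class = ref($thing) || $thing;
    my %seen;
    for my $pkg (@{ mro::get_linear_isa($class) }) {
        no strict 'refs';
        for my $name (keys %{"${pkg}::"}) {
            next unless defined &{"${pkg}::${name}"};
            $seen{$name} ||= $pkg;    # nearest class in the hierarchy wins
        }
    }
    return \%seen;
}

# Toy classes for the demonstration.
{
    package My::Base;
    sub new     { return bless {}, shift }
    sub verbose { return 1 }
}
{
    package My::Child;
    our @ISA = ('My::Base');
    sub submit { return 1 }
}

my $methods = list_methods(My::Child->new);
print "$_ (from $methods->{$_})\n" for sort keys %$methods;
```

With a real BioPerl object in place of My::Child->new, the same call
should list the inherited methods together with the package each one
comes from.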

Cheers,

    Albert.

On 8/17/07, De-Jian,ZHAO <zhaodj at ioz.ac.cn> wrote:
>
> Dear list members,
>
> I have a question about the methods of bioperl objects: how and
> where can we get the full list of methods of a bioperl object?
>
> Take Bio::Tools::Run::RemoteBlast for example. In the synopsis of
> this object, some sample code is given. The following five statements
> are excerpted from the synopsis.
> (1)my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> (2)while ( my @rids = $factory->each_rid ) {
> (3)$factory->remove_rid($rid);
> (4)my $rc = $factory->retrieve_blast($rid);
> (5)my $r = $factory->submit_blast($input);
>
> The five statements use five methods of the RemoteBlast object, i.e.
> (1) new, (2) each_rid, (3) remove_rid, (4) retrieve_blast, and
> (5) submit_blast. However, I find that only some of them (4, 5) are
> listed in the appendix while the others (1, 2, 3) are absent. Are
> there more methods that are not explicitly declared? I don't know.
> This leads to only a partial understanding and use of the module.
> Therefore I came here to ask how to get the full list of methods of
> a bioperl object.
>
> Thanks!
> --
> De-Jian Zhao
> Institute of Zoology,Chinese Academy of Sciences
> +86-10-64807217
> zhaodj at ioz.ac.cn
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From neetisomaiya at gmail.com  Mon Aug 20 00:33:17 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 20 Aug 2007 10:03:17 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C5A405.2070005@sendu.me.uk>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
	<764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
	<764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
Message-ID: <764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>

Hi,

Thanks for the responses.
Another question: I am interested only in the PDB ID and title. To get
them, I currently download and unzip each full PDB structure file and
parse it just for the ID and title. Is there any other data source that
can give me just the ID and title of PDB structures, without my having to
download the full file for each structure?

Thanks,
Neeti.

On 8/17/07, Sendu Bala <bix at sendu.me.uk> wrote:
>
> neeti somaiya wrote:
> > Hi,
> >
> > I have done it currently as follows :
> [snip]
> > Is this ok?
>
> If it works, of course. There seems to be some redundant code there,
> however. I'm guessing this would be better (assuming your code worked in
> the first place):
>
> while (my $struc = $in->next_structure()) {
>      my $pdb_id = $struc->id;
>      print "Structure ", $pdb_id,"\n";
>
>      my $ac = $struc->annotation();
>      my ($title) = $ac->get_Annotations('title');
>      $title = $title->as_text;
>      chomp($title);
>      if ($title =~ /Value\: (.*)/) {
>          $title = $1;
>      }
>      $title =~ s/\s+/ /g;
>
>      print "Title ",$title,"\n";
> }
>
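
If only the ID and title are needed, a lighter alternative to building a
full structure object is to read just the HEADER and TITLE records at the
top of each file. The sketch below is plain Perl, not BioPerl;
pdb_id_and_title() is a made-up helper, the sample entry is invented, and
it assumes uncompressed files in the fixed-column PDB format (idCode in
columns 63-66 of HEADER, title text from column 11 of each TITLE record).

```perl
use strict;
use warnings;

# Hypothetical helper: pull the ID and title out of PDB-format text by
# reading only the HEADER and TITLE records, then stopping.
sub pdb_id_and_title {
    my ($fh) = @_;
    my ($id, @title);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^HEADER/) {
            # idCode occupies columns 63-66 of the HEADER record
            $id = substr($line, 62, 4) if length($line) >= 66;
        }
        elsif ($line =~ /^TITLE/) {
            # columns 9-10 hold a continuation number, text starts at 11
            push @title, substr($line, 10) if length($line) > 10;
        }
        elsif (@title) {
            last;    # TITLE records are contiguous, so nothing more to read
        }
    }
    my $title = join ' ', @title;
    $title =~ s/\s+/ /g;
    $title =~ s/^\s+|\s+$//g;
    return ($id, $title);
}

# Invented sample entry laid out in the fixed-column format.
my $sample = sprintf("%-10s%-40s%-9s   %-4s\n",
                     'HEADER', 'HYDROLASE', '01-JAN-00', '1ABC')
           . "TITLE     EXAMPLE STRUCTURE TITLE\n"
           . "TITLE    2 CONTINUED ON A SECOND LINE\n"
           . "COMPND    MOL_ID: 1;\n";
open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my ($id, $title) = pdb_id_and_title($fh);
close $fh;
print "$id\t$title\n";
```

This still requires having the files locally, but it avoids parsing the
coordinate section of every entry.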


-- 
-Neeti
Even my blood says, B positive


From jaudall at gmail.com  Mon Aug 20 00:39:18 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Sun, 19 Aug 2007 21:39:18 -0700
Subject: [Bioperl-l] concatenating aln splices
Message-ID: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>

Based on several criteria, I've extracted several splices from a
single alignment and I'm trying to concatenate my selected sequences
together.  Unfortunately, one of the sequences in the original
alignment only has gap characters for one or more of the splices.  I'd
like to keep the gap splices because other downstream aligned bases
depend on them.  I get these two warning messages splicing my
alignments together:

-------------------- WARNING ---------------------
MSG: Got a sequence with no letters in it cannot guess alphabet []
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Slice [232-233] of sequence [X2A/1-202] contains no residues.
Sequence excluded from the new alignment.
---------------------------------------------------

and now because of missing gaps, I get this error when trying to
concatenate them:

-------------------- WARNING ---------------------
MSG: expecting 236 not 203 from X2A
---------------------------------------------------

------------- EXCEPTION  -------------
MSG: All sequences in the alignment must be the same length
STACK Bio::AlignIO::phylip::write_aln
/sw/lib/perl5/5.8.6/Bio/AlignIO/phylip.pm:292

I don't mind the warnings, in fact I like them, but is there a way to
stop the splice function from removing the 'gap' sequence from the
alignment?  Perhaps catching the warning and inserting the gaps
afterwards might work, but I'm wondering if there's is a simpler
modification of SimpleAlign.pm that might work.  Any thoughts?

Josh


From bix at sendu.me.uk  Mon Aug 20 03:43:45 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 20 Aug 2007 08:43:45 +0100
Subject: [Bioperl-l] concatenating aln splices
In-Reply-To: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>
References: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>
Message-ID: <46C94631.2060704@sendu.me.uk>

Joshua Udall wrote:
> Based on several criteria, I've extracted several splices from a
> single alignment and I'm trying to concatenate my selected sequences
> together.  Unfortunately, one of the sequences in the original
> alignment only has gap characters for one or more of the splices.  I'd
> like to keep the gap splices because other downstream aligned bases
> depend on them.
[snip]
> I don't mind the warnings, in fact I like them, but is there a way to
> stop the splice function from removing the 'gap' sequence from the
> alignment?  Perhaps catching the warning and inserting the gaps
> afterwards might work, but I'm wondering if there's is a simpler
> modification of SimpleAlign.pm that might work.  Any thoughts?

Let us see some code, so we can get a better idea of what you're doing 
and what you've tried.

You can avoid losing sequences during a slice by not doing a slice. 
Instead, remove_columns(). This way you don't have to splice alignments 
together; you go from original alignment to 'spliced' version in one step.
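
To illustrate the idea in plain Perl (this is not Bio::SimpleAlign code;
keep_columns() and the two toy sequences are made up for the example):
when the same column ranges are applied to every row of the original
alignment, a row that is all gaps in a chosen region stays in the result
and all rows keep the same length. Check the SimpleAlign documentation
for the exact arguments remove_columns() expects.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the given 1-based, inclusive column
# ranges from every row of an alignment held as id => gapped string.
sub keep_columns {
    my ($aln, @ranges) = @_;
    my %out;
    for my $id (keys %$aln) {
        $out{$id} = join '',
            map { substr($aln->{$id}, $_->[0] - 1, $_->[1] - $_->[0] + 1) }
            @ranges;
    }
    return \%out;
}

# Toy alignment: X2A is all gaps in columns 5-8.
my %aln = (
    X1A => 'ACGTACGTAC',
    X2A => 'ACGT----AC',
);

# Keep columns 1-2 and 5-8 in one step, for all rows at once.
my $spliced = keep_columns(\%aln, [1, 2], [5, 8]);
print "$_ $spliced->{$_}\n" for sort keys %$spliced;
```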


From Oliver.Wafzig at sygnis.de  Mon Aug 20 04:42:55 2007
From: Oliver.Wafzig at sygnis.de (Oliver Wafzig)
Date: Mon, 20 Aug 2007 10:42:55 +0200
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
Message-ID: <200708201042.55292.Oliver.Wafzig@sygnis.de>

On Monday 20 August 2007 06:33, neeti somaiya wrote:
> Another question I had was, I am interested only in pdb id and title, and
> for this I am downloading and unzipping each of the full pdb structure
> files, parsing to get just id and title. Is there any other data source

Hi Neeti,
this is a non-BioPerl way to get the data.
Use the SRS server on the EBI page to download only the ID and title lines
from PDB.

1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
2) Search for 'PDB' on the 'library page' and select it.
3) Use the standard query form. Select 'id' in the dropdown list and 
insert '*' (wildcard).
4) Create a view by selecting 'ID' and 'Title', then click the search button.
5) Click the save results button.
6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of entries 
to download' field. Press 'save'.

If the download is slow, read the 'download tips' on the download page and 
split the results in chunks. 

-- 
Oliver


From neetisomaiya at gmail.com  Mon Aug 20 09:05:01 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 20 Aug 2007 18:35:01 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <200708201042.55292.Oliver.Wafzig@sygnis.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
Message-ID: <764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>

Thanks for your response.
Actually I am looking for something standalone and not on the web, as in
something which I can download onto my machine and parse later to get id and
title.

On 8/20/07, Oliver Wafzig <Oliver.Wafzig at sygnis.de> wrote:
>
> On Monday 20 August 2007 06:33, neeti somaiya wrote:
> > Another question I had was, I am interested only in pdb id and title,
> and
> > for this I am downloading and unzipping each of the full pdb structure
> > files, parsing to get just id and title. Is there any other data source
>
> Hi Neeti,
> this is a non bioperl way to download the data.
> Use the SRS server on the EBI page to download only id and title lines
> from
> pdb.
>
> 1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
> 2) Search for 'PDB' on the 'library page' and select it.
> 3) Use the standard query form. Select 'id' in the dropdown list and
> insert '*' (wildcard).
> 4) Create a view by selecting 'ID' and 'Title', then click the search
> button.
> 5) Click the save results button.
> 6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of
> entries
> to download' field. Press 'save'.
>
> If the download is slow, read the 'download tips' on the download page and
> split the results in chunks.
>
> --
> Oliver


-- 
-Neeti
Even my blood says, B positive


From bernd at kirx.de  Mon Aug 20 12:57:28 2007
From: bernd at kirx.de (Bernd Mueller)
Date: Mon, 20 Aug 2007 18:57:28 +0200
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
Message-ID: <46C9C7F8.3020608@kirx.de>

Hello,

Maybe you want to try the Bio::DB::EUtilities module from BioPerl. It is
described at http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

I tried it for a similar search on PubMed, but without any reasonable
results because my target was too focused.

Quoting the EUtilities documentation at
http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.section.EntrezHelp.The_Databases

"Protein Database

The Protein database contains sequence data from the translated coding 
regions from DNA sequences in GenBank, EMBL, and DDBJ as well as protein 
sequences submitted to Protein Information Resource (PIR), SWISS-PROT, 
Protein Research Foundation (PRF), and Protein Data Bank (PDB) 
(sequences from solved structures). "

So PDB is included in eutilities from NCBI.

Regards,
Bernd

neeti somaiya wrote:
> Thanks for your response.
> Actually I am looking for something standalone and not on the web, as in
> something which I can download onto my machine and parse later to get id and
> title.

-- 
Dipl.-Inform.(FH)
Bernd Mueller
phone: +49 179 2336692
email: bernd at kirx.de


From neetisomaiya at gmail.com  Mon Aug 20 13:39:01 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 20 Aug 2007 23:09:01 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C9C7F8.3020608@kirx.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
Message-ID: <764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>

Hi,

Thanks for all the responses.
I got the solution from the RCSB people:

Dear Dr. Somaiya,

Thank you for your email message.

Please try the following:
1) Go to http://www.pdb.org/pdb/statistics/holdings.do and select the
number in the bottom right corner of the table (currently 45213).
2) From the menu on the left select 'Tabulate'>>'Custom Report' and
under 'Primary Citation' select 'Title'
3) At the bottom, select 'Create Report' and then one of the 'Download'
options.

Please let us know if we can be of additional assistance.

Sincerely,
Rachel Green
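
Once a report like this has been downloaded, pulling out the ID and title is a small parsing job. The following is only a minimal sketch: it assumes the report is comma-separated with a header row, the structure ID in the first column and the title in the second, which may not match the real file. The sub name and the sample rows are made up for illustration. It uses the core Text::ParseWords module; for real data with awkward quoting, a dedicated CSV module would probably be more robust.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Parse CSV text with a header row into a hash of PDB ID => title.
# Assumption: column 1 is the structure ID and column 2 is the title;
# the layout of the real report may differ.
sub parse_id_title_report {
    my ($csv_text) = @_;
    my %title_for;
    my @lines = split /\r?\n/, $csv_text;
    shift @lines;    # drop the header row
    for my $line (@lines) {
        next unless $line =~ /\S/;    # skip blank lines
        # parse_line honours quoted fields, so commas inside titles survive
        my ($id, $title) = parse_line(',', 0, $line);
        next unless defined $id && length $id;
        $title_for{ uc $id } = defined $title ? $title : '';
    }
    return \%title_for;
}

# Made-up sample data, for illustration only.
my $sample = <<'END';
structureId,title
"1ABC","An example title, with a comma"
"2XYZ","Another example title"
END

my $titles = parse_id_title_report($sample);
print "$_\t$titles->{$_}\n" for sort keys %$titles;
```

In practice the text would be read from the downloaded file rather than from an in-line sample.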

On 8/20/07, Bernd Mueller <bernd at kirx.de> wrote:
>
> Hello,
>
> Maybe you wanna try the Database-EUtilities module from bioperl. They
> are described on http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>
> I tried them for a similar search on pubmed but without any reasonable
> results because my target was too focused.
>
> From EUtilities documentation on
>
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.section.EntrezHelp.The_Databases
>
> "Protein Database
>
> The Protein database contains sequence data from the translated coding
> regions from DNA sequences in GenBank, EMBL, and DDBJ as well as protein
> sequences submitted to Protein Information Resource (PIR), SWISS-PROT,
> Protein Research Foundation (PRF), and Protein Data Bank (PDB)
> (sequences from solved structures). "
>
> So PDB is included in eutilities from NCBI.
>
> Regards,
> Bernd


-- 
-Neeti
Even my blood says, B positive


From jaudall at gmail.com  Mon Aug 20 14:30:26 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Mon, 20 Aug 2007 12:30:26 -0600
Subject: [Bioperl-l] concatenating aln splices
In-Reply-To: <46C94631.2060704@sendu.me.uk>
References: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>
	<46C94631.2060704@sendu.me.uk>
Message-ID: <52cea20c0708201130u29af2e10w78a852d7f88c23d1@mail.gmail.com>

Thanks, Sendu!  That suggestion was exactly what I needed.  I have it worked
out now with the remove_columns function.  Much easier that way :)

Josh

On 8/20/07, Sendu Bala <bix at sendu.me.uk> wrote:
>
> Joshua Udall wrote:
> > Based on several criteria, I've extracted several splices from a
> > single alignment and I'm trying to concatenate my selected sequences
> > together.  Unfortunately, one of the sequences in the original
> > alignment only has gap characters for one or more of the splices.  I'd
> > like to keep the gap splices because other downstream aligned bases
> > depend on them.
> [snip]
> > I don't mind the warnings, in fact I like them, but is there a way to
> > stop the splice function from removing the 'gap' sequence from the
> > alignment?  Perhaps catching the warning and inserting the gaps
> > afterwards might work, but I'm wondering if there's is a simpler
> > modification of SimpleAlign.pm that might work.  Any thoughts?
>
> Let us see some code, so we can get a better idea of what you're doing
> and what you've tried.
>
> You can avoid losing sequences during a slice by not doing a slice.
> Instead, remove_columns(). This way you don't have to splice alignments
> together; you go from original alignment to 'spliced' version in one step.
>
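
The idea behind removing columns rather than slicing and re-joining can be shown without BioPerl at all. The sketch below is plain Perl on aligned strings, not the Bio::SimpleAlign implementation; the sub name, the sequence names and the data are invented. Because every row is kept and only columns are cut out, a row that consists solely of gaps in the retained region is still present afterwards.

```perl
use strict;
use warnings;

# Remove 0-based, inclusive column ranges from every row of an alignment.
# Assumption: ranges do not overlap. Rows are never dropped, so a row that
# ends up as gaps only is still returned.
sub drop_columns {
    my ($rows, @ranges) = @_;
    my %out = %$rows;    # work on a copy
    # Cut from the rightmost range first so earlier offsets stay valid.
    for my $range (sort { $b->[0] <=> $a->[0] } @ranges) {
        my ($start, $end) = @$range;
        for my $seq (values %out) {    # values are aliased, edits stick
            substr($seq, $start, $end - $start + 1, '');
        }
    }
    return \%out;
}

# Invented three-row alignment, 10 columns wide.
my %aln = (
    seqA => 'ACGTACGTAC',
    seqB => 'AC--ACGTAC',
    seqC => '----ACGT--',
);

# Drop columns 4..7, keeping 0..3 and 8..9 in one step.
my $kept = drop_columns(\%aln, [4, 7]);
print "$_ $kept->{$_}\n" for sort keys %$kept;
```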


From cjfields at uiuc.edu  Mon Aug 20 14:51:14 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 20 Aug 2007 13:51:14 -0500
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C9C7F8.3020608@kirx.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
Message-ID: <4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>

Just curious, but what kind of query were you trying?  It might be  
worth trying to work through it to add as an example to the cookbook  
page.

chris

On Aug 20, 2007, at 11:57 AM, Bernd Mueller wrote:

> Hello,
>
> Maybe you wanna try the Database-EUtilities module from bioperl. They
> are described on http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>
> I tried them for a similar search on pubmed but without any reasonable
> results because my target was too focused.
>
>  From EUtilities documentation on
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi? 
> rid=helpentrez.section.EntrezHelp.The_Databases
>
> "Protein Database
>
> The Protein database contains sequence data from the translated coding
> regions from DNA sequences in GenBank, EMBL, and DDBJ as well as  
> protein
> sequences submitted to Protein Information Resource (PIR), SWISS-PROT,
> Protein Research Foundation (PRF), and Protein Data Bank (PDB)
> (sequences from solved structures). "
>
> So PDB is included in eutilities from NCBI.
>
> Regards,
> Bernd

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bernd at kirx.de  Mon Aug 20 15:03:29 2007
From: bernd at kirx.de (Bernd Mueller)
Date: Mon, 20 Aug 2007 21:03:29 +0200
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>
Message-ID: <46C9E581.1010907@kirx.de>

I attached my script.

Actually I tried to download all articles matching a certain search term
with that script. The problem was that the retrieved documents were not free
as mentioned in the documentation of EUtilities on the NCBI page. So
many of the downloaded documents in XML format were just dummies
containing only the abstract but not the full-text article.

Bernd

Chris Fields wrote:
> Just curious, but what kind of query were you trying?  It might be worth 
> trying to work through it to add as an example to the cookbook page.
> 
> chris

-- 
Dipl.-Inform.(FH)
Bernd Mueller
phone: +49 179 2336692
email: bernd at kirx.de


-------------- next part --------------
A non-text attachment was scrubbed...
Name: myBioPerl.pl
Type: application/x-perl
Size: 1983 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070820/af579f0a/attachment-0002.bin>

From jayoung at fhcrc.org  Mon Aug 20 18:09:04 2007
From: jayoung at fhcrc.org (Janet Young)
Date: Mon, 20 Aug 2007 15:09:04 -0700
Subject: [Bioperl-l] Assembly::IO write_assembly and remove_seq
Message-ID: <EE800ED8-52E7-4D80-A18F-EDBABB90056C@fhcrc.org>

Hi all,

I realized last week that write_assembly isn't implemented in
Bio::Assembly::IO
(see http://bioperl.org/pipermail/bioperl-l/2006-May/021619.html )
I know this has been asked before, but I wondered if anything has  
changed - does anyone have any plans to write a write_assembly  
method? Alternatively, any suggestions for an alternative solution to  
what I'm trying to do?

I'm trying to write a script to make improvements to the assembly  
that phredPhrap comes out with - it seems to quite frequently throw  
an unrelated sequence into a contig with either no matching sequence  
at all, or very little matching sequence. Mysterious. Anyway, my  
script can recognize the bad sequences easily enough, and thought I'd  
be able to remove them and then write the modified assembly. No joy.  
One very inelegant solution I've played with is that I can add some  
"markedHighQuality" tags to the discrepant sequences in the ace file,  
meaning that next time phredPhrap is run, it sometimes manages not to  
assemble the sequences that shouldn't be there. I'm not sure this  
will work in all cases, and it seems like quite an unsatisfactory way  
to do it.

For the same reason, I'm hoping someone can tell me what remove_seq  
does to a contig object? I'm using it and I don't get any error  
messages (returns 1), but when I check the contig object afterwards  
with get_seq_ids, the sequence I wanted to remove didn't seem to go  
away. Also, when I check out the primary_tags for that contig in the  
objects returned by get_features_collection, nothing seems to have  
changed. So I'm not sure whether the sequence really was removed from  
anything at all, and if it was, which object did it get removed  
from?  (a snippet of my code is below)
           my @seqids  = $contig->get_seq_ids();
           print OUT "seqids @seqids\n";
           my $seqobj = $contig->get_seq_by_name($seq);
           $contig->remove_seq($seqobj) || die "failed to remove seq\n";
           @seqids  = $contig->get_seq_ids();
           print OUT "seqids @seqids\n";

thanks for any advice,

Janet Young


-------------------------------------------------------------------

Dr. Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung at fhcrc.org

http://www.fhcrc.org/labs/trask/

-------------------------------------------------------------------


From cjfields at uiuc.edu  Tue Aug 21 00:06:26 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 20 Aug 2007 23:06:26 -0500
Subject: [Bioperl-l] EUtilities, was Re:  PDB Parser
In-Reply-To: <46C9E581.1010907@kirx.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>
	<46C9E581.1010907@kirx.de>
Message-ID: <7BE17595-9BC0-498B-AFA9-03ED0C853BFC@uiuc.edu>

Bernd,

Just in case you weren't aware, I have changed several aspects of  
EUtilities since the 1.5.2 release, so any code in the HOWTO cookbook  
applies ONLY to the version found in CVS (there is a big note at the  
top stating such).  This should be the finalized API which I intend  
on supporting from this point on.  The reason I indicate that is  
there are several giveaways which indicate you are using the older  
API from 1.5.2 (using next_cookie, for instance).

The following modification of your script (using the API in
bioperl-live) works for me.  You should be able to do something similar with
the older API as well but I haven't tried.  Note that PMC full-text  
retrieval only works if the article is declared 'open-access'; not  
all journals allow that.  Also, any full-text is only available as  
XML which (I'm guessing here) is transformed to HTML for PMC.

....
my $agent = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                     -db         => $db,
                                     -term       => $query,
                                     -usehistory => 'y');

my $ct = $agent->get_count;

print "Count = $ct\n";

my $history = $agent->next_History;

if ($fetch eq 'yes') {
    my ($retmax, $retstart) = (1, 0);
    while ($retstart < $ct) {
        $agent->set_parameters(
            -eutil    => 'efetch',
            -history  => $history,
            -rettype  => 'xml',
            -retmax   => $retmax,
            -retstart => $retstart,
        );
        $agent->get_Response(-file => ">./papers/paper_$retstart.xml");
        $retstart += $retmax;
    }
}

------------------------------
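
The retstart/retmax bookkeeping in the loop above can be looked at on its own. This is a plain-Perl sketch with no network access; the sub name batch_windows is made up and is not part of EUtilities. It lists the (retstart, size) pairs a loop of this shape would request for a given hit count, which makes it easy to see how a larger retmax cuts the number of requests.

```perl
use strict;
use warnings;

# Return the [retstart, size] windows needed to cover $count records
# when at most $retmax records are requested per call.
sub batch_windows {
    my ($count, $retmax) = @_;
    die "retmax must be positive\n" unless $retmax > 0;
    my @windows;
    for (my $retstart = 0; $retstart < $count; $retstart += $retmax) {
        my $size = $retmax;
        # The last window may be shorter than retmax.
        $size = $count - $retstart if $retstart + $size > $count;
        push @windows, [$retstart, $size];
    }
    return @windows;
}

# Seven hits fetched three at a time need three requests.
my @windows = batch_windows(7, 3);
print join(' ', map { "[$_->[0],$_->[1]]" } @windows), "\n";
```

With retmax set to 1, as in the script, the same seven hits would take seven separate requests, one file per record.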

It may also be possible to grab the LinkOut for these and try to nab  
the PDF or use the DOI, but I haven't tried anything like that.

chris

On Aug 20, 2007, at 2:03 PM, Bernd Mueller wrote:

> I attached my script.
>
> Actually I tried to download all articles to a certain search term  
> with
> that script. The problem was that the retrieved documents were not  
> free
> as mentioned in the documentation of EUtilities on the NCBI page. So
> many of the downloaded documents in xml-format were just dummies
> containing only the abstract but not the fulltext article.
>
> Bernd
>
> Chris Fields wrote:
>> Just curious, but what kind of query were you trying?  It might be  
>> worth trying to work through it to add as an example to the  
>> cookbook page.
>> chris


From n.haigh at sheffield.ac.uk  Tue Aug 21 04:19:59 2007
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 21 Aug 2007 09:19:59 +0100
Subject: [Bioperl-l] subversion progress
Message-ID: <46CAA02F.60803@sheffield.ac.uk>

Hi,

I was just wondering if there was any further progress towards the svn
migration recently? What is still needing to be done?

Cheers
Nath


From neetisomaiya at gmail.com  Tue Aug 21 05:41:22 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Tue, 21 Aug 2007 15:11:22 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
Message-ID: <764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>

Hi,

I wanted to automate my pdb script, right from downloading of data. As per
the solution given by RCSB about a custom report for pdb ids and titles only,
I was trying something like the code below, but it doesn't seem to work:

use strict;
use warnings;
use LWP::Simple;

# The URL must be a single string with no embedded line breaks.
my $url = 'http://www.pdb.org/pdb/results/tabularReport.do'
        . '?reportTitle=CustomReport'
        . '&customReportColumns=VStructureSummary.structureId~VCitation.title'
        . '&format=csv';
my $content = get($url);
die "Couldn't get $url" unless defined $content;

Can anyone tell me how I can do it, whether there is any other way to do it,
whether I am going wrong somewhere, or whether it isn't possible for this
case at all?

Please help.
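
One thing that may be worth ruling out when a request like this fails is whitespace inside the URL, since long URLs pasted into a script or an e-mail often pick up line breaks. The sketch below is only a guess at one possible cause, not a confirmed diagnosis; the sub name and the example.org address are placeholders.

```perl
use strict;
use warnings;

# Strip all whitespace (including line breaks) from a URL string.
# A valid URL contains no literal whitespace, so this loses nothing.
sub clean_url {
    my ($raw) = @_;
    (my $url = $raw) =~ s/\s+//g;
    return $url;
}

# Placeholder URL that was wrapped over two lines when pasted.
my $wrapped = "http://www.example.org/report.do?columns=\nid~title&format=csv";
my $url     = clean_url($wrapped);

print "before: ", ($wrapped =~ /\s/ ? "contains whitespace" : "clean"), "\n";
print "after:  $url\n";
```

The cleaned string can then be passed to whatever is doing the download.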

On 8/20/07, neeti somaiya <neetisomaiya at gmail.com> wrote:
>
> Hi,
>
> Thanks for all the responses.
> I got the solution from the RCSB people:
>
> Dear Dr. Somaiya,
>
> Thank you for your email message.
>
> Please try the following:
> 1) Go to http://www.pdb.org/pdb/statistics/holdings.do and select the
> number in the bottom right corner of the table (currently 45213).
> 2) From the menu on the left select 'Tabulate'>>'Custom Report' and
> under 'Primary Citation' select 'Title'
> 3) At the bottom, select 'Create Report' and then one of the 'Download'
> options.
>
> Please let us know if we can be of additional assistance.
>
> Sincerely,
> Rachel Green

-- 
-Neeti
Even my blood says, B positive


From cjfields at uiuc.edu  Tue Aug 21 10:40:03 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 21 Aug 2007 09:40:03 -0500
Subject: [Bioperl-l] subversion progress
In-Reply-To: <46CAA02F.60803@sheffield.ac.uk>
References: <46CAA02F.60803@sheffield.ac.uk>
Message-ID: <5C65BAED-61CF-4028-977E-0CD451FA2EC3@uiuc.edu>

Not sure myself, to tell the truth.  Pretty much everything was ready  
to go (i.e. svn commits work, commits post to bioperl-guts, etc.);  
the only possible exception was svn->cvs syncing.  I believe the  
decision for svn access is to stick with ssh only for now for  
simplicity's sake.  I may have to go back into the archives to  
refresh my memory on all the details...

I think a time for the switchover just has to be set so that  
everybody is adequately forewarned, and the docs for getting started  
on SVN need to be updated accordingly.

chris

On Aug 21, 2007, at 3:19 AM, Nathan Haigh wrote:

> Hi,
>
> I was just wondering if there was any further progress towards the svn
> migration recently? What is still needing to be done?
>
> Cheers
> Nath

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jwalker at watson.wustl.edu  Tue Aug 21 11:20:46 2007
From: jwalker at watson.wustl.edu (Jason Walker)
Date: Tue, 21 Aug 2007 10:20:46 -0500
Subject: [Bioperl-l] RemoteBlast not handling NCBI Error message
Message-ID: <46CB02CE.1080803@watson.wustl.edu>

I've noticed RemoteBlast does not handle a specific error message from 
NCBI correctly.  retrieve_blast() should return 0 if waiting, -1 on 
error, or the results when completed.  It looks like the method relies 
on a specific tag in the NCBI return,  'QBlastInfoBegin'.  The error 
message I'm getting does not have this tag or a value of 
'Status=ERROR'.  After contacting NCBI 'Blast-help', they stated that 
QBlastInfoBegin should not be expected from all GET requests.  The error 
can be reproduced by using RID CM2YJJW501R, until it expires tomorrow.

my $rid = 'CM2YJJW501R';
my $factory = Bio::Tools::Run::RemoteBlast->new( -verbose => 1,);
my $rc = $factory->retrieve_blast($rid);
print $rc ."\n";

The content returned from NCBI looks like:
<hr><font color="red">ERROR: An error has occurred on the server, Too 
many HSPs to save all
 Contact Blast-help at ncbi.nlm.nih.gov and include your RID: 
CM2YJJW501R</font><hr>

I added a conditional statement as seen below to correct my local copy.  
I'm not sure this is the best fix, but it works.
sub retrieve_blast {
    ...
    if( /QBlastInfoBegin/i ) {
        $s = 1;
    } elsif( $s ) {
        if( /Status=(WAITING|ERROR|READY)/i ) {
            ...
         }
    } elsif( /^(?:#\s)?[\w-]*?BLAST\w+/ ) {
        $waiting = 0;
        last;
    } elsif ( /ERROR/i ) {
        close($TMP);
        open(my $ERR, "<$tempfile") or $self->throw("cannot open file $tempfile");
        $self->warn(join("", <$ERR>));
        close $ERR;
        return -1;
    }
    ...
}
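
The branching above can be tried out in isolation. The following is a stand-alone sketch and not the RemoteBlast code itself: classify_response is an invented name, it works on a string instead of a temp file, and returning 1 for "ready" and -1 for unrecognised input are assumptions made for the example.

```perl
use strict;
use warnings;

# Classify the text of a reply: 0 = still waiting, -1 = error, 1 = ready.
sub classify_response {
    my ($text) = @_;
    my $in_info = 0;
    for my $line (split /\n/, $text) {
        if ($line =~ /QBlastInfoBegin/i) {
            $in_info = 1;
        }
        elsif ($in_info && $line =~ /Status=(WAITING|ERROR|READY)/i) {
            my $status = uc $1;
            return  0 if $status eq 'WAITING';
            return -1 if $status eq 'ERROR';
            return  1;
        }
        elsif ($line =~ /^(?:#\s)?[\w-]*?BLAST\w+/) {
            return 1;     # looks like the start of a report
        }
        elsif ($line =~ /ERROR/i) {
            return -1;    # error text without a QBlastInfo block
        }
    }
    return -1;            # assumption: nothing recognisable is an error
}

my $error_page = '<hr><font color="red">ERROR: An error has occurred on the server, Too many HSPs to save all';
print classify_response($error_page), "\n";
```

The last branch is the case described in this message: an error page that carries no QBlastInfoBegin tag and no Status=ERROR line.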

Thanks,
Jason Walker


From cjfields at uiuc.edu  Tue Aug 21 12:15:36 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 21 Aug 2007 11:15:36 -0500
Subject: [Bioperl-l] RemoteBlast not handling NCBI Error message
In-Reply-To: <46CB02CE.1080803@watson.wustl.edu>
References: <46CB02CE.1080803@watson.wustl.edu>
Message-ID: <348D8645-5DC2-4606-9650-EB08D8053F3D@uiuc.edu>


On Aug 21, 2007, at 10:20 AM, Jason Walker wrote:

> I've noticed RemoteBlast does not handle a specific error message from
> NCBI correctly.  retrieve_blast() should return 0 if waiting, -1 on
> error, or the results when completed.  It looks like the method relies
> on a specific tag in the NCBI return,  'QBlastInfoBegin'.  The error
> message I'm getting does not have this tag or a value of
> 'Status=ERROR'.  After contacting NCBI 'Blast-help', they stated that
> QBlastInfoBegin should not be expected from all GET requests.  The  
> error
> can be reproduced by using RID CM2YJJW501R, until it expires tomorrow.
> ...
> I added a conditional statement as seen below to correct my local  
> copy.
> I'm not sure this is the best fix, but it works.
> sub retrieve_blast {
>     ...
>     if( /QBlastInfoBegin/i ) {
>         $s = 1;
>     } elsif( $s ) {
>         if( /Status=(WAITING|ERROR|READY)/i ) {
>             ...
>          }
>     } elsif( /^(?:#\s)?[\w-]*?BLAST\w+/ ) {
>         $waiting = 0;
>         last;
>     } elsif ( /ERROR/i ) {
>         close($TMP);
>         open(my $ERR, "<$tempfile") or $self->throw("cannot open file $tempfile");
>         $self->warn(join("", <$ERR>));
>         close $ERR;
>         return -1;
>     }
>     ...
> }
>
> Thanks,
> Jason Walker

I have added this to RemoteBlast in bioperl cvs.  Thanks for the notice!

chris


From bernd.web at gmail.com  Tue Aug 21 12:32:09 2007
From: bernd.web at gmail.com (Bernd Web)
Date: Tue, 21 Aug 2007 18:32:09 +0200
Subject: [Bioperl-l] SearchIO-BLAST
Message-ID: <716af09c0708210932m34bfb2a7o2094124a8832d705@mail.gmail.com>

Dear all,

Recently, I stumbled on something with parsing BLAST reports.  To a
plain text blast report from NCBI a ">aaa" got prepended. This
(fasta-like header) changes the $result->hits array.
The amount of hits is now 2*num_hits + 1. Clearly, this is related to
faulty input, but still the effect of this line is great. Does someone
see what is causing this, and should the BLAST parser maybe be
slightly more relaxed wrt pre/appended text? I have not seen yet why
this extra fastaheader line has such a "large" effect.

A short example BLASTN output is attached.
Example code is:

use Bio::SearchIO;
my $in = Bio::SearchIO->new(-format => 'blast',
                            -file   => 'apoe_plain.bls');
while( my $result = $in->next_result ) {
  print "Num of hits: ", $result->num_hits, "\n";
  my @hits = $result->hits;
  foreach my $el (@hits) {
    print $el->name, "\n";
  }
}


Kind regards,
Bernd
-------------- next part --------------
A non-text attachment was scrubbed...
Name: apoe_plain.bls
Type: application/octet-stream
Size: 7890 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070821/a367809e/attachment-0002.obj>

From cjfields at uiuc.edu  Tue Aug 21 17:53:44 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 21 Aug 2007 16:53:44 -0500
Subject: [Bioperl-l] SearchIO-BLAST
In-Reply-To: <716af09c0708210932m34bfb2a7o2094124a8832d705@mail.gmail.com>
References: <716af09c0708210932m34bfb2a7o2094124a8832d705@mail.gmail.com>
Message-ID: <59FF775C-8CAC-4947-A5BA-835ADD45CD32@uiuc.edu>

I can confirm this (I'm using bioperl-live).  The output I get is:

Num of hits: 9
ref|NM_000039.1|
ref|NT_113960.1|Hs22_111679
ref|NT_033899.7|Hs11_34054
ref|NW_925173.1|HsCraAADB02_444
ref|NM_000039.1|
ref|NT_113960.1|Hs22_111679
ref|NT_033899.7|Hs11_34054
ref|NW_925173.1|HsCraAADB02_444
ref|NW_925173.1|HsCraAADB02_444

The extra '>' is definitely throwing the event calls for a loop; the  
2x increase is b/c an extra iteration is started when '>' is  
encountered (changing the event handler reduces the number to 5).   
The extra hit is from the '>' at the beginning.

I hate to say it, but this is an instance where we can't be more  
flexible, primarily b/c '>' is a legit token the parser looks for (it  
is the beginning of the hit block in reports).  Finding it as the  
initial token in the report is also legitimate for some older BLAST  
output, so we also can't simply bypass it.  You'll unfortunately have  
to preparse the reports to get rid of those lines prior to feeding  
them to the BLAST text report parser.
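
A pre-parsing step of this kind could look roughly like the sketch below. The helper name strip_leading_junk is hypothetical, and the sketch assumes that the report proper begins with a program header line such as "BLASTN 2.2.16 [Mar-25-2007]". Reports in the older style that legitimately start with '>' would need a different rule.

```perl
use strict;
use warnings;

# Hypothetical helper: drop every line that precedes the first BLAST program
# header. ASSUMPTION: the header line starts with BLASTN, BLASTP, BLASTX,
# TBLASTN or TBLASTX.
sub strip_leading_junk {
    my @lines = @_;
    for my $i (0 .. $#lines) {
        return @lines[$i .. $#lines] if $lines[$i] =~ /^T?BLAST[NPX]\b/;
    }
    return @lines;    # no header found: leave the input untouched
}

my @report = (">aaa\n", "BLASTN 2.2.16 [Mar-25-2007]\n", "\n", "Query= apoE\n");
print strip_leading_junk(@report);
```

The cleaned lines can then be written to a new file and handed to Bio::SearchIO with -file as in the example code. Returning the input unchanged when no header is found avoids silently discarding a whole report.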

chris

On Aug 21, 2007, at 11:32 AM, Bernd Web wrote:

> Dear all,
>
> Recently, I stumbled on something with parsing BLAST reports.  To a
> plain text blast report from NCBI a ">aaa" got prepended. This
> (fasta-like header) changes the $result->hits array.
> The amount of hits is now 2*num_hits + 1. Clearly, this is related to
> faulty input, but still the effect of this line is great. Does someone
> see what is causing this, and should the BLAST parser maybe be
> slightly more relaxed wrt pre/appended text? I have not seen yet why
> this extra fastaheader line has such a "large" effect.
>
> A short example BLASTN output is attached.
> Example code is:
>
> use Bio::SearchIO;
> my $in = Bio::SearchIO->new(-format => 'blast',
>                             -file   => 'apoe_plain.bls');
> while( my $result = $in->next_result ) {
>   print "Num of hits: ", $result->num_hits, "\n";
>   my @hits = $result->hits;
>   foreach my $el (@hits) {
>     print $el->name, "\n";
>   }
> }
>
>
> Kind regards,
> Bernd
> <apoe_plain.bls>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Tue Aug 21 23:03:55 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 21 Aug 2007 23:03:55 -0400
Subject: [Bioperl-l] subversion progress
In-Reply-To: <5C65BAED-61CF-4028-977E-0CD451FA2EC3@uiuc.edu>
References: <46CAA02F.60803@sheffield.ac.uk>
	<5C65BAED-61CF-4028-977E-0CD451FA2EC3@uiuc.edu>
Message-ID: <51A5996D-A976-47FD-8807-20F6EBAF9E42@gmx.net>


On Aug 21, 2007, at 10:40 AM, Chris Fields wrote:

> I think a time for the switchover just has to be set so that
> everybody is adequately forewarned, and the docs for getting started
> on SVN need to be updated accordingly.

That was my recollection too. -hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Wed Aug 22 03:51:42 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 22 Aug 2007 08:51:42 +0100
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>	<46C9C7F8.3020608@kirx.de>	<764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
	<764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>
Message-ID: <46CBEB0E.8030200@sendu.me.uk>

neeti somaiya wrote:
> Hi,
> 
> I wanted to automate my pdb script, right from downloading of data. As per
> the solution given by RCSB about custom report for pdb ids and titles only,
> I was trying something like the code below, but it doesn't seem to work :-
> 
> my $url = 'http://www.pdb.org/pdb/results/tabularReport.do?reportTitle=CustomReport'
>         . '&customReportColumns=VStructureSummary.structureId~VCitation.title&format=csv';
> use LWP::Simple;
> my $content = get $url;
> die "Couldn't get $url" unless defined $content;
> 
> Can anyone tell how I can do it, if there is any other way to do it, or if I
> am going wrong somewhere, or if it isn't possible for this case at all.

Use LWP::UserAgent so you can see what's going on.

my $ua = LWP::UserAgent->new;
$ua->timeout(10);
my $response = $ua->get($url);
if ($response->is_success) {
   print $response->content;
}
else {
   die $response->status_line;
}


Gives:
500 Internal Server Error

Most likely the server is expecting some kind of cookie and falls over 
when you try to visit that url without it. So start where they told you 
to and grab pages successively, keeping any cookies.


From neetisomaiya at gmail.com  Wed Aug 22 06:06:38 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Wed, 22 Aug 2007 15:36:38 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46CBEB0E.8030200@sendu.me.uk>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
	<764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>
	<46CBEB0E.8030200@sendu.me.uk>
Message-ID: <764978cf0708220306u77cedf22xdd132b324e306f33@mail.gmail.com>

Thanks a lot. It worked for me.

use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

my $ua = LWP::UserAgent->new;
$ua->cookie_jar(HTTP::Cookies->new(file => "lwpcookies.txt",
                                   autosave => 1));

# Visit the pages in the order the site expects, so that the cookies set by
# the earlier pages are sent along with the final report request.
my $base = 'http://www.pdb.org/pdb';
my @urls = (
    "$base/search/smartSubquery.do?smartSearchSubtype=HoldingsQuery"
        . "&moleculeType=ignore&experimentalMethod=ignore",
    "$base/results/tabularForm.do",
    "$base/results/tabularReport.do?reportTitle=CustomReport"
        . "&customReportColumns=VStructureSummary.structureId~VCitation.title"
        . "&format=csv",
);

my $response;
foreach my $url (@urls) {
    $response = $ua->get($url);
    die $response->status_line unless $response->is_success;
    print "\nSuccessfully connected to url $url\n";
}

# The last response holds the CSV report.
open(my $fh, '>', 'tabularResults.csv')
    or die "cannot open tabularResults.csv: $!";
print $fh $response->content;
close($fh);


On 8/22/07, Sendu Bala <bix at sendu.me.uk> wrote:
>
> neeti somaiya wrote:
> > Hi,
> >
> > I wanted to automate my pdb script, right from downloading of data. As per
> > the solution given by RCSB about custom report for pdb ids and titles only,
> > I was trying something like the code below, but it doesn't seem to work :-
> >
> > my $url = 'http://www.pdb.org/pdb/results/tabularReport.do?reportTitle=CustomReport'
> >         . '&customReportColumns=VStructureSummary.structureId~VCitation.title&format=csv';
> > use LWP::Simple;
> > my $content = get $url;
> > die "Couldn't get $url" unless defined $content;
> >
> > Can anyone tell how I can do it, if there is any other way to do it, or if I
> > am going wrong somewhere, or if it isn't possible for this case at all.
>
> Use LWP::UserAgent so you can see what's going on.
>
> my $ua = LWP::UserAgent->new;
> $ua->timeout(10);
> my $response = $ua->get($url);
> if ($response->is_success) {
>    print $response->content;
> }
> else {
>    die $response->status_line;
> }
>
>
> Gives:
> 500 Internal Server Error
>
> Most likely the server is expecting some kind of cookie and falls over
> when you try to visit that url without it. So start where they told you
> to and grab pages successively, keeping any cookies.
>


-- 
-Neeti
Even my blood says, B positive


From jay at jays.net  Wed Aug 22 08:54:29 2007
From: jay at jays.net (Jay Hannah)
Date: Wed, 22 Aug 2007 07:54:29 -0500
Subject: [Bioperl-l] wiki: Current Events
Message-ID: <24715480-EC15-493F-85C9-C367348E28F1@jays.net>

http://www.bioperl.org/wiki/Main_Page

Please change:

< BOSC 2007 will be held July 19-20, 2007
 > BOSC 2007 was held July 19-20, 2007

I'd change it but the page is locked. Even when I'm logged in.   :)

Thanks,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From cjfields at uiuc.edu  Wed Aug 22 09:58:32 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 22 Aug 2007 08:58:32 -0500
Subject: [Bioperl-l] wiki: Current Events
In-Reply-To: <24715480-EC15-493F-85C9-C367348E28F1@jays.net>
References: <24715480-EC15-493F-85C9-C367348E28F1@jays.net>
Message-ID: <A7C5314E-662C-4160-85B1-0225B95C0BD2@uiuc.edu>

Done.

chris

On Aug 22, 2007, at 7:54 AM, Jay Hannah wrote:

> http://www.bioperl.org/wiki/Main_Page
>
> Please change:
>
> < BOSC 2007 will be held July 19-20, 2007
>> BOSC 2007 was held July 19-20, 2007
>
> I'd change it but the page is locked. Even when I'm logged in.   :)
>
> Thanks,
>
> Jay Hannah
> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From shameer at ncbs.res.in  Wed Aug 22 15:45:42 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Thu, 23 Aug 2007 01:15:42 +0530 (IST)
Subject: [Bioperl-l] How to 'force' Bio::Graphics to draw image according to
 input file ?
In-Reply-To: <A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
	<A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
Message-ID: <44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>

Dear All,

Is there any option in Bio::Graphics to draw the image based on the hits as
listed in the hits file?

For example I am using an input file:
# hit   score   start   end
Query   0       1       101
Sequence_Segment_1      0       1       101
PD:LRR_1|CS:AAC34139        0.16        1        23
PD:LRR_1|CS:AAC34139        3.6        1        22
PD:LRR_1|CS:AAC34139        1.8        1        22
PD:LRR_1|CS:AAC34139        1.3        1        22
PD:LRR_1|CS:XP_640228        2.5        2        23
..... Cropped
PD:LRR_1|CS:NP_611007        55        3        23
PD:LRR_1|CS:NP_611007        3.7        3        24
PD:LRR_1|CS:NP_611007        4.5        3        24
PD:LRR_1|CS:NP_611007        0.71        3        24
If you look at the image, you can see that it is all jumbled up and does
not make any sense at first glance. I am looking for an option to draw
each glyph on its own line (say \n), rather than having Bio::Graphics
accommodate them internally.

PS. Image is attached with this mail.
I am using  Dr. L. Stein's example :

use strict;
use Bio::Graphics;
use Bio::SeqFeature::Generic;
my $panel = Bio::Graphics::Panel->new(-length => 700,
                                      -width  => 800,
                                      -pad_left => 10,
                                      -pad_right => 10,
                                     );

my $full_length = Bio::SeqFeature::Generic->new(-start=>1,-end=>700);
$panel->add_track($full_length,
                  -glyph   => 'arrow',
                  -tick    => 2,
                  -fgcolor => 'black',
                  -double  => 1,
                 );

my $track = $panel->add_track(
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.png
Type: image/png
Size: 27974 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070823/be285f43/attachment-0002.png>

From cjfields at uiuc.edu  Thu Aug 23 00:53:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 22 Aug 2007 23:53:55 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
Message-ID: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>

As many of the devs know, there are a number of Feature/Annotation  
issues that need to be resolved prior to a 1.6 release:

http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.2FAnnotation_changes:_Keep_or_roll_back.3F

There has been little work done over the last 2 1/2 years to undo or  
rectify problems associated with those additions; I feel like those  
of us still routinely contributing have been left holding the bag.   
There has also been very little attempt to document any of this  
adequately enough; as an example see POD for  
Bio::SeqFeature::Annotated (what little there is).

I would like to suggest the radical idea of rolling back AnnotatableI/ 
SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags  
are simple scalars) and possibly work in implementing Ewan's  
SeqFeature::TypedSeqFeatureI for those who want strong data types  
(i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various  
AnnotatableI changes, odd inheritance, and operator overloading have  
really obfuscated the code to the point where no one wants to touch  
it in case it breaks something important.  However, I believe it is  
the one serious impediment to a new stable release.

My thought is we simplify all the relevant interfaces, essentially  
reverting back to rel 1.4.  For instance, we move the various  
Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.   
Bio::SeqFeature::Annotated would implement Bio::AnnotatableI  
directly, and (if needed) also implement  
Bio::SeqFeature::TypedSeqFeatureI, so the impetus is on  
Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI  
methods correctly, just as any other class would when implementing an  
abstract interface.  I have played around with this a bit and managed  
to get most tests working again for Bio::SeqFeature::Generic and  
FeatureIO but a number of others break.

If needed I can try this out on a branch (a bit ironic, since the  
changes instigating this mess should have been tested on a branch!).   
Maybe this will get the ball rolling towards a 1.6 release.  Any  
thoughts?

chris


From shameer at ncbs.res.in  Thu Aug 23 03:06:34 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Thu, 23 Aug 2007 12:36:34 +0530 (IST)
Subject: [Bioperl-l] How to 'force' Bio::Graphics to draw image
 according to input file ?
In-Reply-To: <44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
	<A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
	<44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>
Message-ID: <34980.192.168.1.1.1187852794.squirrel@mail.ncbs.res.in>

Dear All,

I will make my question simple :
Is there any way to force the Bio::Graphics module to print only one
glyph in a track?

PS. A more detailed explanation is in my earlier mail (I don't want to spam
the community with the same mail)

Eagerly waiting for a reply.
Thanks,
-- 
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From cain.cshl at gmail.com  Thu Aug 23 04:54:40 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 23 Aug 2007 04:54:40 -0400
Subject: [Bioperl-l] How to 'force' Bio::Graphics to draw
	image	according to input file ?
In-Reply-To: <34980.192.168.1.1.1187852794.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
	<A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
	<44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>
	<34980.192.168.1.1.1187852794.squirrel@mail.ncbs.res.in>
Message-ID: <1187859296.2546.6.camel@103.48.216.10.in-addr.arpa>

Shameer,

I don't think that's really what you want.  It seems to me that sorting
them in some useful way (say, by score) would make more sense.  There is
an example using the -sort_order option in Lincoln's howto.
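
Independently of the -sort_order option, the same ordering can be prepared before any features are built. The sketch below parses lines in the "hit score start end" layout from the earlier mail and sorts them by score. The helper name sorted_hits is hypothetical, and the sketch assumes whitespace-separated columns and that lower scores should come first (as for E-values); flip the comparison otherwise.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "hit score start end" lines into hash records
# and return them sorted by ascending score.
sub sorted_hits {
    my @hits;
    for my $line (@_) {
        next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
        my ($name, $score, $start, $end) = split ' ', $line;
        next unless defined $end;          # skip malformed or cropped lines
        push @hits, { name => $name, score => $score,
                      start => $start, end => $end };
    }
    return sort { $a->{score} <=> $b->{score} } @hits;
}

my @hits = sorted_hits(
    "# hit   score   start   end",
    "PD:LRR_1|CS:AAC34139        3.6        1        22",
    "PD:LRR_1|CS:AAC34139        0.16        1        23",
    "PD:LRR_1|CS:NP_611007        55        3        23",
);
print join("\t", @{$_}{qw(name score start end)}), "\n" for @hits;
```

Each record could then be turned into a feature and added to a track in this order.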

Scott


On Thu, 2007-08-23 at 12:36 +0530, Shameer Khadar wrote:
> Dear All,
> 
> I will make my question simple :
> Is there any way to force the 'Bio::graphics' module to print only one
> glyph in a track ?
> 
> PS. More Detailed explanation is in my earlier mail (Dont want to spam the
> community with my same mail)
> 
> Eagerly waiting for a reply.
> Thanks,
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From cjfields at uiuc.edu  Thu Aug 23 10:14:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 23 Aug 2007 09:14:51 -0500
Subject: [Bioperl-l] extra rel. 1.6 suggestion
Message-ID: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>

Some interesting points by Sendu:

http://www.bioperl.org/wiki/Release_Schedule#Need_tests

which I agree with completely.

Maybe the best way out of this is a variation on something that was  
suggested before, which was 'splitting' the code into groups.  What  
if we set up a way to automatically gauge test coverage,  
documentation, etc.?  If I remember correctly Nathan had something  
running at one point which did this.

If so, we could determine which code is potentially 'non-compliant'  
and needs to be fixed (tests added, docs brought up to spec, so on),  
and thus prioritize at the minimum what needs to be done for a 1.6  
release.  If it's deemed not worth worrying about (no active  
development, author is out of contact, we have more important  
priorities) we split that code off into a separate 'dev' package.   
That would save some of the headache of trying to split maintenance  
of ~1000 modules up on only a few devs.

Thoughts?

chris


From bix at sendu.me.uk  Thu Aug 23 10:57:21 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 23 Aug 2007 15:57:21 +0100
Subject: [Bioperl-l] extra rel. 1.6 suggestion
In-Reply-To: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>
References: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>
Message-ID: <46CDA051.40408@sendu.me.uk>

Chris Fields wrote:
> Maybe the best way out if this is a variation on something that was  
> suggested before, which was 'splitting' the code into groups.  What  
> if we set up a way to automatically gauge test coverage,  
> documentation, etc.?  If I remember correctly Nathan had something  
> running at one point which did this.

You can generate this yourself by doing
./Build testcover

Mauricio was going to sort out having this run daily with the results 
displayed on the website... Mauricio?

The major 'annoyance' is that the coverage results don't get generated 
if any test fails. But they shouldn't be failing anyway ;)


From cain.cshl at gmail.com  Thu Aug 23 15:53:37 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 23 Aug 2007 15:53:37 -0400
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
Message-ID: <1187898817.2562.19.camel@localhost.localdomain>

Hi Chris,

GBrowse would be unaffected by this as it doesn't use
Bio::SeqFeature::Annotated.  The GMOD GFF3 Chado loader on the other
hand will almost certainly break horribly, as it depends on the strong
typing of Bio::FeatureIO/Bio::SeqFeature::Annotated.  If you could try
your ideas out in a branch that I could checkout and test on, that would
be good.

Thanks,
Scott


On Wed, 2007-08-22 at 23:53 -0500, Chris Fields wrote:
> As many of the devs know, there are a number of Feature/Annotation  
> issues that need to be resolved prior to a 1.6 release:
> 
> http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.2FAnnotation_changes:_Keep_or_roll_back.3F
> 
> There has been little work done over the last 2 1/2 years to undo or  
> rectify problems associated with those additions; I feel like those  
> of us still routinely contributing have been left holding the bag.   
> There has also been very little attempt to document any of this  
> adequately enough; as an example see POD for  
> Bio::SeqFeature::Annotated (what little there is).
> 
> I would like to suggest the radical idea of rolling back AnnotatableI/ 
> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags  
> are simple scalars) and possibly work in implementing Ewan's  
> SeqFeature::TypedSeqFeatureI for those who want strong data types  
> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various  
> AnnotatableI changes, odd inheritance, and operator overloading have  
> really obfuscated the code to the point where no one wants to touch  
> it in case it breaks something important.  However, I believe it is  
> the one serious impediment to a new stable release.
> 
> My thought is we simplify all the relevant interfaces, essentially  
> reverting back to rel 1.4.  For instance, we move the various  
> Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.   
> Bio::SeqFeature::Annotated would implement Bio::AnnotatableI  
> directly, and (if needed) also implement  
> Bio::SeqFeature::TypedSeqFeatureI, so the impetus is on  
> Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI  
> methods correctly, just as any other class would when implementing an  
> abstract interface.  I have played around with this a bit and managed  
> to get most tests working again for Bio::SeqFeature::Generic and  
> FeatureIO but a number of others break.
> 
> If needed I can try this out on a branch (a bit ironic, since the  
> changes instigating this mess should have been tested on a branch!).   
> Maybe this will get the ball rolling towards a 1.6 release.  Any  
> thoughts?
> 
> chris
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From N.Haigh at sheffield.ac.uk  Thu Aug 23 16:32:12 2007
From: N.Haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 23 Aug 2007 21:32:12 +0100
Subject: [Bioperl-l] extra rel. 1.6 suggestion
In-Reply-To: <46CDA051.40408@sendu.me.uk>
References: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>
	<46CDA051.40408@sendu.me.uk>
Message-ID: <1187901132.46cdeeccce68d@webmail.shef.ac.uk>

Quoting Sendu Bala <bix at sendu.me.uk>:

> Chris Fields wrote:
> > Maybe the best way out if this is a variation on something that was  
> > suggested before, which was 'splitting' the code into groups.  What  
> > if we set up a way to automatically gauge test coverage,  
> > documentation, etc.?  If I remember correctly Nathan had something  
> > running at one point which did this.
> 
> You can generate this yourself by doing
> ./Build testcover

What I did was to patch Devel::Cover to include JavaScript to allow sorting
of the results by clicking a header in the table. This way, it was easier to
find those modules with poor POD coverage, or a poor score on any other
coverage metric. The developer(s) of Devel::Cover are introducing this into
their next release, but who knows when that release will be. I could provide
a diff, but we may be able to check out Devel::Cover from cvs/svn until the
0.62 release is made.

> 
> Mauricio was going to sort out having this run daily with the results 
> displayed on the website... Mauricio?
> 
> The major 'annoyance' is that the coverage results don't get generated 
> if any test fails. But they shouldn't be failing anyway ;)
> 


From cjfields at uiuc.edu  Thu Aug 23 17:33:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 23 Aug 2007 16:33:25 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <1187898817.2562.19.camel@localhost.localdomain>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
	<1187898817.2562.19.camel@localhost.localdomain>
Message-ID: <38B989E4-34CA-42CD-A608-9D2A095E7ADF@uiuc.edu>

Scott,

So far most of FeatureIO.t passes, with only a few exceptions dealing  
with the from_feature method (I know what the problem is there).  A  
large number of other tests crash horribly (not so surprising), so  
I'll have to go through those.  Ergo any changes and testing will  
definitely be conducted on a branch then merged back to main trunk  
once everything is okay.  I'll probably start a branch in the next  
few days or so.

Here's what I have been working on so far, which I think is reasonable:

1) Move all *_tag_* related methods out of Bio::AnnotatableI and into  
Bio::SeqFeature::Annotatable.

2) Reinstate the same tag methods in Bio::SeqFeatureI and remove  
Bio::AnnotatableI from the inheritance tree.

3) Make Bio::SeqFeature::Annotatable a Bio::AnnotatableI (which it  
already was, strangely enough).  Now it simply implements the proper  
methods from the interface classes SeqFeatureI and AnnotatableI.

4) Revert Bio::SeqFeature::Generic tags back to simple untyped  
strings (reimplement the 1.4 rel methods).
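
To make "simple untyped strings" concrete, here is a minimal sketch of tag storage with no annotation objects involved. The class My::SimpleFeature is hypothetical and is not BioPerl code. The method names are modeled on the tag methods discussed in this thread, but the signatures and return values shown are assumptions.

```perl
use strict;
use warnings;

# Hypothetical class for illustration only: tags are kept as plain strings
# in a hash of arrays, with no typed annotation objects.
package My::SimpleFeature;

sub new {
    my ($class) = @_;
    return bless { tags => {} }, $class;
}

# Store one or more plain string values under a tag.
sub add_tag_value {
    my ($self, $tag, @values) = @_;
    push @{ $self->{tags}{$tag} }, @values;
    return;
}

sub has_tag {
    my ($self, $tag) = @_;
    return exists $self->{tags}{$tag} ? 1 : 0;
}

# Return the values of a tag as a list of strings (empty if absent).
sub get_tag_values {
    my ($self, $tag) = @_;
    return @{ $self->{tags}{$tag} || [] };
}

sub get_all_tags {
    my ($self) = @_;
    return sort keys %{ $self->{tags} };
}

# Remove a tag and return the values it held.
sub remove_tag {
    my ($self, $tag) = @_;
    my $old = delete $self->{tags}{$tag};
    return @{ $old || [] };
}

package main;

my $feat = My::SimpleFeature->new;
$feat->add_tag_value(gene => 'apoE');
$feat->add_tag_value(note => 'first', 'second');
print join(',', $feat->get_tag_values('note')), "\n";
```

Because every value is an ordinary scalar, nothing has to be instantiated per tag, which is the property the performance question below depends on.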

I'm interested in seeing whether this results in a significant  
performance increase in SeqIO since the Annotation instantiation is  
removed.

ToDo: I plan on removing the operator overloading in Bio::Annotation,  
which was a serious sticking point with most of the devs.  This won't  
be done until after tests pass for everything else.

What we will need at some point which I can't provide:  
Bio::SeqFeature::Annotated has no docs (no synopsis, no  
description).  The reason I bring this up is Sendu and I are  
seriously considering running automated code audits in order to  
gauge which modules lack docs, test coverage, etc.  We're likely  
splitting those without adequate test/doc coverage off into a  
separate 'dev' release.

chris

On Aug 23, 2007, at 2:53 PM, Scott Cain wrote:

> Hi Chris,
>
> GBrowse would be unaffected by this as it doesn't use
> Bio::SeqFeature::Annotated.  The GMOD GFF3 Chado loader on the other
> hand will almost certainly break horribly, as it depends on the strong
> typing of Bio::FeatureIO/Bio::SeqFeature::Annotated.  If you could try
> your ideas out in a branch that I could checkout and test on, that  
> would
> be good.
>
> Thanks,
> Scott
>
>
> On Wed, 2007-08-22 at 23:53 -0500, Chris Fields wrote:
>> As many of the devs know, there are a number of Feature/Annotation
>> issues that need to be resolved prior to a 1.6 release:
>>
>> http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.
>> 2FAnnotation_changes:_Keep_or_roll_back.3F
>>
>> There has been little work done over the last 2 1/2 years to undo or
>> rectify problems associated with those additions; I feel like those
>> of us still routinely contributing have been left holding the bag.
>> There has also been very little attempt to document any of this
>> adequately enough; as an example see POD for
>> Bio::SeqFeature::Annotated (what little there is).
>>
>> I would like to suggest the radical idea of rolling back  
>> AnnotatableI/
>> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
>> are simple scalars) and possibly work in implementing Ewan's
>> SeqFeature::TypedSeqFeatureI for those who want strong data types
>> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various
>> AnnotatableI changes, odd inheritance, and operator overloading have
>> really obfuscated the code to the point where no one wants to touch
>> it in case it breaks something important.  However, I believe it is
>> the one serious impediment to a new stable release.
>>
>> My thought is we simplify all the relevant interfaces, essentially
>> reverting back to rel 1.4.  For instance, we move the various
>> Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.
>> Bio::SeqFeature::Annotated would implement Bio::AnnotatableI
>> directly, and (if needed) also implement
>> Bio::SeqFeature::TypedSeqFeatureI, so the impetus is on
>> Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI
>> methods correctly, just as any other class would when implementing an
>> abstract interface.  I have played around with this a bit and managed
>> to get most tests working again for Bio::SeqFeature::Generic and
>> FeatureIO but a number of others break.
>>
>> If needed I can try this out on a branch (a bit ironic, since the
>> changes instigating this mess should have been tested on a branch!).
>> Maybe this will get the ball rolling towards a 1.6 release.  Any
>> thoughts?
>>
>> chris
>>
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From smarkel at accelrys.com  Thu Aug 23 17:59:37 2007
From: smarkel at accelrys.com (Scott Markel)
Date: Thu, 23 Aug 2007 14:59:37 -0700
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <38B989E4-34CA-42CD-A608-9D2A095E7ADF@uiuc.edu>
Message-ID: <OF1E1ED913.3FB67C57-ON88257340.00785855-88257340.0078D192@accelrys.com>

Chris,

Pipeline Pilot's Sequence Analysis Collection wraps BioPerl.
Once you think the branch changes have converged a bit we'd
be happy to try running our regression suite and report what
we find.

Scott

Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel at accelrys.com
Accelrys, Inc.                      mobile: +1 858 205 3653
10188 Telesis Court, Suite 100      voice:  +1 858 799 5603
San Diego, CA 92121                 fax:    +1 858 799 5222
USA                                 web:    http://www.accelrys.com




From cjfields at uiuc.edu  Thu Aug 23 20:39:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 23 Aug 2007 19:39:30 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <OF1E1ED913.3FB67C57-ON88257340.00785855-88257340.0078D192@accelrys.com>
References: <OF1E1ED913.3FB67C57-ON88257340.00785855-88257340.0078D192@accelrys.com>
Message-ID: <241563BB-F96A-4631-B504-F73699FDE84B@uiuc.edu>

Having an independent test would be great!  The reason I suggest  
there may be a speedup: one complaint popping up after 1.5 was the  
slowdown in sequence parsing, which could be related to the 'heavier'  
objectified tags.

chris

On Aug 23, 2007, at 4:59 PM, Scott Markel wrote:

> Chris,
>
> Pipeline Pilot's Sequence Analysis Collection wraps BioPerl.
> Once you think the branch changes have converged a bit we'd
> be happy to try running our regression suite and report what
> we find.
>
> Scott
>
> Scott Markel, Ph.D.
> Principal Bioinformatics Architect  email:  smarkel at accelrys.com
> Accelrys, Inc.                      mobile: +1 858 205 3653
> 10188 Telesis Court, Suite 100      voice:  +1 858 799 5603
> San Diego, CA 92121                 fax:    +1 858 799 5222
> USA                                 web:    http://www.accelrys.com
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Thu Aug 23 23:34:12 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 23 Aug 2007 23:34:12 -0400
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
Message-ID: <CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>


On Aug 23, 2007, at 12:53 AM, Chris Fields wrote:

> There has been little work done over the last 2 1/2 years to undo or
> rectify problems associated with those additions; I feel like those
> of us still routinely contributing have been left holding the bag.

Not by intention, but unfortunately that's probably a fair  
assessment. (And I'm one of those guilty of inaction.)

> [...]
> I would like to suggest the radical idea of rolling back AnnotatableI/
> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
> are simple scalars) and possibly work in implementing Ewan's
> SeqFeature::TypedSeqFeatureI for those who want strong data types
> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).

I fully support this; to me that sounds exactly like the way to go.

> The various AnnotatableI changes, odd inheritance, and operator  
> overloading have
> really obfuscated the code to the point where no one wants to touch
> it in case it breaks something important.  However, I believe it is
> the one serious impediment to a new stable release.

Yes, I think you're hitting the nail on the head.

Chris, if you take the lead on this and carry it through we will all  
owe you hugely. I'm not sure how many beers that would compare to,  
but I'll throw in some. (Who else do I owe beer? I'm losing track.  
Strangely nobody tried to redeem beer from me in Vienna. Maybe in  
Toronto?)

Seriously, rectifying this problem would lift a huge weight.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From florent.angly at gmail.com  Fri Aug 24 00:43:23 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Thu, 23 Aug 2007 21:43:23 -0700
Subject: [Bioperl-l] Is it possible to do contig alignments?
Message-ID: <46CE61EB.5000300@gmail.com>

Dear list members,

I would like to "produce" an alignment of a contig, or more exactly 
visualize it in such a fashion, based on the aligned sequences provided 
to me by a sequence assembler:

Consensus: ACGTACGTTG
Sequence1: ACG-AC
Sequence2:  CGTACGT
Sequence3:     AC-TTG

It sounds like a very trivial task but after searching for a long time, 
it seems impossible using the methods BioPerl provides.

Using the Bio::Align classes, it seems like the only way is if the 
sequences have the same aligned length, i.e. like this:

Consensus: ACGTACGTTG
Sequence1: ACG-AC----
Sequence2: -CGTACGT--
Sequence3: ----AC-TTG

It's not very satisfactory if I have to pad the sequences with gaps 
manually. In the context of a phylogenetic alignment, it might make 
sense, but not for contigs.

For assemblies whole sequences are mapped on contigs. Bio::LocatableSeq 
does not help here because it defines locations _within_ the sequence 
(the name LocatableSeq was pretty misleading to me).

I think it's all very strange that contigs have the coordinates of the 
aligned sequences composing them but there is no straightforward way to 
exploit this information.

So what's the bottom line? Am I missing something obvious, an 
out-of-the-box solution? Is it a "missing feature" of BioPerl that is 
planned to be implemented in the future or that should be requested? 
Should I pad my sequences with dashes or spaces after assembly? Or is it 
expected that my aligned reads coming from my assembly be padded with 
lots of gaps at their beginning and end? What's the BioPerl philosophy here?

Thanks for giving me pointers,

Florent


From bix at sendu.me.uk  Fri Aug 24 04:35:23 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 24 Aug 2007 09:35:23 +0100
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CE61EB.5000300@gmail.com>
References: <46CE61EB.5000300@gmail.com>
Message-ID: <46CE984B.3060701@sendu.me.uk>

Florent Angly wrote:
> Dear list members,
> 
> I would like to "produce" an alignment of a contig, or more exactly 
> visualize it in such a fashion, based on the aligned sequences provided 
> to me by a sequence assembler:
> 
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC
> Sequence2:  CGTACGT
> Sequence3:     AC-TTG
> 
> It sounds like a very trivial task but after searching for a long time, 
> it seems impossible using the methods BioPerl provides.

Isn't Bio::Assembly::Contig what you need?

http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Assembly/Contig.html


From zhaodj at ioz.ac.cn  Fri Aug 24 05:34:07 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Fri, 24 Aug 2007 17:34:07 +0800 (CST)
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CE61EB.5000300@gmail.com>
References: <46CE61EB.5000300@gmail.com>
Message-ID: <51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>

On Fri, Aug 24, 2007 12:43, Florent Angly wrote:
> Dear list members,
>
> I would like to "produce" an alignment of a contig, or more exactly
> visualize it in such a fashion, based on the aligned sequences
> provided to me by a sequence assembler:
>
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC
> Sequence2:  CGTACGT
> Sequence3:     AC-TTG
>
> It sounds like a very trivial task but after searching for a long
> time, it seems impossible using the methods BioPerl provides.
>
> Using the Bio::Align classes, it seems like the only way is if the
> sequences have the same aligned length, i.e. like this:
>
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC----
> Sequence2: -CGTACGT--
> Sequence3: ----AC-TTG
>
> It's not very satisfactory if I have to pad the sequences with gaps
> manually. In the context of a phylogenetic alignment, it might make
> sense, but not for contigs.

How do you pad the sequences with gaps manually? Just replace the
hyphens with blanks? If yes, you can program in perl to automate
this process.

> For assemblies whole sequences are mapped on contigs.
> Bio::LocatableSeq does not help here because it defines locations
> _within_ the sequence (the name LocatableSeq was pretty misleading
> to me).
>
> I think it's all very strange that contigs have the coordinates of
> the aligned sequences composing them but there is no straightforward
> way to exploit this information.
>
> So what's the bottom line? Am I missing something obvious, an
> out-of-the-box solution? Is it a "missing feature" of BioPerl that
> is planned to be implemented in the future or that should be
> requested? Should I pad my sequences with dashes or spaces after
> assembly? Or is it expected that my aligned reads coming from my
> assembly be padded with lots of gaps at their beginning and end?
> What's the BioPerl philosophy here?
>
> Thanks for giving me pointers,
>
> Florent


-- 
De-Jian Zhao
Institute of Zoology,Chinese Academy of Sciences
+86-10-64807217
zhaodj at ioz.ac.cn


From marian.thieme at arcor.de  Fri Aug 24 06:05:55 2007
From: marian.thieme at arcor.de (Marian Thieme)
Date: Fri, 24 Aug 2007 12:05:55 +0200
Subject: [Bioperl-l] ReseqChip, module/package name
Message-ID: <46CEAD83.2050904@arcor.de>

Hi,

2 questions about the naming of the module I submitted
(see http://bugzilla.open-bio.org/show_bug.cgi?id=2332)

1.) The package:
because an expression package already exists, I suggest creating a
new package called resequencing

2.) I would suggest that the module is called RedundantFragments or
AdditionalFragments

so we would have something like:

Bio::Resequencing::AdditionalFragments

Any other ideas ?

Marian

By the way, can anybody change my email address to marian.thieme at arcor.de
in Bugzilla as well as on the bioperl list, please? I didn't manage
to do that on my own...


From mcons004 at fiu.edu  Thu Aug 23 23:30:44 2007
From: mcons004 at fiu.edu (mcons004 at fiu.edu)
Date: Thu, 23 Aug 2007 23:30:44 -0400 (EDT)
Subject: [Bioperl-l] please some help
Message-ID: <20070823233044.BJQ45014@mailstore2.fiu.edu>

  Hello,
     I am new to this software and I am having some trouble starting. The version of Bioperl I am working on is v5.8.6. My OS is Unix (Mac OS X). I am trying to use Bioperl with a file called blastParser to process a file which is the output of a "blastall" operation.
  
 The command that gives me an error is:
> perl blastParser.pl junk.out 1 1 1.0
 and the error message says:
Can't locate Bio/SearchIO.pm in @INC (@INC contains: /System/Library/Perl/5.8.6/darwin-thread-multi-2level

 Your online info says this probably means that the module Bio::SearchIO.pm is not installed and that I can either install Bundle::Bioperl or install that specific module by hand. Could you give me some tips on this? I am new to working with Unix and Bioperl, so I am a little confused. Any information will be helpful for me. Thanks


From bix at sendu.me.uk  Fri Aug 24 10:38:39 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 24 Aug 2007 15:38:39 +0100
Subject: [Bioperl-l] please some help
In-Reply-To: <20070823233044.BJQ45014@mailstore2.fiu.edu>
References: <20070823233044.BJQ45014@mailstore2.fiu.edu>
Message-ID: <46CEED6F.1080101@sendu.me.uk>

mcons004 at fiu.edu wrote:
> Hello, I am new to this software and I am having some trouble
> starting. The version of Bioperl I am working on is v5.8.6. My OS is
> Unix (Mac OS X). I am trying to use Bioperl with a file called
> blastParser to process a file which is the output of a "blastall"
> operation.
> 
> The code that gives me error is:
>> perl blastParser.pl junk.out 1 1 1.0
> and the error message says: Can't locate Bio/SearchIO.pm in @INC
> (@INC contains: /System/Library/Perl/5.8.6/darwin-thread-multi-2level
> 
> 
> Your online info says this probably means that the module
> Bio::SearchIO.pm is not installed and that I can either install
> Bundle::Bioperl or install that specific module by hand. Could you
> give me some tips on this? I am new to working with Unix and Bioperl,
> so I am a little confused.

You need to install Bioperl first. You can find instructions here:
http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix

If this is your own Mac (you have the root/admin password), when it 
tells you to run cpan (">perl -MCPAN -e shell" or ">cpan"), start the 
command with 'sudo'. So:

 >sudo cpan
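
If you want to confirm afterwards that the install worked, a small
script along these lines may help. It is only a sketch using core Perl:
module_file and module_available are helper names made up for this
example, and the script defaults to checking Bio::SearchIO when no
module name is given on the command line.

```perl
use strict;
use warnings;

# Turn a module name into the relative path perl looks for,
# e.g. Bio::SearchIO -> Bio/SearchIO.pm
sub module_file {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return $file;
}

# Return 1 if the named module can be loaded from the current @INC.
sub module_available {
    my ($module) = @_;
    my $file = module_file($module);
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (@ARGV ? @ARGV : ('Bio::SearchIO')) {
    if (module_available($module)) {
        my $path = $INC{ module_file($module) };
        print "$module found at $path\n";
    }
    else {
        # Same situation as the "Can't locate ... in \@INC" error:
        # list the directories perl searched.
        print "$module not found; perl searched:\n";
        print "  $_\n" for @INC;
    }
}
```

If the module is still reported as missing, the directory list shows
whether perl is looking where cpan installed things.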


From florent.angly at gmail.com  Fri Aug 24 12:07:04 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Fri, 24 Aug 2007 09:07:04 -0700
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
References: <46CE61EB.5000300@gmail.com>
	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
Message-ID: <46CF0228.2000404@gmail.com>

Thanks for all the replies.

Sendu Bala wrote:

> Isn't Bio::Assembly::Contig what you need?
>
> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Assembly/Contig.html
>
I'm using this module already to manipulate the contigs, but there's no
option that I know of to _display_ the contigs in the way I described.
(Sorry, the title of my email was misleading.)


De-Jian,ZHAO wrote:
> How do you pad the sequences with gaps manually? Just replace the
> hyphens with blanks? If yes, you can program in perl to automate
> this process.
>   
How do I pad the sequences manually? I calculate how many gaps have to
go left and right of the aligned sequence based on its length, its
position in the aligned consensus and the consensus length:
my $newseq = ('-' x $leftnum) . $seq . ('-' x $rightnum);
By the way, the sequences cannot be stored with blanks in them...
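
Spelled out as a small self-contained script, the padding step looks
roughly like this. It is a sketch in plain Perl with no BioPerl calls;
pad_read is a name made up for this example, and it assumes the start
position is the 1-based column of the read's first base in the gapped
consensus.

```perl
use strict;
use warnings;

# Pad an aligned read with gap characters so it spans the whole consensus.
# $start is the 1-based column of the read's first base in the consensus.
sub pad_read {
    my ($seq, $start, $consensus_length) = @_;
    my $leftnum  = $start - 1;
    my $rightnum = $consensus_length - $leftnum - length($seq);
    die "read does not fit inside the consensus\n"
        if $leftnum < 0 || $rightnum < 0;
    return ('-' x $leftnum) . $seq . ('-' x $rightnum);
}

my $consensus = 'ACGTACGTTG';
my @reads = (
    [ Sequence1 => 'ACG-AC',  1 ],
    [ Sequence2 => 'CGTACGT', 2 ],
    [ Sequence3 => 'AC-TTG',  5 ],
);

print "Consensus: $consensus\n";
for my $read (@reads) {
    my ($name, $seq, $start) = @$read;
    print "$name: ", pad_read($seq, $start, length($consensus)), "\n";
}
```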

I think the best way to provide an out-of-the-box solution for
displaying contigs the described way would be to _not_ use Bio::Align at
all, but rather to create a new assembly IO module like
Bio::Assembly::IO::simpleout for example. That would be useful.
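
What such a writer would print can be sketched without Bio::Align or
any other BioPerl class. The code below is only an illustration of the
output format; layout_contig is a hypothetical helper, and the read
names, sequences and 1-based start columns are passed in by hand rather
than taken from an assembly object.

```perl
use strict;
use warnings;

# Build the text layout: one labelled line per read, indented with
# spaces to its 1-based start column in the consensus.
sub layout_contig {
    my ($consensus, @reads) = @_;

    # Width of the longest label, so all sequences start in one column.
    my $width = length('Consensus');
    for my $read (@reads) {
        my $len = length($read->[0]);
        $width = $len if $len > $width;
    }

    my @lines = (sprintf('%-*s %s', $width + 1, 'Consensus:', $consensus));
    for my $read (@reads) {
        my ($name, $seq, $start) = @$read;
        push @lines, sprintf('%-*s %s%s',
            $width + 1, "$name:", ' ' x ($start - 1), $seq);
    }
    return join("\n", @lines) . "\n";
}

print layout_contig(
    'ACGTACGTTG',
    [ Sequence1 => 'ACG-AC',  1 ],
    [ Sequence2 => 'CGTACGT', 2 ],
    [ Sequence3 => 'AC-TTG',  5 ],
);
```

Because the indentation is done with spaces at print time only, the
stored sequences never need to contain blanks.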

The reason I wanted to visualize these contigs is because I made a
Bio::Assembly::IO module for TIGR Assembler files that I intend on
submitting to BioPerl. I wanted to make sure first that I did not have
any obvious bug in my contig coordinates. I've read the documentation on
the Wiki so if a BioPerl developer would please like lo step up and
contact me directly for checking my code, that would be nice =)

Florent


From cjfields at uiuc.edu  Fri Aug 24 12:07:36 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 11:07:36 -0500
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip, module/package name
In-Reply-To: <46CEAD83.2050904@arcor.de>
References: <46CEAD83.2050904@arcor.de>
Message-ID: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>

Marian,

First, apologies about not getting on this sooner.  It's shaping up  
to be a busy year!

The new package: How about Bio::Expression::Tools::MitoChip?  My
reasoning: I don't think it's necessary to define a new
Bio::Resequencing namespace for just one module; my inclination is
towards using the Bio::Expression namespace, as Bio::Tools has
traditionally been reserved for output parsers.  I am unsure what the
Bio::Expression status is (very little is documented, no tests are
written, nothing in the mail list archives); maybe Allen can answer
that?  I don't see anything that precludes you from using that
namespace as long as your tools are fairly well-defined (they are)
and have tests (they do).

Also, your module deals with doing one specific thing (extraction and  
incorporation of information about redundant fragments) for the Affy  
MitoChip.  It might be worth genericizing the class a bit so that you  
can add new parser or analysis methods w/o having to define new  
classes to deal with the same Mitochip data.

Mail list: The mail list subscription page
(http://bioperl.org/mailman/listinfo/bioperl-l) allows you to subscribe
or change subscription options (at the bottom of the page).

Bugzilla: if you are logged into Bugzilla under your old email, there  
is an option at the bottom of the page (Edit : Prefs) where you can  
change your email address and other preferences.

chris

On Aug 24, 2007, at 5:05 AM, Marian Thieme wrote:

> Hi,
>
> 2 questions about the naming of the module I did submit
> (see http://bugzilla.open-bio.org/show_bug.cgi?id=2332)
>
> 1.) The package:
> because there exists already an expression package I suggest to  
> create a
> new package called resequencing
>
> 2.) I would suggest that the module is called RedundantFragments or
> AdditionalFragments
>
> so we would have something like:
>
> Bio::Resequencing::AdditionalFragments
>
> Any other ideas ?
>
> Marian
>
> By the way, can anybody change my email address to
> marian.thieme at arcor.de
> in Bugzilla as well as on the BioPerl list, please? I didn't manage
> to do that on my own...
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Aug 24 12:23:12 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 11:23:12 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
	<CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>
Message-ID: <4F5FD173-FC80-4F70-B294-83DA58FDCE64@uiuc.edu>

On Aug 23, 2007, at 10:34 PM, Hilmar Lapp wrote:

> On Aug 23, 2007, at 12:53 AM, Chris Fields wrote:
>
>> There has been little work done over the last 2 1/2 years to undo or
>> rectify problems associated with those additions; I feel like those
>> of us still routinely contributing have been left holding the bag.
>
> Not by intention, but unfortunately that's probably a fair  
> assessment. (And I'm one of those guilty of inaction.)

Not completely.  You, Jason, Chris M., and several others expressed  
yourselves quite clearly (move the code to a branch and test).  I  
think that everyone was trying to be diplomatic about it and so never  
attempted to do anything except get it working correctly.

>> [...]
>> I would like to suggest the radical idea of rolling back  
>> AnnotatableI/
>> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
>> are simple scalars) and possibly work in implementing Ewan's
>> SeqFeature::TypedSeqFeatureI for those who want strong data types
>> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).
>
> I fully support this; to me that sounds exactly like the way to go.

Okay, I'll probably go ahead and get a branch started today.  I'll  
have to look at Ewan's interface in more detail; it's possible a new  
SeqFeature implementation will need to be written up to incorporate it.

>> The various AnnotatableI changes, odd inheritance, and operator  
>> overloading have
>> really obfuscated the code to the point where no one wants to touch
>> it in case it breaks something important.  However, I believe it is
>> the one serious impediment to a new stable release.
>
> Yes, I think you're hitting the nail on the head.
>
> Chris, if you take the lead on this and carry it through we will  
> all owe you hugely. I'm not sure how many beers that would compare  
> to, but I'll throw in some. (Who else do I owe beer? I'm losing  
> track. Strangely nobody tried to redeem beer from me in Vienna.  
> Maybe in Toronto?)
>
> Seriously, rectifying this problem would lift a huge weight.
>
> 	-hilmar

It would be nice to get regular releases started again.  I think  
this'll help.

chris


From marian.thieme at arcor.de  Fri Aug 24 13:01:07 2007
From: marian.thieme at arcor.de (Marian Thieme)
Date: Fri, 24 Aug 2007 19:01:07 +0200
Subject: [Bioperl-l] Bio::Expression & Re: ReseqChip, module/package name
Message-ID: <46CF0ED3.8000708@arcor.de>

> The new package: How about Bio::Expression::Tools::MitoChip?  My  
> reasoning: I don't think it's necessary to define a new  
> Bio::Resequencing namespace for just one module; my inclination is  
> towards using Bio::Expression namespace as Bio::Tools have been  
> traditionally reserved for output parsers.  I am unsure what the  
> Bio::Expression status is (very little is documented, no tests are  
> written, nothing on the mail list archives); maybe Allen can answer  
> that?  I don't see anything that precludes you from using that  
> namespace as long as your tools are fairly well-defined (they are)  
> and have tests (they do).

The problem I see with Bio::Expression is that resequencing chips do not
belong with expression chips.
(Expression chips are designed to hybridize RNA strands and hence measure
RNA expression levels; a resequencing chip, on the other hand, is based on
DNA, and its design and probe length are quite different.)
So, from my point of view, it makes sense to distinguish between DNA and
RNA chips, at least.

>
> Also, your module deals with doing one specific thing (extraction and  
> incorporation of information about redundant fragments) for the Affy  
> MitoChip.  It might be worth genericizing the class a bit so that you  
> can add new parser or analysis methods w/o having to define new  
> classes to deal with the same Mitochip data.

OK, need to think about that.

>
> Mail list: The mail list subscription page
> (http://bioperl.org/mailman/listinfo/bioperl-l) allows you to subscribe
> or change subscription options (at the bottom of the page).
>
cleared

> Bugzilla: if you are logged into Bugzilla under your old email, there  
> is an option at the bottom of the page (Edit : Prefs) where you can  
> change your email address and other preferences.
>
Unfortunately I don't receive a mail to confirm the change. I tried that
twice.


Marian


From bix at sendu.me.uk  Fri Aug 24 12:43:22 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 24 Aug 2007 17:43:22 +0100
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CF0228.2000404@gmail.com>
References: <46CE61EB.5000300@gmail.com>	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
	<46CF0228.2000404@gmail.com>
Message-ID: <46CF0AAA.4090301@sendu.me.uk>

Florent Angly wrote:
> Thanks for all the replies.
> 
> Sendu Bala wrote:
> 
>> Isn't Bio::Assembly::Contig what you need?
>
> I'm using this module already to manipulate the contigs, but there's 
> no option that I know of to _display_ the contigs in the way I 
> described.
[snip]
> I think the best way to provide an out-of-the-box solution for 
> displaying contigs the described way would be to _not_ use Bio::Align
> at all, but rather to create a new assembly IO module like 
> Bio::Assembly::IO::simpleout for example. That would be useful.

Yes...


> The reason I wanted to visualize these contigs is because I made a 
> Bio::Assembly::IO module for TIGR Assembler files that I intend on 
> submitting to BioPerl.

That's wonderful... might I cheekily suggest that the solution to your
problem is to extend your IO module so that it does the 'O' as well? I.e.,
unlike in the other IO modules, write_assembly() would actually be
implemented. Then you can round-trip to ensure your next_assembly() method
has no bugs.
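
A minimal sketch of such a round-trip check, in plain Perl and without
BioPerl. The tab-separated "id, start, sequence" format and the
parse_contig()/write_contig() helpers are hypothetical stand-ins for
next_assembly() and write_assembly(); they are not taken from any real
module.

```perl
use strict;
use warnings;

# Hypothetical toy format: one read per line, "id<TAB>start<TAB>sequence".
# Stand-in for the reader half (next_assembly) of an IO module.
sub parse_contig {
    my ($text) = @_;
    my @reads;
    for my $line (split /\n/, $text) {
        next unless length $line;
        my ($id, $start, $seq) = split /\t/, $line;
        push @reads, { id => $id, start => $start, seq => $seq };
    }
    return \@reads;
}

# Stand-in for the writer half (write_assembly) of the same module.
sub write_contig {
    my ($reads) = @_;
    my $out = '';
    $out .= join("\t", @{$_}{qw(id start seq)}) . "\n" for @$reads;
    return $out;
}

# Round trip: read, write, and compare the result with the input text.
sub round_trips {
    my ($text) = @_;
    return write_contig(parse_contig($text)) eq $text ? 1 : 0;
}

my $input = "read1\t1\tACGTAC\nread2\t3\tGTACGG\n";
print round_trips($input) ? "round trip ok\n" : "round trip FAILED\n";
```

A byte-for-byte comparison like this only works when the input is already
in the writer's canonical layout; for a real format it may be safer to
compare the parsed coordinates of the first and second pass instead.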


> I've read the documentation on the Wiki, so if a BioPerl developer
> would like to step up and contact me directly to check my
> code, that would be nice =)

If no one does, post it as an enhancement request to bugzilla. A test
script is a must.

http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests


From cjfields at uiuc.edu  Fri Aug 24 13:16:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 12:16:10 -0500
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CF0228.2000404@gmail.com>
References: <46CE61EB.5000300@gmail.com>
	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
	<46CF0228.2000404@gmail.com>
Message-ID: <32D5D3FF-D0A5-4EEB-BA5E-B0087CC64B19@uiuc.edu>


On Aug 24, 2007, at 11:07 AM, Florent Angly wrote:
...

> De-Jian,ZHAO wrote:
>> How do you pad the sequences with gaps manually? Just replace the
>> hyphens with blanks? If yes, you can program in perl to automate
>> this process.
>>
> How do I pad the sequences manually? I calculate how many gaps have to
> go left and right of the aligned sequence based on its length, its
> position in the aligned consensus and the consensus length:
>   my $newseq = '-' x $leftnum . $seq . '-' x $rightnum;
> By the way, the sequences cannot be stored with blanks in them...
>
> I think the best way to provide an out-of-the-box solution for
> displaying contigs the described way would be to _not_ use Bio::Align
> at all, but rather to create a new assembly IO module like
> Bio::Assembly::IO::simpleout for example. That would be useful.
>
> The reason I wanted to visualize these contigs is because I made a
> Bio::Assembly::IO module for TIGR Assembler files that I intend on
> submitting to BioPerl. I wanted to make sure first that I did not have
> any obvious bug in my contig coordinates. I've read the documentation
> on the Wiki, so if a BioPerl developer would like to step up and
> contact me directly to check my code, that would be nice =)
>
> Florent

A similar question has been previously asked on the same subject:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2827/focus=2869

Jason's suggestion was to have a Bio::Assembly::Contig method get_aln()
which produces a Bio::SimpleAlign object containing appropriately
padded seqs compatible for AlignIO output.  However, the method was
never implemented.

Personally, the way I would try going about this would be to  
implement the Contig::get_aln() method, padding with bioperl-compliant
alignment gap symbols (currently -.*?=~), so if anyone
wanted they could write to any AlignIO-implemented format (MSF,  
Clustal, etc).  In your Bio::Assembly::IO::simpleout module implement  
write_assembly() and use the Contig::get_aln() method where needed to  
grab the SimpleAlign, then simply substitute gap symbols with spaces  
when writing contig output.
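
The padding and display steps can be sketched in plain Perl, without any
BioPerl objects. This is only an illustration under stated assumptions:
pad_read() and display_line() are hypothetical helper names, not existing
methods, and the read's start is assumed to be a 1-based column in the
consensus. The sketch blanks only the flanking padding so that gaps inside
a read stay visible; that is a choice made for this example.

```perl
use strict;
use warnings;

# Pad one read with gap characters so that it spans the whole consensus.
# $start is the 1-based consensus column of the read's first base
# (an assumed convention for this sketch).
sub pad_read {
    my ($seq, $start, $consensus_len) = @_;
    my $left  = $start - 1;
    my $right = $consensus_len - $left - length($seq);
    die "read does not fit inside the consensus\n"
        if $left < 0 || $right < 0;
    return ('-' x $left) . $seq . ('-' x $right);
}

# For display only: replace the leading and trailing padding with blanks,
# leaving any gap symbols inside the read untouched.
sub display_line {
    my ($padded) = @_;
    $padded =~ s/^(-+)/' ' x length($1)/e;
    $padded =~ s/(-+)$/' ' x length($1)/e;
    return $padded;
}

my $consensus = 'ACGTACGTAC';
print "$consensus\n";
for my $read (['ACGT', 1], ['GT-C', 3], ['TAC', 8]) {
    my ($seq, $start) = @$read;
    print display_line(pad_read($seq, $start, length($consensus))), "\n";
}
```

The padded strings (before display_line) are the ones that would go into an
alignment object; the blanked version is meant only for the contig printout.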

In general, any new code is attached to a bugzilla report as an  
enhancement request:

http://bugzilla.open-bio.org/

One of the devs will work on getting the code incorporated into
bioperl.  Make sure the code is documented
(http://www.bioperl.org/wiki/Advanced_BioPerl), and attach appropriate
tests (http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests) and
test data.

chris


From cjfields at uiuc.edu  Fri Aug 24 13:20:16 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 12:20:16 -0500
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
	module/package name
In-Reply-To: <9824900.1187973171940.JavaMail.ngmail@webmail17>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
	<46CEAD83.2050904@arcor.de>
	<9824900.1187973171940.JavaMail.ngmail@webmail17>
Message-ID: <A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>


On Aug 24, 2007, at 11:32 AM, marian.thieme at arcor.de wrote:

>> ...
> The problem I see with Bio::Expression is that resequencing chips
> do not belong with expression chips.
> (Expression chips are designed to hybridize RNA strands and hence
> measure RNA expression levels; a resequencing chip, on the other hand,
> is based on DNA, and its design and probe length are quite
> different.) So, from my point of view, it makes sense to distinguish
> between DNA and RNA chips, at least.

Then maybe the more generic Bio::Microarray namespace is the way to
go, with the module name Bio::Microarray::Tools::MitoChip.  Other
tools can be added as needed.

>> Also, your module deals with doing one specific thing (extraction and
>> incorporation of information about redundant fragments) for the Affy
>> MitoChip.  It might be worth genericizing the class a bit so that you
>> can add new parser or analysis methods w/o having to define new
>> classes to deal with the same Mitochip data.
>
> OK, need to think about that.

It all depends on how much you intend to contribute; if you plan on  
adding to it over time we can talk about starting up a developer  
account.

>> Mail list: The mail list subscription page
>> (http://bioperl.org/mailman/listinfo/bioperl-l) allows you to subscribe
>> or change subscription options (at the bottom of the page).
>>
> cleared
>
>> Bugzilla: if you are logged into Bugzilla under your old email, there
>> is an option at the bottom of the page (Edit : Prefs) where you can
>> change your email address and other preferences.
>>
> Unfortunately I don't receive a mail to confirm the change. I tried
> that twice.
>
>
> Marian

I tested it out and received the email at both addresses (as it
states).  If you respond to either email, it should implement the
change in three days' time.  If it doesn't, you can email support at
open.bio.org to see if there is a problem.

chris


From florent.angly at gmail.com  Fri Aug 24 13:58:13 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Fri, 24 Aug 2007 10:58:13 -0700
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <32D5D3FF-D0A5-4EEB-BA5E-B0087CC64B19@uiuc.edu>
References: <46CE61EB.5000300@gmail.com>
	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
	<46CF0228.2000404@gmail.com>
	<32D5D3FF-D0A5-4EEB-BA5E-B0087CC64B19@uiuc.edu>
Message-ID: <46CF1C35.3050100@gmail.com>

Chris Fields wrote:
>
> A similar question has been previously asked on the same subject:
>
> http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2827/focus=2869
>
> Jason's suggestion was to have a Bio::Assembly::Contig method 
> get_aln() which produces a Bio::SimpleAlign object containing 
> appropriately padded seqs compatible for AlignIO output.  However, the 
> method was never implemented.
>
> Personally, the way I would try going about this would be to implement 
> the Contig::get_aln() method, padding with bioperl-compliant alignment 
> gap symbols (currently -.*?=~), so if anyone wanted they could write 
> to any AlignIO-implemented format (MSF, Clustal, etc).  In your 
> Bio::Assembly::IO::simpleout module implement write_assembly() and use 
> the Contig::get_aln() method where needed to grab the SimpleAlign, 
> then simply substitute gap symbols with spaces when writing contig 
> output.
>
> In general, any new code is attached to a bugzilla report as an 
> enhancement request:
>
> http://bugzilla.open-bio.org/
>
> One of the devs will work on getting the code incorporated into 
> bioperl.  Make sure the code is documented 
> (http://www.bioperl.org/wiki/Advanced_BioPerl), and attach appropriate 
> tests (http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests) and 
> test data.
>
> chris
>
>
Thanks Chris for the pointers, I will be looking into these things.
Florent


From hlapp at gmx.net  Fri Aug 24 14:25:57 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 24 Aug 2007 14:25:57 -0400
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
	module/package name
In-Reply-To: <A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
	<46CEAD83.2050904@arcor.de>
	<9824900.1187973171940.JavaMail.ngmail@webmail17>
	<A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
Message-ID: <BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>


On Aug 24, 2007, at 1:20 PM, Chris Fields wrote:

>>> ...
>> The problem I see with Bio::Expression is that resequencing chips
>> do not belong with expression chips.
>> (Expression chips are designed to hybridize RNA strands and hence
>> measure RNA expression levels; a resequencing chip, on the other hand,
>> is based on DNA, and its design and probe length are quite
>> different.) So, from my point of view, it makes sense to distinguish
>> between DNA and RNA chips, at least.
>
> Then maybe the more generic Bio::Microarray namespace is the way to
> go, with the module name Bio::Microarray::Tools::MitoChip.  Other
> tools can be added as needed.
>

Makes sense to me too. Presumably, regardless of DNA or RNA being  
hybridized or length of probes, the data that comes out of them is  
quite similar in a general nature (namely hybridization signals)?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From marian.thieme at arcor.de  Fri Aug 24 12:32:51 2007
From: marian.thieme at arcor.de (marian.thieme at arcor.de)
Date: Fri, 24 Aug 2007 18:32:51 +0200 (CEST)
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
 module/package name
In-Reply-To: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
	<46CEAD83.2050904@arcor.de>
Message-ID: <9824900.1187973171940.JavaMail.ngmail@webmail17>

> The new package: How about Bio::Expression::Tools::MitoChip?  My  
> reasoning: I don't think it's necessary to define a new  
> Bio::Resequencing namespace for just one module; my inclination is  
> towards using Bio::Expression namespace as Bio::Tools have been  
> traditionally reserved for output parsers.  I am unsure what the  
> Bio::Expression status is (very little is documented, no tests are  
> written, nothing on the mail list archives); maybe Allen can answer  
> that?  I don't see anything that precludes you from using that  
> namespace as long as your tools are fairly well-defined (they are)  
> and have tests (they do).

The problem I see with Bio::Expression is that resequencing chips do not
belong with expression chips.
(Expression chips are designed to hybridize RNA strands and hence measure
RNA expression levels; a resequencing chip, on the other hand, is based on
DNA, and its design and probe length are quite different.)
So, from my point of view, it makes sense to distinguish between DNA and
RNA chips, at least.

> 
> Also, your module deals with doing one specific thing (extraction and  
> incorporation of information about redundant fragments) for the Affy  
> MitoChip.  It might be worth genericizing the class a bit so that you  
> can add new parser or analysis methods w/o having to define new  
> classes to deal with the same Mitochip data.

OK, need to think about that.

> 
> Mail list: The mail list subscription page
> (http://bioperl.org/mailman/listinfo/bioperl-l) allows you to subscribe
> or change subscription options (at the bottom of the page).
> 
cleared

> Bugzilla: if you are logged into Bugzilla under your old email, there  
> is an option at the bottom of the page (Edit : Prefs) where you can  
> change your email address and other preferences.
> 
Unfortunately I don't receive a mail to confirm the change. I tried that twice.


Marian

> On Aug 24, 2007, at 5:05 AM, Marian Thieme wrote:
> 
> > Hi,
> >
> > 2 questions about the naming of the module I did submit
> > (see http://bugzilla.open-bio.org/show_bug.cgi?id=2332)
> >
> > 1.) The package:
> > because there exists already an expression package I suggest to  
> > create a
> > new package called resequencing
> >
> > 2.) I would suggest that the module is called RedundantFragments or
> > AdditionalFragments
> >
> > so we would have something like:
> >
> > Bio::Resequencing::AdditionalFragments
> >
> > Any other ideas ?
> >
> > Marian
> >
> > By the way, can anybody change my email address to
> > marian.thieme at arcor.de
> > in Bugzilla as well as on the BioPerl list, please? I didn't manage
> > to do that on my own...
> >
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Aug 24 17:12:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 16:12:25 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <4F5FD173-FC80-4F70-B294-83DA58FDCE64@uiuc.edu>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
	<CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>
	<4F5FD173-FC80-4F70-B294-83DA58FDCE64@uiuc.edu>
Message-ID: <ABED5057-CFB5-4AAA-9D23-B6A069575BF6@uiuc.edu>

Okay, I have started a new branch in cvs (tagged featann_rollback).   
I'll start looking through everything within the next few days to get  
a general idea of what needs to be done.  All I know is the changes  
were extensive and included modifications to tests.

If anyone has comments I have added a wiki page here:

http://www.bioperl.org/wiki/Feature_Annotation_rollback

chris

On Aug 24, 2007, at 11:23 AM, Chris Fields wrote:

> On Aug 23, 2007, at 10:34 PM, Hilmar Lapp wrote:
>
>> On Aug 23, 2007, at 12:53 AM, Chris Fields wrote:
>>
>>> There has been little work done over the last 2 1/2 years to undo or
>>> rectify problems associated with those additions; I feel like those
>>> of us still routinely contributing have been left holding the bag.
>>
>> Not by intention, but unfortunately that's probably a fair
>> assessment. (And I'm one of those guilty of inaction.)
>
> Not completely.  You, Jason, Chris M., and several others expressed
> yourselves quite clearly (move the code to a branch and test).  I
> think that everyone was trying to be diplomatic about it and so never
> attempted to do anything except get it working correctly.
>
>>> [...]
>>> I would like to suggest the radical idea of rolling back
>>> AnnotatableI/
>>> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
>>> are simple scalars) and possibly work in implementing Ewan's
>>> SeqFeature::TypedSeqFeatureI for those who want strong data types
>>> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).
>>
>> I fully support this; to me that sounds exactly like the way to go.
>
> Okay, I'll probably go ahead and get a branch started today.  I'll
> have to look at Ewan's interface in more detail; it's possible a new
> SeqFeature implementation will need to be written up to incorporate  
> it.
>
>>> The various AnnotatableI changes, odd inheritance, and operator
>>> overloading have
>>> really obfuscated the code to the point where no one wants to touch
>>> it in case it breaks something important.  However, I believe it is
>>> the one serious impediment to a new stable release.
>>
>> Yes, I think you're hitting the nail on the head.
>>
>> Chris, if you take the lead on this and carry it through we will
>> all owe you hugely. I'm not sure how many beers that would compare
>> to, but I'll throw in some. (Who else do I owe beer? I'm losing
>> track. Strangely nobody tried to redeem beer from me in Vienna.
>> Maybe in Toronto?)
>>
>> Seriously, rectifying this problem would lift a huge weight.
>>
>> 	-hilmar
>
> It would be nice to get regular releases started again.  I think
> this'll help.
>
> chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From marian at arcor.de  Fri Aug 24 14:48:20 2007
From: marian at arcor.de (marian)
Date: Fri, 24 Aug 2007 20:48:20 +0200
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
 module/package name
In-Reply-To: <BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>	<46CEAD83.2050904@arcor.de>	<9824900.1187973171940.JavaMail.ngmail@webmail17>	<A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
	<BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>
Message-ID: <46CF27F4.8030608@arcor.de>

Hilmar Lapp wrote:
> On Aug 24, 2007, at 1:20 PM, Chris Fields wrote:
>
>   
>>>> ...
>>>>         
>>> The problem I see with Bio::Expression is that resequencing chips
>>> do not belong with expression chips.
>>> (Expression chips are designed to hybridize RNA strands and hence
>>> measure RNA expression levels; a resequencing chip, on the other hand,
>>> is based on DNA, and its design and probe length are quite
>>> different.) So, from my point of view, it makes sense to distinguish
>>> between DNA and RNA chips, at least.
>>>
>> Then maybe the more generic Bio::Microarray namespace is the way to
>> go, with the module name Bio::Microarray::Tools::MitoChip.  Other
>> tools can be added as needed.
>>
>>     
>
> Makes sense to me too. Presumably, regardless of DNA or RNA being  
> hybridized or length of probes, the data that comes out of them is  
> quite similar in a general nature (namely hybridization signals)?
>
> 	-hilmar
>   

Bio::Microarray::Tools::MitoChip would be OK with me. I merely meant that
it isn't an expression chip, and you also won't/can't analyze expression
data with the tool I am talking about.

Marian


From cjfields at uiuc.edu  Fri Aug 24 18:36:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 17:36:46 -0500
Subject: [Bioperl-l] undef SeqFeature tag values
Message-ID: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>

One thing I am noticing with the rollback to tags as strings is that
tags with an undefined value are not set; I'm assuming that when tags
were Bio::AnnotationI they were instantiated regardless, with an undef
value.  When attempting to get the values of such an unset tag with
get_tag_values() I get:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: asking for tag value that does not exist signalPeptideLength
STACK: Error::throw
STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/bioperl-live/blib/lib/Bio/Root/Root.pm:357
STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
STACK: t/targetp.t:189
-----------------------------------------------------------

I personally think of this as a feature (why set a tag at all if it  
is undef?).  However, are there any circumstances where we might want  
this behavior?  Do we want to simply return w/o a value if a tag name  
isn't found (i.e. remove the exception)?

chris


From hlapp at gmx.net  Fri Aug 24 19:02:43 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 24 Aug 2007 19:02:43 -0400
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
Message-ID: <7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>

You're supposed to call has_tag() first before you can assume that  
you can call get_tag_values() w/o an exception. That was the original  
API.
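
A minimal sketch of that calling pattern. ToyFeature below is a
hypothetical stand-in written for this example, not
Bio::SeqFeature::Generic; it only mimics the contract described here
(has_tag() reports whether a tag is set, get_tag_values() throws when it
is not). The tag_values_or_empty() helper name is likewise an assumption.

```perl
use strict;
use warnings;

# ToyFeature: hypothetical stand-in that mimics only the tag contract
# discussed in this thread. It is NOT the real BioPerl class.
package ToyFeature;

sub new {
    my ($class) = @_;
    return bless { tags => {} }, $class;
}

sub add_tag_value {
    my ($self, $tag, $value) = @_;
    push @{ $self->{tags}{$tag} }, $value;
    return;
}

sub has_tag {
    my ($self, $tag) = @_;
    return exists $self->{tags}{$tag} ? 1 : 0;
}

sub get_tag_values {
    my ($self, $tag) = @_;
    die "asking for tag value that does not exist $tag\n"
        unless $self->has_tag($tag);
    return @{ $self->{tags}{$tag} };
}

package main;

# Guarded lookup: check has_tag() first, so a missing tag yields an
# empty list instead of an exception.
sub tag_values_or_empty {
    my ($feat, $tag) = @_;
    return $feat->has_tag($tag) ? $feat->get_tag_values($tag) : ();
}

my $feat = ToyFeature->new;
$feat->add_tag_value(location => 'mitochondrion');

my @loc = tag_values_or_empty($feat, 'location');
my @len = tag_values_or_empty($feat, 'signalPeptideLength');
print "location: @loc\n";
print "signalPeptideLength values found: ", scalar(@len), "\n";
```

With the guard in the caller, the exception can stay in get_tag_values()
and still catch genuine mistakes such as a misspelled tag name.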

	-hilmar

On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:

> One thing I am noticing with the rollback to tag as strings is that
> tags with an undefined value are not set; I'm assuming when tags were
> Bio::AnnotationI they were instantiated regardless with an undef
> value.  When attempting to call an undef tag with get_tag_values() I
> get:
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: asking for tag value that does not exist signalPeptideLength
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
> bioperl-live/blib/lib/Bio/Root/Root.pm:357
> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
> STACK: t/targetp.t:189
> -----------------------------------------------------------
>
> I personally think of this as a feature (why set a tag at all if it
> is undef?).  However, are there any circumstances where we might want
> this behavior?  Do we want to simply return w/o a value if a tag name
> isn't found (i.e. remove the exception)?
>
> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Aug 25 00:05:58 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 23:05:58 -0500
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
	<7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
Message-ID: <6392DF1D-D91B-4B6E-812B-38FC0EA0D234@uiuc.edu>

Makes sense.  Okay, I'll leave the exception in.  Thanks!

chris

On Aug 24, 2007, at 6:02 PM, Hilmar Lapp wrote:

> You're supposed to call has_tag() first before you can assume that
> you can call get_tag_values() w/o an exception. That was the original
> API.
>
> 	-hilmar
>
> On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:
>
>> One thing I am noticing with the rollback to tag as strings is that
>> tags with an undefined value are not set; I'm assuming when tags were
>> Bio::AnnotationI they were instantiated regardless with an undef
>> value.  When attempting to call an undef tag with get_tag_values() I
>> get:
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: asking for tag value that does not exist signalPeptideLength
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
>> bioperl-live/blib/lib/Bio/Root/Root.pm:357
>> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
>> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
>> STACK: t/targetp.t:189
>> -----------------------------------------------------------
>>
>> I personally think of this as a feature (why set a tag at all if it
>> is undef?).  However, are there any circumstances where we might want
>> this behavior?  Do we want to simply return w/o a value if a tag name
>> isn't found (i.e. remove the exception)?
>>
>> chris
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Sat Aug 25 03:50:29 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 25 Aug 2007 08:50:29 +0100
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
	<7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
Message-ID: <46CFDF45.8030200@sheffield.ac.uk>

This sort of highlights a comment I made previously about how do you
test for a stable API?

It seems to me that unless you have intricate knowledge about the
changes that took place, you will find it difficult to know when an API
change has occurred. Is it possible to run the 1.4 test suite against
existing code to ensure tests pass? What if the 1.4 tests contained
bugs? This approach would need good code coverage by the tests to ensure
things work the same i.e. test code in HEAD against the test suite from
the previous stable release's branch - would/should this work
conceptually?

Nath

Hilmar Lapp wrote:
> You're supposed to call has_tag() first before you can assume that  
> you can call get_tag_values() w/o an exception. That was the original  
> API.
>
> 	-hilmar
>
> On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:
>
>   
>> One thing I am noticing with the rollback to tag as strings is that
>> tags with an undefined value are not set; I'm assuming when tags were
>> Bio::AnnotationI they were instantiated regardless with an undef
>> value.  When attempting to call an undef tag with get_tag_values() I
>> get:
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: asking for tag value that does not exist signalPeptideLength
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
>> bioperl-live/blib/lib/Bio/Root/Root.pm:357
>> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
>> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
>> STACK: t/targetp.t:189
>> -----------------------------------------------------------
>>
>> I personally think of this as a feature (why set a tag at all if it
>> is undef?).  However, are there any circumstances where we might want
>> this behavior?  Do we want to simply return w/o a value if a tag name
>> isn't found (i.e. remove the exception)?
>>
>> chris


From cjfields at uiuc.edu  Sat Aug 25 10:36:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 25 Aug 2007 09:36:08 -0500
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <46CFDF45.8030200@sheffield.ac.uk>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
	<7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
	<46CFDF45.8030200@sheffield.ac.uk>
Message-ID: <3F3C311E-3CD5-436B-987F-FD7695904647@uiuc.edu>

The rollback branch is off of HEAD, not 1.4, so any bugs fixed since  
then and any modules/tests added will be present.  So far everything  
has worked relatively well; you can check the history of this page to  
track what has happened so far:

http://www.bioperl.org/wiki/Feature_Annotation_rollback

The only problem code remaining for the first round of changes is a  
single method in Bio::SeqFeature::Annotated (if the tests are to be  
trusted) and one test in Bio::SeqFeature::AnnotationAdaptor using  
Hilmar's original test suite.  Most of those were tests breaking the
Feature/Annotation API outlined in the HOWTO (calling get_Annotations
directly from a Bio::SeqI or Bio::SeqFeatureI for instance), or
examples where has_tag() was not used.  I agree good test coverage
would probably help catch some of those still silently lingering in
code, but I don't think it can find everything; that's the reason I
indicated there will need to be extensive testing.  That applies within the
suite but also by users in the wild.
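
For anyone unsure what "using has_tag()" looks like in practice, here is a
minimal sketch of the guard. My::Feature is a hypothetical stand-in that only
mimics the has_tag()/get_tag_values() contract discussed in this thread (it is
not the Bio::SeqFeature::Generic code), and tag_values_or_empty() is an
invented helper name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a SeqFeatureI implementation; it only mimics
# the has_tag()/get_tag_values() contract discussed in this thread.
package My::Feature;

sub new {
    my ($class, %tags) = @_;
    return bless { tags => { %tags } }, $class;
}

sub has_tag {
    my ($self, $tag) = @_;
    return exists $self->{tags}{$tag};
}

# Throws when the tag was never set, like the exception quoted in the thread.
sub get_tag_values {
    my ($self, $tag) = @_;
    die "asking for tag value that does not exist $tag\n"
        unless $self->has_tag($tag);
    return @{ $self->{tags}{$tag} };
}

package main;

# Return the tag values, or an empty list when the tag was never set.
sub tag_values_or_empty {
    my ($feat, $tag) = @_;
    return $feat->has_tag($tag) ? $feat->get_tag_values($tag) : ();
}

my $feat    = My::Feature->new(note => ['predicted']);
my @notes   = tag_values_or_empty($feat, 'note');
my @lengths = tag_values_or_empty($feat, 'signalPeptideLength');

printf "note: %d value(s), signalPeptideLength: %d value(s)\n",
    scalar @notes, scalar @lengths;
```

Calling get_tag_values() directly on the missing tag would die instead, which
is the behavior the failing tests ran into.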

The SeqFeatureI and AnnotatableI API is defined very specifically in  
the Feature/Annotation HOWTO, so if anything the introduced changes  
violated much of that and started a domino effect of users  
unknowingly violating the API (me among them).  Also, just b/c a test  
passes doesn't mean it is the ->correct<- result; it is very easy to  
just throw something from Data::Dumper into an is() test and have it  
pass.  As an example, it appears there was a bit of cheating going on  
with AnnotationAdaptor.t in particular, where expected numbers were  
changed to conform to results w/o explanation.  Which is the correct  
answer?  I trust Hilmar's original test suite over the (rushed) changes.

chris

On Aug 25, 2007, at 2:50 AM, Nathan S. Haigh wrote:

> This sort of highlights a comment I made previously about how do you
> test for a stable API?
>
> It seems to me that unless you have intricate knowledge about the
> changes that took place, you will find it difficult to know when an  
> API
> change has occurred. Is it possible to run the 1.4 test suite against
> existing code to ensure tests pass? What if the 1.4 tests contained
> bugs? This approach would need good code coverage by the tests to  
> ensure
> things work the same i.e. test code in HEAD against the test suite  
> from
> the previous stable release's branch - would/should this work
> conceptually?
>
> Nath
>
> Hilmar Lapp wrote:
>> You're supposed to call has_tag() first before you can assume that
>> you can call get_tag_values() w/o an exception. That was the original
>> API.
>>
>> 	-hilmar
>>
>> On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:
>>
>>
>>> One thing I am noticing with the rollback to tag as strings is that
>>> tags with an undefined value are not set; I'm assuming when tags  
>>> were
>>> Bio::AnnotationI they were instantiated regardless with an undef
>>> value.  When attempting to call an undef tag with get_tag_values() I
>>> get:
>>>
>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>> MSG: asking for tag value that does not exist signalPeptideLength
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
>>> bioperl-live/blib/lib/Bio/Root/Root.pm:357
>>> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
>>> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
>>> STACK: t/targetp.t:189
>>> -----------------------------------------------------------
>>>
>>> I personally think of this as a feature (why set a tag at all if it
>>> is undef?).  However, are there any circumstances where we might  
>>> want
>>> this behavior?  Do we want to simply return w/o a value if a tag  
>>> name
>>> isn't found (i.e. remove the exception)?
>>>
>>> chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sat Aug 25 18:12:49 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 25 Aug 2007 17:12:49 -0500
Subject: [Bioperl-l] Feature/Annotation rollback(update)
Message-ID: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>

I have finished rolling back most of the specific changes made prior  
to the 1.5 release and have relevant tests passing:

http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round

Operator overloading of Bio::Annotation objects will be trickier to  
debug as tons of tests fail when the overloading is removed:

http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round

I'll start looking into fixes.  I don't like overloads from a  
personal standpoint (problems w/ long-term code maintenance), but was  
there a more specific reason for removing them?

chris


From hlapp at gmx.net  Sun Aug 26 00:58:46 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 26 Aug 2007 00:58:46 -0400
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
Message-ID: <3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>

The reason was to provide for backward compatibility with the  
original API in which tag values were scalars, not objects. The idea  
was that if someone relied on that and treats the object as a scalar  
(comparison, printing, etc), the operator overloading would take care  
of that.

So by going back to the original API the overloading should become  
obsolete, at least theoretically.

The overloading can cause some very subtle issues that I pointed out  
in an earlier email. It's one of those really "clever" tricks that  
just make it very hard for newcomers to understand what's going on in  
their code.
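
To make the "treats the object as a scalar" point concrete, here is a small
sketch. My::TagValue is a hypothetical class written only for illustration; it
is not how the Bio::Annotation classes are implemented, just the same general
technique (overloading stringification and 'eq').

```perl
use strict;
use warnings;

# Hypothetical annotation-like class (not a BioPerl module) that overloads
# stringification and 'eq' so the object can pass for a plain string.
package My::TagValue;
use overload
    '""' => sub { $_[0]{value} },
    'eq' => sub { "$_[0]" eq "$_[1]" },
    fallback => 1;

sub new {
    my ($class, $value) = @_;
    return bless { value => $value }, $class;
}

package main;

my $obj = My::TagValue->new('TSC0000030');

# Looks like a scalar in comparisons and printing...
print "equal\n" if $obj eq 'TSC0000030';
print "value: $obj\n";

# ...but it is still a reference, so code that checks ref() or
# serializes the value sees an object rather than a string.
print "ref: ", ref($obj), "\n";
```

Code written against the object keeps working only as long as the overloads
are in place, which is why removing them surfaces so many failures at once.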

	-hilmar

On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:

> I have finished rolling back most of the specific changes made prior
> to the 1.5 release and have relevant tests passing:
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>
> Operator overloading of Bio::Annotation objects will be trickier to
> debug as tons of tests fail when the overloading is removed:
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>
> I'll start looking into fixes.  I don't like overloads from a
> personal standpoint (problems w/ long-term code maintenance), but was
> there a more specific reason for removing them?
>
> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From n.haigh at sheffield.ac.uk  Sun Aug 26 03:35:36 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 26 Aug 2007 08:35:36 +0100
Subject: [Bioperl-l] please some help
In-Reply-To: <20070823233044.BJQ45014@mailstore2.fiu.edu>
References: <20070823233044.BJQ45014@mailstore2.fiu.edu>
Message-ID: <46D12D48.8080301@sheffield.ac.uk>

mcons004 at fiu.edu wrote:
>   Hello,
>      I am new to this software and I am having some trouble starting. The version of Bioperl I am working on is v5.8.6. My OS is Unix (Mac OS X). I am trying to use Bioperl with a file called blastParser to process a file which is the output of a "blastall" operation.
>   
>  The code that gives me error is:
>> perl blastParser.pl junk.out 1 1 1.0
>  and the error message says:
> Can't locate Bio/SearchIO.pm in @INC (@INC contains: /System/Library/Perl/5.8.6/darwin-thread-multi-2level
> 
>  You online info says I probably means that the module Bio::SearchIO.pm is not instaled and I can either install Bundle::Bioperl or install that specific module by hand. Could you give me some tips in this? I am new working with Unix, and Bioperl so I am a little confused. Any information will be helpful for me. Thanks

 From what you have said, it appears you need some basic info to 
understand what you are trying to achieve.

The Perl programming language requires the Perl interpreter in order to 
execute a Perl script. The Perl interpreter is usually installed as 
standard with Unix/Linux based Operating Systems. The version you 
mention (5.8.6) will not be the version of Bioperl but the version of 
the Perl interpreter you have installed - you can check this by typing 
"perl -v" at a command prompt.

Given your apparent lack of understanding of the Unix OS, it is likely 
that you don't have Bioperl installed. You should have a look at:
http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink
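
A quick way to confirm whether a module is visible to your Perl is a check
along these lines. This is only a sketch; module_available() is a helper name
made up for this example, and the script just reports whether Bio::SearchIO
can be loaded from the directories in @INC.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded from the current @INC, else 0.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Bio::SearchIO -> Bio/SearchIO.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(Bio::SearchIO)) {
    printf "%-15s %s\n", $module,
        module_available($module) ? 'found' : 'NOT found in @INC';
}
```

If it reports "NOT found", Bioperl is either not installed or installed
somewhere your Perl does not search.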

Nath


From cjfields at uiuc.edu  Sun Aug 26 15:22:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 26 Aug 2007 14:22:24 -0500
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
Message-ID: <B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>

I managed to find your comments (as well as ones from Ewan, Jason,  
and a few others) on the mail list archives, so I'll link to them.   
The problem will be fixing the several places where overloading is  
assumed but no longer exists (i.e. in write_* methods), but we can  
probably pinpoint those by throwing or warning when overloading is  
assumed.

My thought is to either modify as_text() or add a new display_text()  
method to all AnnotationI that explicitly does what the overloading  
implied (print the annotation in a specified or assumed way).  We  
could then delegate to that in the stringification overload (with  
appropriate deprecation warnings) until 1.6, where we remove it  
completely.  Something like:

my $link1 = Bio::Annotation::DBLink->new(-database => 'TSC',
                                         -primary_id => 'TSC0000030',
                                         -tagname => 'tag2');

# either
print $link1->display_text(),"\n";
# or ...
print $link1->as_text("display"),"\n";
# prints "TSC:TSC0000030"

# default human readable
print $link1->as_text(),"\n";
# prints "Direct database link to TSC0000030 in database TSC"

print "$link1\n";
# gets a deprecation warning for now, removed completely for 1.6
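
On the implementation side, the delegation could look roughly like the sketch
below. My::DBLink is a hypothetical stand-in, not the real
Bio::Annotation::DBLink; display_text() is the method proposed above and does
not exist yet, and the warning text is only a placeholder.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a DBLink-like annotation; display_text() is
# the method proposed in this thread, not an existing BioPerl method.
package My::DBLink;
use overload
    '""' => sub {
        my ($self) = @_;
        warn "stringification of " . ref($self)
           . " is deprecated, call display_text() instead\n";
        return $self->display_text;
    },
    fallback => 1;

sub new {
    my ($class, %args) = @_;
    my $self = {
        database   => $args{-database},
        primary_id => $args{-primary_id},
    };
    return bless $self, $class;
}

# Compact form, e.g. "TSC:TSC0000030"
sub display_text {
    my ($self) = @_;
    return join ':', $self->{database}, $self->{primary_id};
}

package main;

my $link = My::DBLink->new(-database => 'TSC', -primary_id => 'TSC0000030');
print $link->display_text, "\n";   # explicit call, no warning
print "$link\n";                   # same text, plus a warning on STDERR
```

Once callers have moved to the explicit method, dropping the overload is a
matter of deleting the "use overload" block.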

chris

On Aug 25, 2007, at 11:58 PM, Hilmar Lapp wrote:

> The reason was to provide for backward compatibility with the  
> original API in which tag values were scalars, not objects. The  
> idea was that if someone relied on that and treats the object as a  
> scalar (comparison, printing, etc), the operator overloading would  
> take care of that.
>
> So by going back to the original API the overloading should become  
> obsolete, at least theoretically.
>
> The overloading can cause some very subtle issues that I pointed  
> out in an earlier email. It's one of those really "clever" tricks  
> that just make it very hard for newcomers to understand what's  
> going on in their code.
>
> 	-hilmar
>
> On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:
>
>> I have finished rolling back most of the specific changes made prior
>> to the 1.5 release and have relevant tests passing:
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>>
>> Operator overloading of Bio::Annotation objects will be trickier to
>> debug as tons of tests fail when the overloading is removed:
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>>
>> I'll start looking into fixes.  I don't like overloads from a
>> personal standpoint (problems w/ long-term code maintenance), but was
>> there a more specific reason for removing them?
>>
>> chris
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Sun Aug 26 16:57:37 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 26 Aug 2007 16:57:37 -0400
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
	<B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
Message-ID: <503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>

The thing that I actually never quite understood (and predates the  
API changes) is why $ann->as_text() needs to include explanatory text  
such as 'Direct database link to blah in database foo.' I would have  
said that "TSC:TSC0000030" is human readable enough, unless you  
present it without any context so that one would have no clue that it  
is a database cross-reference.

The as_text() method shouldn't be meant for the sole purpose of  
debugging annotation collections. However, I'm not sure what else
you could use it for, given that there are no guidelines for what to
expect.

In fact, I do use as_text() a lot for a real purpose, namely as a  
surrogate unique key. For example, making a collection of dblinks  
unique is quite simple using the as_text() method:

	my %dbhash = map { ($_->as_text(), $_) }
	             $anncoll->remove_Annotations('dblink');
	$anncoll->add_Annotation('dblink',$_) foreach (values %dbhash);

This is a common task when harvesting annotation from various places  
and then integrating it. However, there is nothing in the API  
documentation that suggests that this might be a reliable or even  
expected property such that you could omit the 'dblink' tag above.
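
If one did not want to depend on as_text() for this, the same idea can be
sketched with an explicit key. Plain hashes stand in for DBLink objects here,
unique_key() is a hypothetical helper, and the second accession is made-up
example data.

```perl
use strict;
use warnings;

# Plain hashes stand in for DBLink annotations; unique_key() is a
# hypothetical helper, used instead of relying on as_text() output.
sub unique_key {
    my ($link) = @_;
    return join ':', $link->{database}, $link->{primary_id};
}

my @links = (
    { database => 'TSC',     primary_id => 'TSC0000030' },
    { database => 'GenBank', primary_id => 'AB000001'   },   # made-up accession
    { database => 'TSC',     primary_id => 'TSC0000030' },   # duplicate
);

# Keep the first occurrence of each key and preserve the input order.
my %seen;
my @unique = grep { !$seen{ unique_key($_) }++ } @links;

print scalar(@unique), " unique links\n";
print unique_key($_), "\n" for @unique;
```

Unlike the hash-based one-liner, this keeps the first duplicate rather than
the last and does not shuffle the order.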

I agree that having a conceptual equivalent to $feature->display_name  
and $seq->display_id would be good, but these methods have no claim  
to returning something that's unique in any way.

I guess I've now raised more questions than I answered (in fact I  
didn't answer any). Sorry 'bout that.

	-hilmar

On Aug 26, 2007, at 3:22 PM, Chris Fields wrote:

> I managed to find your comments (as well as ones from Ewan, Jason,  
> and a few others) on the mail list archives, so I'll link to them.   
> The problem will be fixing the several places where overloading is  
> assumed but no longer exists (i.e. in write_* methods), but we can  
> probably pinpoint those by throwing or warning when overloading is  
> assumed.
>
> My thought is to either modify as_text() or add a new display_text()
> method to all AnnotationI that explicitly does what the  
> overloading implied (print the annotation in a specified or assumed  
> way).  We could then delegate to that in the stringification  
> overload (with appropriate deprecation warnings) until 1.6, where  
> we remove it completely.  Something like:
>
> my $link1 = Bio::Annotation::DBLink->new(-database => 'TSC',
>                                         -primary_id => 'TSC0000030',
>                                         -tagname => 'tag2');
>
> # either
> print $link1->display_text(),"\n";
> # or ...
> print $link1->as_text("display"),"\n";
> # prints "TSC:TSC0000030"
>
> # default human readable
> print $link1->as_text(),"\n";
> # prints "Direct database link to TSC0000030 in database TSC"
>
> print "$link1\n";
> # gets a deprecation warning for now, removed completely for 1.6
>
> chris
>
> On Aug 25, 2007, at 11:58 PM, Hilmar Lapp wrote:
>
>> The reason was to provide for backward compatibility with the  
>> original API in which tag values were scalars, not objects. The  
>> idea was that if someone relied on that and treats the object as a  
>> scalar (comparison, printing, etc), the operator overloading would  
>> take care of that.
>>
>> So by going back to the original API the overloading should become  
>> obsolete, at least theoretically.
>>
>> The overloading can cause some very subtle issues that I pointed  
>> out in an earlier email. It's one of those really "clever" tricks  
>> that just make it very hard for newcomers to understand what's  
>> going on in their code.
>>
>> 	-hilmar
>>
>> On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:
>>
>>> I have finished rolling back most of the specific changes made prior
>>> to the 1.5 release and have relevant tests passing:
>>>
>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>>>
>>> Operator overloading of Bio::Annotation objects will be trickier to
>>> debug as tons of tests fail when the overloading is removed:
>>>
>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>>>
>>> I'll start looking into fixes.  I don't like overloads from a
>>> personal standpoint (problems w/ long-term code maintenance), but  
>>> was
>>> there a more specific reason for removing them?
>>>
>>> chris
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sun Aug 26 18:47:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 26 Aug 2007 17:47:41 -0500
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
	<B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
	<503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>
Message-ID: <E0A389DE-3399-4439-9AC2-76319CCD5B84@uiuc.edu>

Either way I implement, it would be used simply as a generic  
convenience method to replicate output via stringification  
overloading, using a common method name for all AnnotationI; there  
seem to be several instances where this is used for generating output  
(i.e. SeqIO::genbank).  So, for instance, when formatting output you  
could just call as_text('display') or display_text() and you would  
get the most common formatting for that particular annotation type.

chris

On Aug 26, 2007, at 3:57 PM, Hilmar Lapp wrote:

> The thing that I actually never quite understood (and predates the  
> API changes) is why $ann->as_text() needs to include explanatory  
> text such as 'Direct database link to blah in database foo.' I  
> would have said that "TSC:TSC0000030" is human readable enough,  
> unless you present it without any context so that one would have no  
> clue that it is a database cross-reference.
>
> The as_text() method shouldn't be meant for the sole purpose of  
> debugging annotation collections. However, I'm not sure for what  
> else you could use it for, given that there are no guidelines for  
> what to expect.
>
> In fact, I do use as_text() a lot for a real purpose, namely as a  
> surrogate unique key. For example, making a collection of dblinks  
> unique is quite simple using the as_text() method:
>
> 	my %dbhash = map { ($_->as_text(), $_) }
> 	             $anncoll->remove_Annotations('dblink');
> 	$anncoll->add_Annotation('dblink',$_) foreach (values %dbhash);
>
> This is a common task when harvesting annotation from various  
> places and then integrating it. However, there is nothing in the  
> API documentation that suggests that this might be a reliable or  
> even expected property such that you could omit the 'dblink' tag  
> above.
>
> I agree that having a conceptual equivalent to
> $feature->display_name and $seq->display_id would be good, but these methods  
> have no claim to returning something that's unique in any way.
>
> I guess I've now raised more questions than I answered (in fact I  
> didn't answer any). Sorry 'bout that.
>
> 	-hilmar
>
> On Aug 26, 2007, at 3:22 PM, Chris Fields wrote:
>
>> I managed to find your comments (as well as ones from Ewan, Jason,  
>> and a few others) on the mail list archives, so I'll link to  
>> them.  The problem will be fixing the several places where  
>> overloading is assumed but no longer exists (i.e. in write_*  
>> methods), but we can probably pinpoint those by throwing or  
>> warning when overloading is assumed.
>>
>> My thought is to either modify as_text() or add a new display_text()
>> method to all AnnotationI that explicitly does what the  
>> overloading implied (print the annotation in a specified or  
>> assumed way).  We could then delegate to that in the  
>> stringification overload (with appropriate deprecation warnings)  
>> until 1.6, where we remove it completely.  Something like:
>>
>> my $link1 = Bio::Annotation::DBLink->new(-database => 'TSC',
>>                                         -primary_id => 'TSC0000030',
>>                                         -tagname => 'tag2');
>>
>> # either
>> print $link1->display_text(),"\n";
>> # or ...
>> print $link1->as_text("display"),"\n";
>> # prints "TSC:TSC0000030"
>>
>> # default human readable
>> print $link1->as_text(),"\n";
>> # prints "Direct database link to TSC0000030 in database TSC"
>>
>> print "$link1\n";
>> # gets a deprecation warning for now, removed completely for 1.6
>>
>> chris
>>
>> On Aug 25, 2007, at 11:58 PM, Hilmar Lapp wrote:
>>
>>> The reason was to provide for backward compatibility with the  
>>> original API in which tag values were scalars, not objects. The  
>>> idea was that if someone relied on that and treats the object as  
>>> a scalar (comparison, printing, etc), the operator overloading  
>>> would take care of that.
>>>
>>> So by going back to the original API the overloading should  
>>> become obsolete, at least theoretically.
>>>
>>> The overloading can cause some very subtle issues that I pointed  
>>> out in an earlier email. It's one of those really "clever" tricks  
>>> that just make it very hard for newcomers to understand what's  
>>> going on in their code.
>>>
>>> 	-hilmar
>>>
>>> On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:
>>>
>>>> I have finished rolling back most of the specific changes made  
>>>> prior
>>>> to the 1.5 release and have relevant tests passing:
>>>>
>>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>>>>
>>>> Operator overloading of Bio::Annotation objects will be trickier to
>>>> debug as tons of tests fail when the overloading is removed:
>>>>
>>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>>>>
>>>> I'll start looking into fixes.  I don't like overloads from a
>>>> personal standpoint (problems w/ long-term code maintenance),  
>>>> but was
>>>> there a more specific reason for removing them?
>>>>
>>>> chris
>>>
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Sun Aug 26 19:01:03 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 26 Aug 2007 19:01:03 -0400
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <E0A389DE-3399-4439-9AC2-76319CCD5B84@uiuc.edu>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
	<B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
	<503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>
	<E0A389DE-3399-4439-9AC2-76319CCD5B84@uiuc.edu>
Message-ID: <35BBCF3B-BA1B-4C8D-8753-2A27AB3B423C@gmx.net>


On Aug 26, 2007, at 6:47 PM, Chris Fields wrote:

> just call as_text('display') or display_text()

The latter is more obvious, and can be better tested for presence and  
implementation, though in the world of perl that's of course not  
strictly true.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From zeroliu at 163.com  Mon Aug 27 07:49:53 2007
From: zeroliu at 163.com (zeroliu)
Date: Mon, 27 Aug 2007 19:49:53 +0800 (CST)
Subject: [Bioperl-l] Problems of parse emboss water result by Bio::AlignIO
Message-ID: <534546299.525411188215393753.JavaMail.coremail@bj163app118.163.com>

Hello,
I'm trying to parse water (EMBOSS 5.0.0) results with Bio::AlignIO
(Bioperl-1.4) and have run into some problems.
1. What does Bio::AlignIO->next_aln() return?
Does it return a Bio::Align::AlignI or a Bio::SimpleAlign object?
Or does it depend on the alignment file format?
2. How can I get the "score" property of a water alignment result?
There is a score method in Bio::SimpleAlign but not in Bio::AlignIO.
In 2004, Jason mentioned:
Scores are set by the Alignment parser - we separate the 'running' from
the 'parsing'.
Bio::AlignIO::emboss had to be updated.
(http://article.gmane.org/gmane.comp.lang.perl.bio.general/7156/match=alignio+water)
How could I know it?
Thank you very much!  


From cjfields at uiuc.edu  Mon Aug 27 13:13:13 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 12:13:13 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Annotated status
Message-ID: <6DC5ECA8-3DF1-4B84-914C-4F2B3B44E29A@uiuc.edu>

What is the current status on maintenance of  
Bio::SeqFeature::Annotated?  From what I gather (based on the code  
and past mail list posts) the intent of the module seems to be to  
store any SeqFeature-specific data (tags, score, source, primary_tag,  
etc) in a Bio::AnnotationCollectionI as strongly typed data.  However  
there are several inconsistencies, such as objects being returned  
when a string is expected (score(), source()).

Also, several methods appear half-implemented, aren't consistent with
the SeqFeatureI API or similar methods in other SeqFeatureI
implementations, and there are no docs explaining what is expected.

If no one speaks up on it, I'll do my best with maintaining it
myself, but don't expect the API to stay as it is.

chris


From cjfields at uiuc.edu  Mon Aug 27 18:31:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 17:31:01 -0500
Subject: [Bioperl-l] Bio::Ontology::Term (rollback question)
Message-ID: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>

This is related to the ongoing Feature/Annotation rollback.  I have  
found that a few Ontology-related modules are (either directly or  
indirectly) passing strings instead of Bio::Annotation::DBLinks to  
Bio::Ontology::Term::new(), add_dblink(), or add_dblink_context()  
(the last is where the error occurs).

If needed we could allow strings to be passed but this isn't  
consistent with the API.  Any thoughts on what to do here?

chris


From hlapp at gmx.net  Mon Aug 27 19:07:12 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 27 Aug 2007 19:07:12 -0400
Subject: [Bioperl-l] Bio::Ontology::Term (rollback question)
In-Reply-To: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>
References: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>
Message-ID: <01A56BFB-DE36-4C95-9BD3-DB35A706BD87@gmx.net>

The B::O::TermI interface actually says that get_dblinks() would  
return scalars. That's why the add_dblink methods accept strings. I  
also agree that this is inconsistent with the rest of BioPerl.

Oddly enough, Term::add_dblink_context() does ask for DBLink objects,  
though it doesn't seem to be enforced, even though  
Term::get_dblink_context() is advertised as returning scalars.

So it does seem this is messed up design-wise. It seems to me that to  
really fix this would inevitably break the API, and I don't see how  
you would make this backwards compatible w/o creating a lot of messy  
code, the sole purpose of which would be backwards compatibility.

One could only fix Term::add_dblink_context() as it's not in the  
interface but that wouldn't contribute anything to improving  
consistency.

So the alternative to breaking the API in a non-backwards compatible  
fashion would be to add to it, map the existing dblink methods onto  
the added ones, and start deprecating them. For example, you could  
add methods get_dbxrefs() (also on the interface), add_dbxref(),  
etc., and build in a context argument so we don't need another set  
of methods for that. They would accept and return DBLink objects, and  
the get_dblink() methods could be changed to map those to scalars  
while also getting slated for deprecation.
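
A rough sketch of what I mean, to make it concrete. My::Term is a hypothetical
class, not Bio::Ontology::Term; the method names follow the suggestion above,
while the internals, the '_default' context name, and the "db:id" scalar form
are all assumptions made for illustration. Plain hashes stand in for DBLink
objects.

```perl
use strict;
use warnings;

# Hypothetical term class; method names follow the proposal in this thread,
# the internals are invented. Plain hashes stand in for DBLink objects.
package My::Term;

sub new {
    my ($class) = @_;
    return bless { dbxrefs => {} }, $class;
}

# New-style API: stores objects, grouped by an optional context.
sub add_dbxref {
    my ($self, $dbxref, $context) = @_;
    $context = '_default' unless defined $context;
    push @{ $self->{dbxrefs}{$context} }, $dbxref;
    return;
}

sub get_dbxrefs {
    my ($self, $context) = @_;
    $context = '_default' unless defined $context;
    return @{ $self->{dbxrefs}{$context} || [] };
}

# Old-style API: keeps returning scalars, but warns and delegates.
sub get_dblinks {
    my ($self) = @_;
    warn "get_dblinks() is deprecated, use get_dbxrefs()\n";
    return map { join ':', $_->{database}, $_->{primary_id} }
           $self->get_dbxrefs;
}

package main;

my $term = My::Term->new;
$term->add_dbxref({ database => 'TSC', primary_id => 'TSC0000030' });
print join(',', $term->get_dblinks), "\n";
```

Existing callers keep getting scalars from the old method, and the mapping
code lives in one place until the deprecated methods are removed.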

Does this make sense?

	-hilmar

On Aug 27, 2007, at 6:31 PM, Chris Fields wrote:

> This is related to the ongoing Feature/Annotation rollback.  I have
> found that a few Ontology-related modules are (either directly or
> indirectly) passing strings instead of Bio::Annotation::DBLinks to
> Bio::Ontology::Term::new(), add_dblink(), or add_dblink_context()
> (the last is where the error occurs).
>
> If needed we could allow strings to be passed but this isn't
> consistent with the API.  Any thoughts on what to do here?
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Aug 27 21:12:35 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 20:12:35 -0500
Subject: [Bioperl-l] Bio::Ontology::Term (rollback question)
In-Reply-To: <01A56BFB-DE36-4C95-9BD3-DB35A706BD87@gmx.net>
References: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>
	<01A56BFB-DE36-4C95-9BD3-DB35A706BD87@gmx.net>
Message-ID: <EF121F1E-BAA0-49BD-830F-1F3BC6FAC807@uiuc.edu>


On Aug 27, 2007, at 6:07 PM, Hilmar Lapp wrote:

> The B::O::TermI interface actually says that get_dblinks() would  
> return scalars. That's why the add_dblink methods accept strings. I  
> also agree that this is inconsistent with the rest of BioPerl.
>
> Oddly enough, Term::add_dblink_context() does ask for DBLink  
> objects, though it doesn't seem to be enforced, even though  
> Term::get_dblink_context() is advertised as returning scalars.

This happened b/c of stringification and 'eq' overloading.  Just  
removing the overloads didn't reveal this problem; I had to add  
exceptions to them to fish this out.
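
To illustrate the effect described above, here is a small self-contained
sketch. The Sketch::Link and Sketch::StrictLink packages are hypothetical and
are not the actual BioPerl annotation classes; the first shows how overloads
let an object pass for a string, the second shows overloads that throw:

```perl
use strict;
use warnings;

# Hypothetical class whose overloads let an object pass for a plain string.
package Sketch::Link;
use overload
    '""'     => sub { $_[0]{id} },
    'eq'     => sub { "$_[0]" eq "$_[1]" },
    fallback => 1;

sub new {
    my ($class, $id) = @_;
    return bless { id => $id }, $class;
}

# Same data, but the overloads throw, so hidden uses become visible.
# (With no overloads at all, the object would quietly stringify to
# something like "Sketch::Link=HASH(0x...)" and comparisons would
# fail without any error being raised.)
package Sketch::StrictLink;
use overload
    '""'     => sub { die "stringification of " . ref($_[0]) . " is not supported\n" },
    'eq'     => sub { die "'eq' on " . ref($_[0]) . " is not supported\n" },
    fallback => 1;

sub new {
    my ($class, $id) = @_;
    return bless { id => $id }, $class;
}

package main;

my $link = Sketch::Link->new('GO:0005623');
# The caller cannot tell whether it was handed an object or a string.
print "matched\n" if $link eq 'GO:0005623';

my $strict = Sketch::StrictLink->new('GO:0005623');
my $ok = eval { my $text = "$strict"; 1 };
print "caught: $@" unless $ok;
```
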

> So it does seem this is messed up design-wise. It seems to me that  
> to really fix this would inevitably break the API, and I don't see  
> how you would make this backwards compatible w/o creating a lot of  
> messy code, the sole purpose of which would be backwards  
> compatibility.
>
> One could only fix Term::add_dblink_context() as it's not in the  
> interface but that wouldn't contribute anything to improving  
> consistency.

Agreed; in fact it may make it more confusing.

> So the alternative to breaking the API in a non-backwards  
> compatible fashion would be to add to it, map the existing dblink  
> methods onto the added ones, and start deprecating them. For  
> example, you could add methods get_dbxrefs() (also on the  
> interface), add_dbxref(), etc., and build in a context argument so  
> we don't need another set of methods for that. They would accept  
> and return DBLink objects, and the get_dblink() methods could be  
> changed to map those to scalars while also getting slated for  
> deprecation.
>
> Does this make sense?
>
> 	-hilmar

I think so; I'll have to look over the code to see how we would  
implement this, though I'm guessing everything would be stored as  
DBLink objects by default.  Any changes will probably need to wait  
until after I fish out any remaining spots in the code where  
overloading is being used, but at least we have a direction on where  
to go.

chris


From cjfields at uiuc.edu  Tue Aug 28 00:18:19 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 23:18:19 -0500
Subject: [Bioperl-l] Feature/Annotation rollback (update #2)
Message-ID: <A91DD20B-841B-480A-A953-E811AD634AF0@uiuc.edu>

Okay, the planned rollback is pretty much complete with a few  
exceptions.  I'll probably merge back to bioperl-live within the next  
few days once the following issues are addressed:

1)  Bio::Ontology::Term - several classes are using  
Bio::Ontology::Term in ways inconsistent with one another; some are  
passing Bio::Annotation::DBLink instances and others are passing  
simple strings.  This was largely hidden by various operator  
overloads but the inconsistencies have now come to the surface.  I'll  
probably work on Hilmar's suggestion of adding extra class methods to  
give it a more consistent interface and deprecate the older ones.  As  
one might guess this affects much of Bio::Ontology but also  
Bio::SeqFeature::Annotated; strangely enough FeatureIO tests pass  
(which may simply mean there isn't enough test coverage for FeatureIO).

2)  Bio::SeqFeature::Annotated - no word back on maintenance for this  
module.  It needs to implement Bio::SeqFeature::TypedSeqFeatureI  
(pretty easy) and needs documentation (not so easy).  It's apparently  
essential for FeatureIO.  I'll basically get it up-and-running and  
clean up the API.

There are a few odds and ends that need to be addressed with  
roundtripping, but these are already problems on the MAIN trunk so  
they will be addressed once code is merged back in.

chris


From Frigerio at pierroton.inra.fr  Tue Aug 28 03:12:22 2007
From: Frigerio at pierroton.inra.fr (Jean-Marc FRIGERIO)
Date: Tue, 28 Aug 2007 09:12:22 +0200
Subject: [Bioperl-l] Bio::SeqIO::phd_comment objet
Message-ID: <200708280912.22798.Frigerio@pierroton.inra.fr>

Hi,

The Bio::SeqIO::phd module says, speaking about the COMMENT section of a phd 
file:

  # this should be an actual object to assist in serialization
  # but I don't have time for this now.

The doc says ( http://www.bioperl.org/wiki/Core_1.5.1_1.5.2_delta)

   This really needs a "phred_comments" object of some sort so that it will be 
serializable. Then when java clients get this object they will be able to 
deserialize it. 

I volunteer to do this,  but need your opinion.

Do we really need an object (Bio::phd_comment? Bio::SeqIO::phd_comment?
Bio::phd_header? something else?)

Or would adding a few Bio::Seq::SeqWithQuality subs to the Bio::SeqIO::phd
module suffice? What constraints do the Java clients place on
serialization/deserialization?
I was thinking of just adding getters/setters for all the comments:
chromat_file(), abi_thumbprint(), etc.

touch() for the timestamp
attribute() for new unknown comments
write_comment().

others ?
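
A rough sketch of what such a comment object could look like. The package name
Sketch::PhdComment is hypothetical, the list of known keys is only a sample,
and the output order (alphabetical) is a simplification rather than the order
phred itself writes:

```perl
use strict;
use warnings;

# Hypothetical container for the COMMENT section of a phd file.
package Sketch::PhdComment;

# Generate one get/set accessor per known comment key.
for my $field (qw(chromat_file abi_thumbprint phred_version)) {
    no strict 'refs';
    *{ __PACKAGE__ . "::$field" } = sub {
        my $self = shift;
        $self->{ uc $field } = shift if @_;
        return $self->{ uc $field };
    };
}

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Generic accessor for comment keys the class does not know about.
sub attribute {
    my ($self, $key, @value) = @_;
    $self->{ uc $key } = $value[0] if @value;
    return $self->{ uc $key };
}

# Refresh the TIME comment with the current local time.
sub touch {
    my ($self) = @_;
    return $self->{TIME} = scalar localtime;
}

# Serialize the comments between BEGIN_COMMENT and END_COMMENT markers.
sub write_comment {
    my ($self) = @_;
    my $body = join '', map { "$_: $self->{$_}\n" } sort keys %$self;
    return "BEGIN_COMMENT\n\n$body\nEND_COMMENT\n";
}

package main;

my $comment = Sketch::PhdComment->new;
$comment->chromat_file('example.ab1');
$comment->attribute(dye => 'big');
$comment->touch;
print $comment->write_comment;
```
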

		-- jmf

-- 
Jean-Marc Frigerio,
UMR BIOGECO   69, route d'Arcachon, 33612 CESTAS France
Tel : +33(0) 557 122 829   Fax : +33(0) 557 122 881
Frigerio at pierroton.inra.fr   http://www.pierroton.inra.fr/biogeco/index.html


From jay at jays.net  Tue Aug 28 07:14:37 2007
From: jay at jays.net (Jay Hannah)
Date: Tue, 28 Aug 2007 06:14:37 -0500
Subject: [Bioperl-l] Problems of parse emboss water result by
	Bio::AlignIO
In-Reply-To: <534546299.525411188215393753.JavaMail.coremail@bj163app118.163.com>
References: <534546299.525411188215393753.JavaMail.coremail@bj163app118.163.com>
Message-ID: <4CD8B5C2-3C87-495C-894E-17C3C67091DA@jays.net>

On Aug 27, 2007, at 6:49 AM, zeroliu wrote:
> I'm trying to parse water (EMBOSS 5.0.0) result by Bio::AlignIO
> (Bioperl-1.4) and encountered some problems.
> 1. What does the Bio::AlignIO->next_aln() return?
> Does it return a Bio::Align::AlignI or Bio::SimpleAlign object?
> Or it depends on the alignment file format?

http://doc.bioperl.org/bioperl-live/Bio/AlignIO.html
  Title   : next_aln
  Usage   : $aln = $stream->next_aln
  Function: reads the next $aln object from the stream
  Returns : a Bio::Align::AlignI compliant object

> 2. How can I get the "score" property in a water alignment result?
> There is a score method in Bio::SimpleAlign but not in Bio::AlignIO.
> In 2004, Jason mentioned:
> Scores are set by the Alignment parser - we separate the 'running' from
> the 'parsing'.
> Bio::AlignIO::emboss had to be updated.
> (http://article.gmane.org/gmane.comp.lang.perl.bio.general/7156/match=alignio+water)
> How could I know it?

Line 480 of t/AlignIO.t seems to walk you through? Here's the block,  
with the test overhead removed.

use Bio::AlignIO;

# EMBOSS water
my $str = Bio::AlignIO->new(-format => 'emboss',
                            -file   => 'cysprot.water');
my $aln = $str->next_aln();
# $aln is now a Bio::Align::AlignI-compliant object
print $aln->score, "\n";    # '501.50'

HTH,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From cjfields at uiuc.edu  Tue Aug 28 17:05:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 28 Aug 2007 16:05:10 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
Message-ID: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>

I'm now wrapping up the Feature/Annotation rollback.  I will probably  
start merging back to the main branch in the next day or two, as  
soon as interested parties (*cough*devs*cough*) look over the last  
batch of changes.

http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round

I have also added a small benchmark test which indicates a decrease  
in parsing time in SeqIO::genbank with all tests passing.  I expect  
this will translate over to any Bio::SeqFeature::Generic-using class  
(open mouth, prepare to insert foot....).

It is also possible there are still some instances where overloading  
is expected lurking about in the ~1000 or so modules, so I'll leave  
the exceptions I added to all Bio::AnnotationI; we can remove them  
down the line, maybe prior to rel1.6, after more tests are added or  
if they get particularly annoying.  My guess is I caught 99.99% of  
them (prepare to insert other foot....).

The key change in this last round is the addition of several class  
*dbxref* methods to Bio::Ontology::Term and  
Bio::Annotation::OntologyTerm, all of which are capable of working  
with either DBLink instances or simple scalars.  This was primarily  
done in order to clear up inconsistencies in the older *dblink*  
methods, which were ambiguous (some indicate simple scalar  
arguments, others DBLink objects); operator overloading was used  
extensively in these cases, which led to several issues.  I have  
added deprecation warnings to the older methods which now map to  
using the newer methods.  All tests pass with the exception of a few  
already failing on the MAIN branch; the single test which needs to be  
fixed is a round-tripping error in swiss.t (now a TODO), which can be  
fixed after merging back.

Please respond to this if there are any questions or if I need to  
clarify the changes I made a bit more.

chris


From hlapp at duke.edu  Tue Aug 28 18:13:32 2007
From: hlapp at duke.edu (Hilmar Lapp)
Date: Tue, 28 Aug 2007 18:13:32 -0400
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
References: <20070828070219.DE03668527@evol.biology.mcmaster.ca>
Message-ID: <1F006707-291C-4895-A178-33FDFBDE6AE6@duke.edu>

Is anyone thinking about adding support for this as an aligner  
option? I'm not sure whether aside from a Bio::Tools::Run module we'd  
also need a format parser - it sounds like it's emitting clustalw  
format?

	-hilmar

Begin forwarded message:

> From: evoldir at evol.biology.mcmaster.ca
> Date: August 28, 2007 3:02:19 AM EDT
> To: hlapp at duke.edu
> Subject: Other:  Announcing Ngila 1.2.1 Alignment Program
> Reply-To: racartwr at ncsu.edu
>
>
> Ngila is a global, pairwise alignment program that uses logarithmic and
> affine gap costs, i.e. C(g) = a+b*g+c*ln(g).  These gap costs are more
> biologically realistic than the more popular (and efficient) affine gap
> cost model.
>
> I have recently completed updating the program to version 1.2.1.  The
> new version includes two new, evolutionary alignment models based on my
> current research.  These models allow you to find the maximum alignment
> of two sequences based on biological, evolutionary parameters---no more
> guessing at biological costs.  Additional changes are noted on the
> website.
>
> Website & Manual:
>
> http://scit.us/projects/ngila/
>
> Windows Binary:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release-win32.zip
>
> Unix/Mac Source Code:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release.tar.gz
>
> I'll be happy to answer any questions users have about the new  
> models or
> the program.
>
> -- 
> *********************************************************
> Reed A. Cartwright, PhD     http://scit.us/
> Postdoctoral Researcher     http://www.dererumnatura.us/
> Department of Genetics      http://www.pandasthumb.org/
>
> Bioinformatics Research Center
> North Carolina State University
> Campus Box 7566
> Raleigh, NC 27695-7566
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:- hlapp at duke dot edu :
===========================================================


From hlapp at duke.edu  Tue Aug 28 18:13:32 2007
From: hlapp at duke.edu (Hilmar Lapp)
Date: Tue, 28 Aug 2007 18:13:32 -0400
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
Message-ID: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>

Is anyone thinking about adding support for this as an aligner  
option? I'm not sure whether aside from a Bio::Tools::Run module we'd  
also need a format parser - it sounds like it's emitting clustalw  
format?

	-hilmar

Begin forwarded message:

> From: evoldir at evol.biology.mcmaster.ca
> Date: August 28, 2007 3:02:19 AM EDT
> Subject: Other:  Announcing Ngila 1.2.1 Alignment Program
> Reply-To: racartwr at ncsu.edu
>
>
> Ngila is a global, pairwise alignment program that uses logarithmic and
> affine gap costs, i.e. C(g) = a+b*g+c*ln(g).  These gap costs are more
> biologically realistic than the more popular (and efficient) affine gap
> cost model.
>
> I have recently completed updating the program to version 1.2.1.  The
> new version includes two new, evolutionary alignment models based on my
> current research.  These models allow you to find the maximum alignment
> of two sequences based on biological, evolutionary parameters---no more
> guessing at biological costs.  Additional changes are noted on the
> website.
>
> Website & Manual:
>
> http://scit.us/projects/ngila/
>
> Windows Binary:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release-win32.zip
>
> Unix/Mac Source Code:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release.tar.gz
>
> I'll be happy to answer any questions users have about the new  
> models or
> the program.
>
> -- 
> *********************************************************
> Reed A. Cartwright, PhD     http://scit.us/
> Postdoctoral Researcher     http://www.dererumnatura.us/
> Department of Genetics      http://www.pandasthumb.org/
>
> Bioinformatics Research Center
> North Carolina State University
> Campus Box 7566
> Raleigh, NC 27695-7566
>


From hlapp at gmx.net  Tue Aug 28 19:09:46 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 28 Aug 2007 19:09:46 -0400
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
In-Reply-To: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
References: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
Message-ID: <EF683AC3-F30C-49BC-9F16-7BA10C70F751@gmx.net>

Sorry for the double post, BTW. I had erroneously assumed that the  
first email would be held for post by non-member. -hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Aug 29 00:01:13 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 28 Aug 2007 23:01:13 -0500
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
In-Reply-To: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
References: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
Message-ID: <EDED724C-3219-45FF-BAF2-592EEEBCB634@uiuc.edu>

It probably wouldn't be hard to write one up, particularly if it's  
got an already parsable format.  We could probably base it off the  
current clustalw wrapper unless someone else thinks there is a better  
way.

chris

On Aug 28, 2007, at 5:13 PM, Hilmar Lapp wrote:

> Is anyone thinking about adding support for this as an aligner
> option? I'm not sure whether aside from a Bio::Tools::Run module we'd
> also need a format parser - it sounds like it's emitting clustalw
> format?
>
> 	-hilmar
>
> Begin forwarded message:
>
>> From: evoldir at evol.biology.mcmaster.ca
>> Date: August 28, 2007 3:02:19 AM EDT
>> Subject: Other:  Announcing Ngila 1.2.1 Alignment Program
>> Reply-To: racartwr at ncsu.edu
>>
>>
>> Ngila is a global, pairwise alignment program that uses logarithmic and
>> affine gap costs, i.e. C(g) = a+b*g+c*ln(g).  These gap costs are more
>> biologically realistic than the more popular (and efficient) affine gap
>> cost model.
>>
>> I have recently completed updating the program to version 1.2.1.  The
>> new version includes two new, evolutionary alignment models based on my
>> current research.  These models allow you to find the maximum alignment
>> of two sequences based on biological, evolutionary parameters---no more
>> guessing at biological costs.  Additional changes are noted on the
>> website.
>>
>> Website & Manual:
>>
>> http://scit.us/projects/ngila/
>>
>> Windows Binary:
>>
>> http://scit.us/projects/files/ngila/Releases/ngila-release-win32.zip
>>
>> Unix/Mac Source Code:
>>
>> http://scit.us/projects/files/ngila/Releases/ngila-release.tar.gz
>>
>> I'll be happy to answer any questions users have about the new
>> models or
>> the program.
>>
>> -- 
>> *********************************************************
>> Reed A. Cartwright, PhD     http://scit.us/
>> Postdoctoral Researcher     http://www.dererumnatura.us/
>> Department of Genetics      http://www.pandasthumb.org/
>>
>> Bioinformatics Research Center
>> North Carolina State University
>> Campus Box 7566
>> Raleigh, NC 27695-7566
>>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Wed Aug 29 12:03:07 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 11:03:07 -0500
Subject: [Bioperl-l] remote SwissProt server problems
Message-ID: <6805F552-9947-4C28-B846-47B5501B31DF@uiuc.edu>

Just as a notice, DBFetch is currently retrieving only single records  
for the UniProtKB database (where Bio::DB::SwissProt fetches  
sequences).  If anyone runs remote server tests and DB.t in the test  
suite you'll see a failure towards the end which indicates this.   
I've posted a notice to the server help desk and will respond when I  
hear more.

chris


From cain.cshl at gmail.com  Wed Aug 29 15:45:48 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Wed, 29 Aug 2007 15:45:48 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
Message-ID: <1188416748.2567.36.camel@localhost.localdomain>

Hi Chris,

I just wanted to let you know that I was out of town for a few days, but
now I'm back and I'm doing testing of GMOD software based on the branch
you are working on.  I'll let you know how it goes, but don't let me
stop you if you are confident of your changes.  I'm sure whatever goes
wrong, it will just point out holes in the FeatureIO tests (I'm sure
there are plenty) and will require hopefully minimal changes on my end.

Thanks for your considerable efforts on this!  (Regardless of how much
work it makes for me :-)
Scott


On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
> I'm now wrapping up the Feature/Annotation rollback.  I will probably  
> start merging back to the main branch in the next day or two, as  
> soon as interested parties (*cough*devs*cough*) look over the last  
> batch of changes.
> 
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
> 
> I have also added a small benchmark test which indicates a decrease  
> in parsing time in SeqIO::genbank with all tests passing.  I expect  
> this will translate over to any Bio::SeqFeature::Generic-using class  
> (open mouth, prepare to insert foot....).
> 
> It is also possible there are still some instances where overloading  
> is expected lurking about in the ~1000 or so modules, so I'll leave  
> the exceptions I added to all Bio::AnnotationI; we can remove them  
> down the line, maybe prior to rel1.6, after more tests are added or  
> if they get particularly annoying.  My guess is I caught 99.99% of  
> them (prepare to insert other foot....).
> 
> The key change in this last round is the addition of several class  
> *dbxref* methods to Bio::Ontology::Term and  
> Bio::Annotation::OntologyTerm, all of which are capable of working  
> with either DBLink instances or simple scalars.  This was primarily  
> done in order to clear up inconsistencies in the older *dblink*  
> methods, which were ambiguous (some indicate simple scalar  
> arguments, others DBLink objects); operator overloading was used  
> extensively in these cases, which led to several issues.  I have  
> added deprecation warnings to the older methods which now map to  
> using the newer methods.  All tests pass with the exception of a few  
> already failing on the MAIN branch; the single test which needs to be  
> fixed is a round-tripping error in swiss.t (now a TODO), which can be  
> fixed after merging back.
> 
> Please respond to this if there are any questions or if I need to  
> clarify the changes I made a bit more.
> 
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From cjfields at uiuc.edu  Wed Aug 29 16:13:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 15:13:17 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188416748.2567.36.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
Message-ID: <8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>

I'll probably go ahead and start merging this stuff over to CVS HEAD  
then.  There haven't been any objections so far.

The page I posted outlines the more critical fixes, primarily the  
changes to Bio::Ontology::Term methods (along with relevant code) due  
to inconsistencies in the interface.  The Bio::Annotation classes  
also now throw if you attempt to use them in an overloaded context.   
I also split off SeqFeature::Annotated tests into its own test suite  
(SeqFeatAnnotated.t).

Let me know if there are any problems along the way!

chris

On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:

> Hi Chris,
>
> I just wanted to let you know that I was out of town for a few days, but
> now I'm back and I'm doing testing of GMOD software based on the branch
> you are working on.  I'll let you know how it goes, but don't let me
> stop you if you are confident of your changes.  I'm sure whatever goes
> wrong, it will just point out holes in the FeatureIO tests (I'm sure
> there are plenty) and will require hopefully minimal changes on my end.
>
> Thanks for your considerable efforts on this!  (Regardless of how much
> work it makes for me :-)
> Scott
>
>
> On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
>> I'm now wrapping up the Feature/Annotation rollback.  I will probably
>> start merging back to the main branch in the next day or two, as
>> soon as interested parties (*cough*devs*cough*) look over the last
>> batch of changes.
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>>
>> I have also added a small benchmark test which indicates a decrease
>> in parsing time in SeqIO::genbank with all tests passing.  I expect
>> this will translate over to any Bio::SeqFeature::Generic-using class
>> (open mouth, prepare to insert foot....).
>>
>> It is also possible there are still some instances where overloading
>> is expected lurking about in the ~1000 or so modules, so I'll leave
>> the exceptions I added to all Bio::AnnotationI; we can remove them
>> down the line, maybe prior to rel1.6, after more tests are added or
>> if they get particularly annoying.  My guess is I caught 99.99% of
>> them (prepare to insert other foot....).
>>
>> The key change in this last round is the addition of several class
>> *dbxref* methods to Bio::Ontology::Term and
>> Bio::Annotation::OntologyTerm, all of which are capable of working
>> with either DBLink instances or simple scalars.  This was primarily
>> done in order to clear up inconsistencies in the older *dblink*
>> methods, which were ambiguous (some indicate simple scalar
>> arguments, others DBLink objects); operator overloading was used
>> extensively in these cases, which led to several issues.  I have
>> added deprecation warnings to the older methods which now map to
>> using the newer methods.  All tests pass with the exception of a few
>> already failing on the MAIN branch; the single test which needs to be
>> fixed is a round-tripping error in swiss.t (now a TODO), which can be
>> fixed after merging back.
>>
>> Please respond to this if there are any questions or if I need to
>> clarify the changes I made a bit more.
>>
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                         cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jay at jays.net  Wed Aug 29 18:11:55 2007
From: jay at jays.net (Jay Hannah)
Date: Wed, 29 Aug 2007 17:11:55 -0500
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
Message-ID: <46D5EF2B.5000101@jays.net>

Please slap me if I'm hysterical.

I'm seeking a broad bioinformatics search engine platform. I want to 
take gobs of data in gobs of formats and allow people to search it on 
the web.

- Entrez is awesome. Unfortunately I don't see anything in the NCBI 
toolkit that helps me run my own version of it. Even a tiny one. After 
an initial "check out our toolkit" response from NCBI I don't seem to be 
getting anywhere. Maybe I'm not communicating enough or well enough.

- EB-eye Search is slick. I don't see any developer kit or source code 
of any kind and I've gotten no response to my emails to them.

- LuceGene is very cool. But it looks like no one has touched it in 2.5 
years and I've gotten no response from their contact email address. I'm 
especially intrigued by their

  src/LuceGene/src/org/eugenes/index/LuceneReadseqIndexer.java

which seems to use the rather popular(?) Java Readseq to populate Lucene 
with source data in all sorts of different formats.

I don't know Java.

- Solr is really neat. It's easy to install and gives a simple/powerful 
XML API to populate a Lucene index.

... so ...

I'm thinking BioPerl knows how to parse lots of formats into a Bio::Seq.

I'm thinking I could write Perl which would take a Bio::Seq object and 
convert it to an XML file which Solr would happily inject into Lucene 
for me.

If I could do that I'm thinking that any of the many formats that 
Bio::SeqIO can slurp could magically be sent into a Lucene index for 
searching.
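
A minimal sketch of the conversion step, assuming Solr's XML update message
layout (<add><doc><field name="...">). The field names id, description and
sequence are assumptions that would have to match the Solr schema in use, and
a plain hash stands in for the values one would pull from a Bio::Seq (for
example via display_id, desc and seq):

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
    my ($text) = @_;
    $text = '' unless defined $text;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Turn a list of records (hash refs of field name => value) into one
# Solr <add> message.
sub solr_add_xml {
    my (@records) = @_;
    my $xml = "<add>\n";
    for my $record (@records) {
        $xml .= "  <doc>\n";
        for my $name (sort keys %$record) {
            $xml .= sprintf qq{    <field name="%s">%s</field>\n},
                xml_escape($name), xml_escape($record->{$name});
        }
        $xml .= "  </doc>\n";
    }
    return $xml . "</add>\n";
}

# Hypothetical record; with BioPerl these values would come from a Bio::Seq.
my %record = (
    id          => 'ABC123',
    description => 'hypothetical protein <example>',
    sequence    => 'MKTAYIAKQR',
);
print solr_add_xml(\%record);
```

The resulting message would then be posted to the Solr update handler, and a
reader loop over Bio::SeqIO could feed one record per sequence.
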

I'm thinking that would be really cool and I'm going to write it.

Now's your chance to slap me.

Since I haven't started yet, what would I call this thing? 
Bio::SeqIO::Solr?  (and I wouldn't implement the I part?)

Thanks,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


More notes:
http://clab.ist.unomaha.edu/CLAB/index.php/RT11


From hlapp at gmx.net  Wed Aug 29 21:37:59 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 29 Aug 2007 21:37:59 -0400
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
In-Reply-To: <46D5EF2B.5000101@jays.net>
References: <46D5EF2B.5000101@jays.net>
Message-ID: <D202078D-8F88-4FAA-94EA-8C08CE653C41@gmx.net>


On Aug 29, 2007, at 6:11 PM, Jay Hannah wrote:

> [...]
>
> I'm thinking I could write Perl which would take a Bio::Seq object and
> convert it to an XML file which Solr would happily inject into Lucene
> for me.
>
> If I could do that I'm thinking that any of the many formats that
> Bio::SeqIO can slurp could magically be sent into a Lucene index for
> searching.
>
> [...]
> Since I haven't started yet, what would I call this thing?
> Bio::SeqIO::Solr?  (and I wouldn't implement the I part?)

Would this be a Solr-specific XML writer? Or could you use an  
existing XML format for sequences?

(as an aside, if you do need a Solr-specific format writer, my  
suggestion would be to name it solrxml [lowercase])

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Aug 29 22:01:45 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 21:01:45 -0500
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
In-Reply-To: <46D5EF2B.5000101@jays.net>
References: <46D5EF2B.5000101@jays.net>
Message-ID: <0FF63232-25DE-4676-8C06-B9B00BE28349@uiuc.edu>


On Aug 29, 2007, at 5:11 PM, Jay Hannah wrote:

> Please slap me if I'm hysterical.
>
> I'm seeking a broad bioinformatics search engine platform. I want to
> take gobs of data in gobs of formats and allow people to search it on
> the web.
>
> - Entrez is awesome. Unfortunately I don't see anything in the NCBI
> toolkit that helps me run my own version of it. Even a tiny one. After
> an initial "check out our toolkit" response from NCBI I don't seem to be
> getting anywhere. Maybe I'm not communicating enough or well enough.

No.  I have had non-responses before from NCBI; they may just be too  
busy.  Warnock probably applies.

> - EB-eye Search is slick. I don't see any developer kit or source code
> of any kind and I've gotten no response to my emails to them.

Not sure of this one personally.

> - LuceGene is very cool.
> ...
> I don't know Java.

...but you could write a (perl) wrapper around it.  You can try  
contacting Don Gilbert about it, though I think he's been trying out  
Chado.

> - Solr is really neat. It's easy to install and gives a simple/powerful
> XML API to populate a Lucene index.
> ... so ...
>
> I'm thinking BioPerl knows how to parse lots of formats into a Bio::Seq.
>
> ...
>
> I'm thinking that would be really cool and I'm going to write it.
>
> Now's your chance to slap me.

No need.

> Since I haven't started yet, what would I call this thing?
> Bio::SeqIO::Solr?  (and I wouldn't implement the I part?)
>
> Thanks,
>
> Jay Hannah
> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah
>
> More notes:
> http://clab.ist.unomaha.edu/CLAB/index.php/RT11

The way I would go about it is use an established XML schema as a  
starting point and implement a writer (if bioperl doesn't already  
support it).  It's better than reinventing (a constantly reinvented)  
wheel and starting up a brand-new schema of your own.  INSDSeq  
(http://www.insdc.org/page.php?page=xmlstatus) is one I've been  
wanting to add for a while but haven't had time to work on; there are  
several other examples.  Note that a few of the currently supported  
ones in bioperl, such as bsml and game, have had very little to no  
development over the years in favor of newer (better?) XML flavors,  
so it likely isn't worth working with those.
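
For the Solr side of Jay's idea, here is a minimal sketch of what a
writer would ultimately have to emit: Solr's XML update message
(<add><doc><field name="...">).  The field names (id, description,
sequence) and the sample record are assumptions made up for
illustration; in a real writer the values would presumably come from
Bio::Seq accessors such as accession_number(), desc() and seq(), and
the field names would have to match whatever the Solr schema defines.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Turn one record (field name => value) into a Solr <add> message.
# The field names are hypothetical and must match the Solr schema in use.
sub solr_add_xml {
    my (%record) = @_;
    my $xml = "<add>\n  <doc>\n";
    for my $name (sort keys %record) {
        $xml .= sprintf qq{    <field name="%s">%s</field>\n},
            xml_escape($name), xml_escape($record{$name});
    }
    $xml .= "  </doc>\n</add>\n";
    return $xml;
}

# Synthetic record standing in for values pulled from a sequence object.
my $xml = solr_add_xml(
    id          => 'AB000001',
    description => 'Example record <synthetic>',
    sequence    => 'ATGGCC',
);
print $xml;
```

The resulting string would be POSTed to the Solr update handler; an
established schema such as INSDSeq could still be used as the source
of the field list.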

chris


From hlapp at gmx.net  Wed Aug 29 22:02:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 29 Aug 2007 22:02:45 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
Message-ID: <E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>


On Aug 28, 2007, at 5:05 PM, Chris Fields wrote:

> I'm now wrapping up the Feature/Annotation rollback.  I will probably
> start merging back to the main branch in the next day or two, as
> soon as interested parties (*cough*devs*cough*) look over the last
> batch of changes.
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>
> [...]
> It is also possible there are still some instances where overloading
> is expected lurking about in the ~1000 or so modules, so I'll leave
> the exceptions I added to all Bio::AnnotationI

Keep in mind that code such as

	if ($ann) { ... }

is mostly not b/c someone wanted to use overloading, but rather  
someone was lazy and really meant to say

	if (defined($ann)) { ... }

In the absence of eq overloading, these will behave identically. So  
if you leave the exceptions in it is sort-of policing lazy  
programmers, which I guess is fine in principle, but is guaranteed to  
trip up a lot of script code. I'd take it out if you're reasonably  
sure that at least within BioPerl itself those lazy programming  
incidents are removed.
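
To make the concern concrete, here is a minimal sketch of how such an
exception trips up the shorthand test.  ToyAnnotation is a made-up
stand-in, not the actual Bio::AnnotationI code, and the assumption that
the exceptions fire in boolean as well as string context is mine.

```perl
use strict;
use warnings;

package ToyAnnotation;

# Throw whenever the object is used in boolean or string context,
# roughly what the temporary exceptions are described as doing.
use overload
    'bool' => sub { die "overloaded boolean context is no longer supported\n" },
    '""'   => sub { die "overloaded string context is no longer supported\n" };

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}

package main;

my $ann = ToyAnnotation->new(value => 'kinase');

# The explicit test never touches the overloaded operators.
my $explicit_ok = defined($ann) ? 1 : 0;

# The shorthand test triggers the 'bool' handler, which throws.
my $shorthand_ok = eval { $ann ? 1 : 0 };
my $error = $@;

print "defined(\$ann) gave: $explicit_ok\n";
print "if (\$ann) died with: $error" unless defined $shorthand_ok;
```

Any script that writes 'if ($ann)' against such a class dies at that
line, which is why leaving the exceptions in affects user code and not
just BioPerl itself.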

> [...]
> The key change in this last round is the addition of several class
> *dbxref* methods to Bio::Ontology::Term and
> Bio::Annotation::OntologyTerm, all of which are capable of working
> with either DBLink instances or simple scalars.

I don't think you need the code here to deal with both scalars and  
objects. It is fine I think to define the new methods from the outset  
to consistently accept and return DBLink objects, period.

The backwards compatibility logic should rather be in the *_dblink*()  
methods; i.e., instead of simple aliases they should have the code to  
map to and from the new API. That way, once the deprecation cycle  
ends, they can be removed, and with them all the legacy code that now  
is no longer needed, whereas if you have that in the new methods, it  
keeps bothering the maintainers.
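
A minimal sketch of that layering, using toy classes rather than the
real Bio::Ontology::Term code: the new methods deal only in link
objects, and the deprecated methods carry all of the scalar
conversion.  The class names, the plural get_dbxrefs()/get_dblinks()
names and the "DB:ID" scalar convention are assumptions for
illustration only.

```perl
use strict;
use warnings;

package ToyDBLink;

sub new {
    my ($class, %args) = @_;
    my $self = { database => $args{database}, primary_id => $args{primary_id} };
    return bless $self, $class;
}
sub database   { return $_[0]{database} }
sub primary_id { return $_[0]{primary_id} }

package ToyTerm;

use Scalar::Util qw(blessed);

sub new {
    my ($class) = @_;
    my $self = { dbxrefs => [] };
    return bless $self, $class;
}

# New API: accepts and returns ToyDBLink objects only.
sub add_dbxref {
    my ($self, @links) = @_;
    for my $link (@links) {
        die "add_dbxref() expects ToyDBLink objects\n"
            unless blessed($link) && $link->isa('ToyDBLink');
        push @{ $self->{dbxrefs} }, $link;
    }
    return;
}
sub get_dbxrefs { return @{ $_[0]{dbxrefs} } }

# Deprecated API: all legacy scalar handling lives here, so deleting
# these two methods later removes the compatibility code with them.
sub add_dblink {
    my ($self, @items) = @_;
    warn "add_dblink() is deprecated; use add_dbxref()\n";
    my @links;
    for my $item (@items) {
        if (blessed($item)) { push @links, $item; next; }
        my ($db, $id) = split /:/, $item, 2;
        push @links, ToyDBLink->new(database => $db, primary_id => $id);
    }
    return $self->add_dbxref(@links);
}
sub get_dblinks {
    my ($self) = @_;
    warn "get_dblinks() is deprecated; use get_dbxrefs()\n";
    return map { $_->database . ':' . $_->primary_id } $self->get_dbxrefs;
}

package main;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $term = ToyTerm->new;
$term->add_dbxref(ToyDBLink->new(database => 'GO', primary_id => '0008150'));
$term->add_dblink('EC:2.7.1.1');    # legacy scalar call

my @new = $term->get_dbxrefs;
my @old = $term->get_dblinks;
print scalar(@new), " link objects; legacy view: @old\n";
print "deprecation warnings issued: ", scalar(@warnings), "\n";
```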

You also mention an add_dbxref_context() on the wiki page - I'm not  
sure why that would be needed given that you build in the -context  
option to add_dbxref() from the outset. But maybe I've glossed over  
some detail.

Once this is merged back to the main trunk, I guess we need to give  
Bio::SeqFeature::TypedSeqFeatureI a thorough look and make sure it  
makes real sense.

Thanks Chris for this effort, this clears a monumental roadblock.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Aug 29 23:23:14 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 22:23:14 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
Message-ID: <A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>


On Aug 29, 2007, at 9:02 PM, Hilmar Lapp wrote:

>
> On Aug 28, 2007, at 5:05 PM, Chris Fields wrote:
>
>> I'm now wrapping up the Feature/Annotation rollback.  I will probably
>> start merging back to the main branch in the next day or two, as
>> soon as interested parties (*cough*devs*cough*) look over the last
>> batch of changes.
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>>
>> [...]
>> It is also possible there are still some instances where overloading
>> is expected lurking about in the ~1000 or so modules, so I'll leave
>> the exceptions I added to all Bio::AnnotationI
>
> Keep in mind that code such as
>
> 	if ($ann) { ... }
>
> is mostly not b/c someone wanted to use overloading, but rather
> someone was lazy and really meant to say
>
> 	if (defined($ann)) { ... }

Agreed.

> In the absence of eq overloading, these will behave identically. So
> if you leave the exceptions in it is sort-of policing lazy
> programmers, which I guess is fine in principle, but is guaranteed to
> trip up a lot of script code. I'd take it out if you're reasonably
> sure that at least within BioPerl itself those lazy programming
> incidents are removed.

I agree the overload exceptions shouldn't be left in.  The problem is  
I'm not certain we have caught most implicit overload calls (just the  
ones tested for).  Scott's checking everything against GMOD, though,  
so we can remove them after that.

>> [...]
>> The key change in this last round is the addition of several class
>> *dbxref* methods to Bio::Ontology::Term and
>> Bio::Annotation::OntologyTerm, all of which are capable of working
>> with either DBLink instances or simple scalars.
>
> I don't think you need the code here to deal with both scalars and
> objects. It is fine I think to define the new methods from the outset
> to consistently accept and return DBLink objects, period.
>
> The backwards compatibility logic should rather be in the *_dblink*()
> methods; i.e., instead of simple aliases they should have the code to
> map to and from the new API. That way, once the deprecation cycle
> ends, they can be removed, and with them all the legacy code that now
> is no longer needed, whereas if you have that in the new methods, it
> keeps bothering the maintainers.

That should be easy enough to fix and would be more consistent.  I  
can look over the various calls to dbxref methods and see what needs  
to be done, then fix that in cvs.

> You also mention an add_dbxref_context() on the wiki page - I'm not
> sure why that would be needed given that you build in the -context
> option to add_dbxref() from the outset. But maybe I've glossed over
> some detail.

The -context parameter was in get_dbxref(), to grab those DBLinks in  
a particular context.  We could do the same with add_dbxref() (pass  
DBLinks in first arg as array ref, context as second arg).  That  
would then obviate the need for add_dbxref_context().

I'll also change the parameter passing in get_dbxref() to just accept  
context as a single optional argument since we're dealing with only  
DBLink instances now.
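
Roughly what I have in mind for the calling convention, as a sketch
only: ToyTerm is a stand-in class, plain hash refs stand in for DBLink
instances, and the '_default' context name and the storage layout are
assumptions, not what is in cvs.

```perl
use strict;
use warnings;

package ToyTerm;

my $DEFAULT_CONTEXT = '_default';    # hypothetical fallback context name

sub new {
    my ($class) = @_;
    my $self = { dbxrefs => {} };    # context name => array ref of links
    return bless $self, $class;
}

# Links come first as an array ref, the context is an optional second argument.
sub add_dbxref {
    my ($self, $links, $context) = @_;
    $context = $DEFAULT_CONTEXT unless defined $context;
    push @{ $self->{dbxrefs}{$context} }, @{$links};
    return;
}

# A single optional argument: with a context, return only those links;
# without one, return the links from every context.
sub get_dbxref {
    my ($self, $context) = @_;
    return @{ $self->{dbxrefs}{$context} || [] } if defined $context;
    return map { @{ $self->{dbxrefs}{$_} } } sort keys %{ $self->{dbxrefs} };
}

package main;

my $term = ToyTerm->new;
$term->add_dbxref([ { db => 'GO', id => '0008150' } ]);
$term->add_dbxref(
    [ { db => 'EC', id => '2.7.1.1' }, { db => 'EC', id => '2.7.1.2' } ],
    'enzyme',
);

my @enzyme = $term->get_dbxref('enzyme');
my @all    = $term->get_dbxref;
print scalar(@enzyme), " of ", scalar(@all), " links are in the 'enzyme' context\n";
```

With the context folded into add_dbxref() this way, a separate
add_dbxref_context() has nothing left to do.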

> Once this is merged back to the main trunk, I guess we need to give
> Bio::SeqFeature::TypedSeqFeatureI a thorough look and make sure it
> makes real sense.

It describes one method, ontology_term(), which returns a
Bio::Ontology::TermI.  This is similar to
SeqFeature::Annotated::type(), which returns a
Bio::Annotation::OntologyTerm (a Bio::Ontology::TermI).  My thought
is to simply deprecate type() in favor of
TypedSeqFeatureI::ontology_term().

> Thanks Chris for this effort, this clears a monumental roadblock.
>
> 	-hilmar

No problem.  It just needed to be done.

chris


From florent.angly at gmail.com  Wed Aug 29 23:44:58 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 29 Aug 2007 20:44:58 -0700
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
Message-ID: <46D63D3A.6050308@gmail.com>

Hilmar Lapp wrote:
> Keep in mind that code such as
>
> 	if ($ann) { ... }
>
> is mostly not b/c someone wanted to use overloading, but rather  
> someone was lazy and really meant to say
>
> 	if (defined($ann)) { ... }
>
> In the absence of eq overloading, these will behave identically. So  
> if you leave the exceptions in it is sort-of policing lazy  
> programmers, which I guess is fine in principle, but is guaranteed to  
> trip up a lot of script code. I'd take it out if you're reasonably  
> sure that at least within BioPerl itself those lazy programming  
> incidents are removed.
	if ($ann) { ... }

and 

	if (defined($ann)) { ... }

are not the same.

	if ($ann)

is evaluated false for an empty string like

        $ann = '';

and for a value of zero, i.e.

	$ann = 0;

while

	defined($ann)

returns true in these 2 cases.
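
A short script that tabulates the difference for plain scalars (core
Perl only, nothing BioPerl-specific; the labels are just for the
printout):

```perl
use strict;
use warnings;

# Label/value pairs covering the cases where truth and definedness differ.
my @cases = (
    [ 'undef',        undef ],
    [ 'empty string', ''    ],
    [ 'zero',         0     ],
    [ 'string "0"',   '0'   ],
    [ 'string "0.0"', '0.0' ],
);

my %result;
for my $case (@cases) {
    my ($label, $value) = @{$case};
    my $is_true    = $value ? 1 : 0;             # what 'if ($ann)' tests
    my $is_defined = defined($value) ? 1 : 0;    # what 'if (defined($ann))' tests
    $result{$label} = [ $is_true, $is_defined ];
    printf "%-14s true=%d defined=%d\n", $label, $is_true, $is_defined;
}
```

Only undef is false under both tests; '', 0 and '0' are defined but
false, and '0.0' is both defined and true.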

Florent


From cjfields at uiuc.edu  Wed Aug 29 23:54:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 22:54:05 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <46D63D3A.6050308@gmail.com>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<46D63D3A.6050308@gmail.com>
Message-ID: <90C3DE31-12FD-4BF3-B9F7-0FB5E1DE2A28@uiuc.edu>


On Aug 29, 2007, at 10:44 PM, Florent Angly wrote:

> Hilmar Lapp wrote:
>> Keep in mind that code such as
>>
>> 	if ($ann) { ... }
>>
>> is mostly not b/c someone wanted to use overloading, but rather   
>> someone was lazy and really meant to say
>>
>> 	if (defined($ann)) { ... }
>>
>> In the absence of eq overloading, these will behave identically. So
>> if you leave the exceptions in it is sort-of policing lazy
>> programmers, which I guess is fine in principle, but is guaranteed to
>> trip up a lot of script code. I'd take it out if you're reasonably
>> sure that at least within BioPerl itself those lazy programming
>> incidents are removed.
> 	if ($ann) { ... }
>
> and
> 	if (defined($ann)) { ... }
>
> are not the same.
>
> 	if ($ann)
>
> is evaluated false for an empty string like
>
>        $ann = '';
>
> and for a value of zero, i.e.
>
> 	$ann = 0;
>
> while
>
> 	defined($ann)
>
> returns true in these 2 cases.
>
> Florent

I agree, but we're talking about the context in which this test is  
performed, where $ann is either an instance of a Bio::AnnotationI or  
undef (not a scalar value or '').  In this case it works both as 'if  
($ann)' or 'if (defined($ann))', though the latter is preferred.   
Never underestimate laziness!
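
A small sketch of that object-or-undef case.  ToyAnnotation and
find_annotation() are made up for illustration and deliberately have
no overloading: a blessed reference is always true, even when it wraps
a false value, so the two tests agree.

```perl
use strict;
use warnings;

package ToyAnnotation;

sub new {
    my ($class, $value) = @_;
    my $self = { value => $value };
    return bless $self, $class;
}

sub value { return $_[0]{value} }

package main;

# Hypothetical lookup that returns an object or undef, never a plain scalar.
sub find_annotation {
    my ($value) = @_;
    return defined $value ? ToyAnnotation->new($value) : undef;
}

my $found   = find_annotation(0);        # object wrapping a false value
my $missing = find_annotation(undef);    # nothing found

my @report;
for my $pair ([ found => $found ], [ missing => $missing ]) {
    my ($label, $ann) = @{$pair};
    push @report, sprintf "%s: true=%d defined=%d",
        $label, ($ann ? 1 : 0), (defined $ann ? 1 : 0);
}
print "$_\n" for @report;
```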

chris


From cain.cshl at gmail.com  Wed Aug 29 23:59:11 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Wed, 29 Aug 2007 23:59:11 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <46D63D3A.6050308@gmail.com>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<46D63D3A.6050308@gmail.com>
Message-ID: <1188446351.2567.55.camel@localhost.localdomain>

Hi Florent,

Of course what you wrote below is true, but what Hilmar was writing
about was lazy programmers (like me) who assume that the empty string
and 0 value cases aren't going to happen (because we happen to know they
never should in certain contexts), and so use 'if ($ann)'.  Of course,
at the moment, I am in the process of de-lazifying my code (though I
tended to think of it as being efficient :-)

Scott


On Wed, 2007-08-29 at 20:44 -0700, Florent Angly wrote:
> Hilmar Lapp wrote:
> > Keep in mind that code such as
> >
> > 	if ($ann) { ... }
> >
> > is mostly not b/c someone wanted to use overloading, but rather  
> > someone was lazy and really meant to say
> >
> > 	if (defined($ann)) { ... }
> >
> > In the absence of eq overloading, these will behave identically. So  
> > if you leave the exceptions in it is sort-of policing lazy  
> > programmers, which I guess is fine in principle, but is guaranteed to  
> > trip up a lot of script code. I'd take it out if you're reasonably  
> > sure that at least within BioPerl itself those lazy programming  
> > incidents are removed.
> 	if ($ann) { ... }
> 
> and 
> 
> 	if (defined($ann)) { ... }
> 
> are not the same.
> 
> 	if ($ann)
> 
> is evaluated false for an empty string like
> 
>         $ann = '';
> 
> and for a value of zero, i.e.
> 
> 	$ann = 0;
> 
> while
> 
> 	defined($ann)
> 
> returns true in these 2 cases.
> 
> Florent
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From cain.cshl at gmail.com  Thu Aug 30 00:05:06 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 00:05:06 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
Message-ID: <1188446706.2567.59.camel@localhost.localdomain>

Hi Chris,

Is there a reason that the value method of
Bio::Annotation::SimpleValue (and possibly some of its siblings) is
returning "Value: $value"?  It didn't used to have the "Value: " before,
did it?

Thanks,
Scott


On Wed, 2007-08-29 at 15:13 -0500, Chris Fields wrote:
> I'll probably go ahead and start merging this stuff over to CVS HEAD  
> then.  There haven't been any objections so far.
> 
> The page I posted outlines the more critical fixes, primarily the  
> changes to Bio::Ontology::Term methods (along with relevant code) due  
> to inconsistencies in the interface.  The Bio::Annotation classes  
> also now throw if you attempt to use them in an overloaded context.   
> I also split off SeqFeature::Annotated tests into its own test suite  
> (SeqFeatAnnotated.t).
> 
> Let me know if there are any problems along the way!
> 
> chris
> 
> On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:
> 
> > Hi Chris,
> >
> > I just wanted to let you know that I was out of town for a few days,
> > but now I'm back and I'm doing testing of GMOD software based on the
> > branch you are working on.  I'll let you know how it goes, but don't
> > let me stop you if you are confident of your changes.  I'm sure
> > whatever goes wrong, it will just point out holes in the FeatureIO
> > tests (I'm sure there are plenty) and will require hopefully minimal
> > changes on my end.
> >
> > Thanks for your considerable efforts on this!  (Regardless of how much
> > work it makes for me :-)
> > Scott
> >
> >
> > On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
> >> I'm now wrapping up the Feature/Annotation rollback.  I will probably
> >> start merging back to the main branch in the next day or two, as
> >> soon as interested parties (*cough*devs*cough*) look over the last
> >> batch of changes.
> >>
> >> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
> >>
> >> I have also added a small benchmark test which indicates a decrease
> >> in parsing time in SeqIO::genbank with all tests passing.  I expect
> >> this will translate over to any Bio::SeqFeature::Generic-using class
> >> (open mouth, prepare to insert foot....).
> >>
> >> It is also possible there are still some instances where overloading
> >> is expected lurking about in the ~1000 or so modules, so I'll leave
> >> the exceptions I added to all Bio::AnnotationI; we can remove them
> >> down the line, maybe prior to rel1.6, after more tests are added or
> >> if they get particularly annoying.  My guess is I caught 99.99% of
> >> them (prepare to insert other foot....).
> >>
> >> The key change in this last round is the addition of several class
> >> *dbxref* methods to Bio::Ontology::Term and
> >> Bio::Annotation::OntologyTerm, all of which are capable of working
> >> with either DBLink instances or simple scalars.  This was primarily
> >> done in order to clear up inconsistencies in the older *dblink*
> >> methods, which were ambiguous (some indicated simple scalar
> >> arguments, others DBLink objects); operator overloading was used
> >> extensively in these cases, which led to several issues.  I have
> >> added deprecation warnings to the older methods which now map to
> >> using the newer methods.  All tests pass with the exception of a few
> >> already failing on the MAIN branch; the single test which needs to be
> >> fixed is a round-tripping error in swiss.t (now a TODO), which can be
> >> fixed after merging back.
> >>
> >> Please respond to this if there are any questions or if I need to
> >> clarify the changes I made a bit more.
> >>
> >> chris
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cjfields at uiuc.edu  Thu Aug 30 00:17:18 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 23:17:18 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188446706.2567.59.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
Message-ID: <CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>

It shouldn't; that sounds like the output of as_text().  value()  
should just return the scalar value.

As a note, I added a new method, display_text(), for all  
Bio::AnnotationI classes which by default replicates the same output  
that stringification overloads produced.  So you should be able to  
explicitly call $ann->display_text for any Bio::AnnotationI where you  
once used an implicit call:

# old
print "$ann\n";

# new
print $ann->display_text,"\n";
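
A self-contained sketch of the three calls side by side.
ToySimpleValue is a toy stand-in for Bio::Annotation::SimpleValue, not
the real class; it assumes value() returns the bare scalar, as_text()
prefixes it with "Value: ", and display_text() returns what the old
stringification overload produced.

```perl
use strict;
use warnings;

package ToySimpleValue;

# Old style: a stringification overload, used implicitly by "$ann".
use overload '""' => sub { $_[0]->display_text }, fallback => 1;

sub new {
    my ($class, $value) = @_;
    my $self = { value => $value };
    return bless $self, $class;
}

sub value        { return $_[0]{value} }                # bare scalar
sub as_text      { return 'Value: ' . $_[0]->value }    # labelled form
sub display_text { return $_[0]->value }                # explicit replacement for "$ann"

package main;

my $ann = ToySimpleValue->new('kinase');

my $implicit = "$ann";                # old: relies on the overload
my $explicit = $ann->display_text;    # new: explicit method call
my $labelled = $ann->as_text;

print "implicit: $implicit\n";
print "explicit: $explicit\n";
print "as_text:  $labelled\n";
```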

chris

On Aug 29, 2007, at 11:05 PM, Scott Cain wrote:

> Hi Chris,
>
> Is there a reason that the value method of
> Bio::Annotation::SimpleValue (and possibly some of its siblings) is
> returning "Value: $value"?  It didn't used to have the "Value: " before,
> did it?
>
> Thanks,
> Scott

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From neetisomaiya at gmail.com  Thu Aug 30 00:47:53 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 30 Aug 2007 10:17:53 +0530
Subject: [Bioperl-l] kegg xml parsing
Message-ID: <764978cf0708292147q4ead37b0i782b83ecda8ce3da@mail.gmail.com>

Hi,

Has anyone used XML::Twig for parsing of kegg xml data?
I was looking for some small example code of the same.

Thanks.
-- 
-Neeti
Even my blood says, B positive


From sdavis2 at mail.nih.gov  Thu Aug 30 06:16:54 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 30 Aug 2007 06:16:54 -0400
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
In-Reply-To: <0FF63232-25DE-4676-8C06-B9B00BE28349@uiuc.edu>
References: <46D5EF2B.5000101@jays.net>
	<0FF63232-25DE-4676-8C06-B9B00BE28349@uiuc.edu>
Message-ID: <46D69916.4060202@mail.nih.gov>

Chris Fields wrote:
> On Aug 29, 2007, at 5:11 PM, Jay Hannah wrote:
> 
>> Please slap me if I'm hysterical.
>>
>> I'm seeking a broad bioinformatics search engine platform. I want to
>> take gobs of data in gobs of formats and allow people to search it on
>> the web.

Not sure how it might or might not meet your needs, but have you looked
at SRS (Sequence Retrieval System)?  I have never tried to use it,
personally, though.

Sean


From cjfields at uiuc.edu  Thu Aug 30 09:17:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 08:17:17 -0500
Subject: [Bioperl-l] remote SwissProt server problems
In-Reply-To: <6805F552-9947-4C28-B846-47B5501B31DF@uiuc.edu>
References: <6805F552-9947-4C28-B846-47B5501B31DF@uiuc.edu>
Message-ID: <62B4DE62-C11E-4E75-837C-6C1005FB12A4@uiuc.edu>

This should be fixed now (DBFetch-related tests pass, though MeSH  
tests are now failing!).

chris

On Aug 29, 2007, at 11:03 AM, Chris Fields wrote:

> Just as a notice, DBFetch is currently retrieving only single records
> for the UniProtKB database (where Bio::DB::SwissProt fetches
> sequences).  If anyone runs remote server tests and DB.t in the test
> suite you'll see a failure towards the end which indicates this.
> I've posted a notice to the server help desk and will respond when I
> hear more.
>
> chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cain.cshl at gmail.com  Thu Aug 30 10:39:59 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 10:39:59 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
Message-ID: <1188484799.2567.84.camel@localhost.localdomain>

Hi Chris,

I see--I was using as_text and getting the "Value: $value"; there are
places in my code where I have always used ->value and I thought that
the way it was working had changed.

What is the use case for having the as_text method work the way it does?

Thanks,
Scott


On Wed, 2007-08-29 at 23:17 -0500, Chris Fields wrote:
> It shouldn't; that sounds like the output of as_text().  value()  
> should just return the scalar value.
> 
> As a note, I added a new method, display_text(), for all  
> Bio::AnnotationI classes which by default replicates the same output  
> that stringification overloads produced.  So you should be able to  
> explicitly call $ann->display_text for any Bio::AnnotationI where you  
> once used an implicit call:
> 
> # old
> print "$ann\n";
> 
> # new
> print $ann->display_text,"\n";
> 
> chris
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cain.cshl at gmail.com  Thu Aug 30 11:46:24 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 11:46:24 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
Message-ID: <1188488785.2567.93.camel@localhost.localdomain>

Hi Chris,

Good news!  I only had to add a few defineds and a few display_texts and
I was able to successfully create a database and load the yeast GFF3
file.  While I want to do more testing with GFF from other sources,
clearly, I am 95% of the way there with relatively little work.

Nice job and Thanks!
Scott


On Wed, 2007-08-29 at 23:17 -0500, Chris Fields wrote:
> It shouldn't, that sounds like the output for add_text().  value()  
> should just return the scalar value.
> 
> As a note, I added a new method, display_text(), for all  
> Bio::AnnotationI classes which by default replicates the same output  
> that stringification overloads produced.  So you should be able to  
> explicitly call $ann->display_text for any Bio::AnnotationI where you  
> once used an implicit call:
> 
> # old
> print "$ann\n";
> 
> # new
> print $ann->display_text,"\n";
> 
> chris
> 
> On Aug 29, 2007, at 11:05 PM, Scott Cain wrote:
> 
> > Hi Chris,
> >
> > Is there a reason that the value method of the
> > Bio::Annotation::SimpleValue (and possibly some of its siblings)
> > returning "Value: $value"?  It didn't used to have the "Value: "  
> > before,
> > did it?
> >
> > Thanks,
> > Scott
> >
> >
> > On Wed, 2007-08-29 at 15:13 -0500, Chris Fields wrote:
> >> I'll probably go ahead and start merging this stuff over to CVS HEAD
> >> then.  There haven't been any objections so far.
> >>
> >> The page I posted outlines the more critical fixes, primarily the
> >> changes to Bio::Ontology::Term methods (along with relevant code) due
> >> to inconsistencies in the interface.  The Bio::Annotation classes
> >> also now throw if you attempt to use them in an overloaded context.
> >> I also split off SeqFeature::Annotated tests into it's own test suite
> >> (SeqFeatAnnotated.t).
> >>
> >> Let me know if there are any problems along the way!
> >>
> >> chris
> >>
> >> On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:
> >>
> >>> Hi Chris,
> >>>
> >>> I just wanted to let you know that I was out of town for a few
> >>> days, but
> >>> now I'm back and I'm doing testing of GMOD software based on the
> >>> branch
> >>> you are working on.  I'll let you know how it goes, but don't let me
> >>> stop you if you confident of your changes.  I'm sure whatever goes
> >>> wrong, it will just point out holes in the FeatureIO tests (I'm sure
> >>> there are plenty) and will require hopefully minimal changes on my
> >>> end.
> >>>
> >>> Thanks for your considerable efforts on this!  (Regardless of how  
> >>> much
> >>> work it makes for me :-)
> >>> Scott
> >>>
> >>>
> >>> On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
> >>>> I'm now wrapping up the Feature/Annotation rollback.  I will  
> >>>> probably
> >>>> start merging back to the main branch in the next day or two., as
> >>>> soon as interested parties (*cough*devs*cough*) look over the last
> >>>> batch of changes.
> >>>>
> >>>> http://www.bioperl.org/wiki/ 
> >>>> Feature_Annotation_rollback#Fourth_Round
> >>>>
> >>>> I have also added a small benchmark test which indicates a decrease
> >>>> in parsing time in SeqIO::genbank with all tests passing.  I expect
> >>>> this will translate over to any Bio::SeqFeature::Generic-using  
> >>>> class
> >>>> (open mouth, prepare to insert foot....).
> >>>>
> >>>> It is also possible there are still some instances where  
> >>>> overloading
> >>>> is expected lurking about in the ~1000 or so modules, so I'll leave
> >>>> the exceptions I added to all Bio::AnnotationI; we can remove them
> >>>> down the line, maybe prior to rel1.6, after more tests are added or
> >>>> if they get particularly annoying.  My guess is I caught 99.99% of
> >>>> them (prepare to insert other foot....).
> >>>>
> >>>> The key change in this last round is the addition of several class
> >>>> *dbxref* methods to Bio::Ontology::Term and
> >>>> Bio::Annotation::OntologyTerm, all of which are capable of working
> >>>> with either DBLink instances or simple scalars.  This was primarily
> >>>> done in order to clear up inconsistencies in the older *dblink*
> >>>> methods, which were ambiguous (some indicates simple scalar
> >>>> arguments, others DBLink objects); operator overloading was used
> >>>> extensively in these cases, which led to several issues.  I have
> >>>> added deprecation warnings to the older methods which now map to
> >>>> using the newer methods.  All tests pass with the exception of a  
> >>>> few
> >>>> already failing on the MAIN branch; the single test which needs  
> >>>> to be
> >>>> fixed is a round-tripping error in swiss.t (now a TODO), which  
> >>>> can be
> >>>> fixed after merging back.
> >>>>
> >>>> Please respond to this if there are any questions or if I need to
> >>>> clarify the changes I made a bit more.
> >>>>
> >>>> chris
> >>>> _______________________________________________
> >>>> Bioperl-l mailing list
> >>>> Bioperl-l at lists.open-bio.org
> >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>> -- 
> >>> -------------------------------------------------------------------- 
> >>> --
> >>> --
> >>> Scott Cain, Ph. D.
> >>> cain at cshl.edu
> >>> GMOD Coordinator (http://www.gmod.org/)
> >>> 216-392-3087
> >>> Cold Spring Harbor Laboratory
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >> Christopher Fields
> >> Postdoctoral Researcher
> >> Lab of Dr. Robert Switzer
> >> Dept of Biochemistry
> >> University of Illinois Urbana-Champaign
> >>
> >>
> >>
> > -- 
> > ---------------------------------------------------------------------- 
> > --
> > Scott Cain, Ph. D.                                    
> > cain.cshl at gmail.com
> > GMOD Coordinator (http://www.gmod.org/)                      
> > 216-392-3087
> > Cold Spring Harbor Laboratory
> >
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From hlapp at gmx.net  Thu Aug 30 12:07:18 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:07:18 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188488785.2567.93.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
Message-ID: <0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Aug 30, 2007, at 11:46 AM, Scott Cain wrote:

> Good news!  I only had to add a few defineds and a few  
> display_texts and
> I was able to successfully create a database and load the yeast GFF3

Scott - I'm a little worried - what are you using the display_text()  
calls for? There is no method to set a property that would be  
returned here, so you only have control over that if you override the  
method in a custom AnnotationI class.

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFG1us5uV6N2JxL7qsRAicFAKCFCHPORyK9273X8u2/gbaZCNpEHgCeMovA
OtZghop1tET5iMqnwXzL+lk=
=NVrK
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Thu Aug 30 12:10:14 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:10:14 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188484799.2567.84.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188484799.2567.84.camel@localhost.localdomain>
Message-ID: <49824C75-3FA5-4E59-8F99-BC0E974E9652@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Aug 30, 2007, at 10:39 AM, Scott Cain wrote:

> What is the use case for having the as_text method work the way it  
> does?

That's a bit nebulous as I tried to point out the other day. It's  
just a textual representation of the annotation, but you don't really  
have control over what the particular Annotation class considers to  
fulfill that purpose.

So, it's fine to expect a printable meaningful string to be returned,  
but don't try to parse it or rely on exactly what it is going to look  
like.
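
To make that concrete, here is a minimal sketch using a stand-in class. Demo::SimpleValue is hypothetical, not the real Bio::Annotation::SimpleValue, and the "Value: " prefix only mirrors what Scott reported seeing. The point is that value() is the accessor to build on, while as_text() is for display only:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a simple annotation class; not BioPerl code.
package Demo::SimpleValue;

sub new {
    my ($class, %args) = @_;
    return bless { value => $args{-value} }, $class;
}

# Accessor: returns the bare scalar, suitable for storing or comparing.
sub value { return $_[0]->{value} }

# Display form: the exact layout is up to the class and may change.
sub as_text { return 'Value: ' . $_[0]->value }

package main;

my $ann = Demo::SimpleValue->new(-value => 'kinase');

# Robust: take the data from the accessor.
my $for_database = $ann->value;

# Fragile: recovering the data by parsing the display string.
(my $parsed = $ann->as_text) =~ s/^Value: //;

print "stored: $for_database\n";
print "shown:  ", $ann->as_text, "\n";
```

The parsed version happens to agree with value() here, but only for as long as the class keeps that exact prefix.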

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFG1uvnuV6N2JxL7qsRAn+dAKC9iLj93El38uv7kjprdZDo0sXC6wCgqwhm
0/tF89/FO1a4CWAf1bahd+8=
=I7SM
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Thu Aug 30 12:20:18 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:20:18 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>
Message-ID: <DF84C537-2860-48E1-9979-E1101C4D5826@gmx.net>


On Aug 29, 2007, at 11:23 PM, Chris Fields wrote:

>> Once this is merged back to the main trunk, I guess we need to give
>> Bio::SeqFeature::TypedSeqFeatureI a thorough look and make sure it
>> makes real sense.
>
> It describes one method, ontology_term(), which returns a  
> Bio::Ontology::TermI.  This is similar to  
> SeqFeature::Annotated::type(), which returns a  
> Bio::Annotation::OntologyTerm (a Bio::Ontology::TermI).  My thought  
> is to simply deprecate type() in favor of  
> TypedSeqFeatureI::ontology_term().

I think we'll want to think about that. type() gives me some  
indication of what the returned value might represent, whereas  
ontology_term() only tells me about the type of the returned object.

You could make ontology_term() accept a context argument, such as

	my $feature_type = $typedFeat->ontology_term(-context => -type);

Or you could name the method(s) more explicitly, such as

	my $feature_type = $typedFeat->type_term();
	my $feature_source = $typedFeat->source_term();
	my @annTerms = $typedFeat->get_Annotations('Gene Ontology');
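
A rough sketch of how the two styles could sit side by side, purely for comparison. Demo::TypedFeature is a hypothetical class, and plain strings stand in for what would really be Bio::Ontology::TermI objects:

```perl
use strict;
use warnings;

# Hypothetical feature class, only to compare the two interface styles.
package Demo::TypedFeature;

sub new {
    my ($class, %terms) = @_;
    return bless { terms => {%terms} }, $class;
}

# Style 1: one generic method, with the role passed as a -context argument.
sub ontology_term {
    my ($self, %args) = @_;
    my $context = $args{-context};
    die "ontology_term() needs a -context argument\n" unless defined $context;
    $context =~ s/^-//;               # accept both 'type' and '-type'
    return $self->{terms}{$context};
}

# Style 2: one explicitly named method per role, delegating to style 1.
sub type_term   { return $_[0]->ontology_term(-context => 'type') }
sub source_term { return $_[0]->ontology_term(-context => 'source') }

package main;

# Strings stand in for term objects in this sketch.
my $feat = Demo::TypedFeature->new(type => 'gene', source => 'SGD');

print $feat->ontology_term(-context => '-type'), "\n";   # generic call
print $feat->type_term, "\n";                            # explicit call
print $feat->source_term, "\n";
```

With the explicit names the call site says what the term means, which is the property type() has today.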

Am I making sense?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cain.cshl at gmail.com  Thu Aug 30 12:28:47 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 12:28:47 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
Message-ID: <1188491327.2567.101.camel@localhost.localdomain>

Hi Hilmar,

I'm using it as Chris suggested: where I had been depending on ""
overloading.  I think in most places, I am using it on
Bio::Annotation::SimpleValue to get the string that is the simple value.
On more complex data types, I am using other methods built into those
classes to extract useful stuff for inserting into the database.

Scott


On Thu, 2007-08-30 at 12:07 -0400, Hilmar Lapp wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> 
> On Aug 30, 2007, at 11:46 AM, Scott Cain wrote:
> 
> > Good news!  I only had to add a few defineds and a few  
> > display_texts and
> > I was able to successfully create a database and load the yeast GFF3
> 
> Scott - I'm a little worried - what are you using the display_text()  
> calls for? There is no method to set a property that would be  
> returned here, so you only have control over that if you override the  
> method in a custom AnnotationI class.
> 
> 	-hilmar
> - --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
> 
> 
> 
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.3 (Darwin)
> 
> iD8DBQFG1us5uV6N2JxL7qsRAicFAKCFCHPORyK9273X8u2/gbaZCNpEHgCeMovA
> OtZghop1tET5iMqnwXzL+lk=
> =NVrK
> -----END PGP SIGNATURE-----
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From hlapp at gmx.net  Thu Aug 30 12:52:14 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:52:14 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188491327.2567.101.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
	<1188491327.2567.101.camel@localhost.localdomain>
Message-ID: <F03155D4-58CB-4C8D-9D52-C49036EB7F45@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Aug 30, 2007, at 12:28 PM, Scott Cain wrote:

> I think in most places, I am using it on
> Bio::Annotation::SimpleValue to get the string that is the simple  
> value.

You should be using $ann->value() for that, unless I'm missing  
something.

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFG1vXCuV6N2JxL7qsRAkcJAKCICRtOSlPLVYYKCbOTvDIf4idb3wCgkxYM
seeaNvSsFY/4bHLGZ9dum2Q=
=E35w
-----END PGP SIGNATURE-----


From cain.cshl at gmail.com  Thu Aug 30 13:16:09 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 13:16:09 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <F03155D4-58CB-4C8D-9D52-C49036EB7F45@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
	<1188491327.2567.101.camel@localhost.localdomain>
	<F03155D4-58CB-4C8D-9D52-C49036EB7F45@gmx.net>
Message-ID: <1188494169.2567.109.camel@localhost.localdomain>

Well, in the instances where I was using it, ->value seems to work
exactly the same, so I changed it to value to be more consistent with
other code I'd written.  I'd used display_text without really thinking
about it.

Thanks,
Scott


On Thu, 2007-08-30 at 12:52 -0400, Hilmar Lapp wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> 
> On Aug 30, 2007, at 12:28 PM, Scott Cain wrote:
> 
> > I think in most places, I am using it on
> > Bio::Annotation::SimpleValue to get the string that is the simple  
> > value.
> 
> You should be using $ann->value() for that, unless I'm missing  
> something.
> 
> 	-hilmar
> - --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
> 
> 
> 
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.3 (Darwin)
> 
> iD8DBQFG1vXCuV6N2JxL7qsRAkcJAKCICRtOSlPLVYYKCbOTvDIf4idb3wCgkxYM
> seeaNvSsFY/4bHLGZ9dum2Q=
> =E35w
> -----END PGP SIGNATURE-----
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cjfields at uiuc.edu  Thu Aug 30 13:27:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 12:27:46 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188491327.2567.101.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
	<1188491327.2567.101.camel@localhost.localdomain>
Message-ID: <6E9B07D0-AB37-4439-AA9D-9268AB5A38C0@uiuc.edu>

display_text() is really a hack for explicitly getting the same  
output one would have expected from the stringification overload for any  
Bio::AnnotationI (you can also use callbacks with it to customize the  
output if needed, but that's not important here).  Whether it fits  
depends on what you're trying to accomplish, but it is probably best to  
use value() in places where you expect only  
Bio::Annotation::SimpleValue objects.
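
For anyone curious about the callback part, a minimal sketch of the idea follows. Demo::Annotation is a hypothetical stand-in and the real Bio::AnnotationI classes may differ in detail; the assumption here is only that display_text() takes an optional code reference and falls back to a per-class default:

```perl
use strict;
use warnings;

# Hypothetical stand-in; the real Bio::AnnotationI classes may differ.
package Demo::Annotation;

sub new {
    my ($class, %args) = @_;
    return bless { tagname => $args{-tagname}, value => $args{-value} }, $class;
}

sub tagname { return $_[0]->{tagname} }
sub value   { return $_[0]->{value} }

# Default formatter: what the old "" overload is assumed to have produced.
my $default_cb = sub { return $_[0]->value };

# Explicit replacement for stringification; an optional code ref customizes it.
sub display_text {
    my ($self, $cb) = @_;
    $cb ||= $default_cb;
    die "Callback must be a code reference\n" unless ref $cb eq 'CODE';
    return $cb->($self);
}

package main;

my $ann = Demo::Annotation->new(-tagname => 'note', -value => 'putative kinase');

print $ann->display_text, "\n";
print $ann->display_text(sub { $_[0]->tagname . '=' . $_[0]->value }), "\n";
```

For a simple value the default output and value() coincide, which is why either call works in Scott's case.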

chris

On Aug 30, 2007, at 11:28 AM, Scott Cain wrote:

> Hi Hilmar,
>
> I'm using it as Chris suggested: where I had be depending on ""
> overloading.  I think in most places, I am using it on
> Bio::Annotation::SimpleValue to get the string that is the simple  
> value.
> On more complex data types, I am using other methods built into those
> classes to extract useful stuff for inserting into the database.
>
> Scott
>
>
>
> On Thu, 2007-08-30 at 12:07 -0400, Hilmar Lapp wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>>
>> On Aug 30, 2007, at 11:46 AM, Scott Cain wrote:
>>
>>> Good news!  I only had to add a few defineds and a few
>>> display_texts and
>>> I was able to successfully create a database and load the yeast GFF3
>>
>> Scott - I'm a little worried - what are you using the display_text()
>> calls for? There is no method to set a property that would be
>> returned here, so you only have control over that if you override the
>> method in a custom AnnotationI class.
>>
>> 	-hilmar
>> - --
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.3 (Darwin)
>>
>> iD8DBQFG1us5uV6N2JxL7qsRAicFAKCFCHPORyK9273X8u2/gbaZCNpEHgCeMovA
>> OtZghop1tET5iMqnwXzL+lk=
>> =NVrK
>> -----END PGP SIGNATURE-----
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> -- 
> ---------------------------------------------------------------------- 
> --
> Scott Cain, Ph. D.                                    
> cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/)                      
> 216-392-3087
> Cold Spring Harbor Laboratory
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug 30 13:45:44 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 12:45:44 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188488785.2567.93.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
Message-ID: <B81A709F-5081-4EB0-8778-2ABEDB02BA86@uiuc.edu>

Sounds good, but I have yet to commit some of the Ontology changes  
Hilmar and I discussed (whereupon our brave heroes deprecate the dblink  
methods in favor of dbxref ones).  These should be committed fairly soon  
(within an hour or two).

My guess is the change will be fairly transparent, so it shouldn't affect  
anything unless you have scripts using those methods directly.
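
As a rough illustration of what "transparent" means here, the sketch below has the old methods warn and then delegate to the new ones. Demo::Term is hypothetical and the method names are illustrative only, not a statement of what the committed Bio::Ontology::Term code looks like:

```perl
use strict;
use warnings;

# Hypothetical term class; method names here are illustrative only.
package Demo::Term;

sub new { return bless { dbxrefs => [] }, shift }

# New-style methods.
sub add_dbxref {
    my ($self, @refs) = @_;
    push @{ $self->{dbxrefs} }, @refs;
    return;
}
sub get_dbxrefs { return @{ $_[0]->{dbxrefs} } }

# Old-style methods: emit a deprecation warning, then delegate.
sub add_dblink {
    my ($self, @links) = @_;
    warn "add_dblink() is deprecated; use add_dbxref() instead\n";
    return $self->add_dbxref(@links);
}
sub get_dblinks {
    my ($self) = @_;
    warn "get_dblinks() is deprecated; use get_dbxrefs() instead\n";
    return $self->get_dbxrefs;
}

package main;

my $term = Demo::Term->new;
$term->add_dblink('GO:0016301');      # still works, but warns on STDERR
$term->add_dbxref('EC:2.7.1.1');
my @refs = $term->get_dbxrefs;
print join(', ', @refs), "\n";
```

Scripts calling the old names keep running and only see the warning, which is the behaviour described above.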

chris

On Aug 30, 2007, at 10:46 AM, Scott Cain wrote:

> Hi Chris,
>
> Good news!  I only had to add a few defineds and a few  
> display_texts and
> I was able to successfully create a database and load the yeast GFF3
> file.  While I want to do more testing with GFF from other sources,
> clearly, I am 95% of the way there with relatively little work.
>
> Nice job and Thanks!
> Scott
>
>
> On Wed, 2007-08-29 at 23:17 -0500, Chris Fields wrote:
>> It shouldn't, that sounds like the output for add_text().  value()
>> should just return the scalar value.
>>
>> As a note, I added a new method, display_text(), for all
>> Bio::AnnotationI classes which by default replicates the same output
>> that stringification overloads produced.  So you should be able to
>> explicitly call $ann->display_text for any Bio::AnnotationI where you
>> once used an implicit call:
>>
>> # old
>> print "$ann\n";
>>
>> # new
>> print $ann->display_text,"\n";
>>
>> chris
>>
>> On Aug 29, 2007, at 11:05 PM, Scott Cain wrote:
>>
>>> Hi Chris,
>>>
>>> Is there a reason that the value method of the
>>> Bio::Annotation::SimpleValue (and possibly some of its siblings)
>>> returning "Value: $value"?  It didn't used to have the "Value: "
>>> before,
>>> did it?
>>>
>>> Thanks,
>>> Scott
>>>
>>>
>>> On Wed, 2007-08-29 at 15:13 -0500, Chris Fields wrote:
>>>> I'll probably go ahead and start merging this stuff over to CVS  
>>>> HEAD
>>>> then.  There haven't been any objections so far.
>>>>
>>>> The page I posted outlines the more critical fixes, primarily the
>>>> changes to Bio::Ontology::Term methods (along with relevant  
>>>> code) due
>>>> to inconsistencies in the interface.  The Bio::Annotation classes
>>>> also now throw if you attempt to use them in an overloaded context.
>>>> I also split off SeqFeature::Annotated tests into it's own test  
>>>> suite
>>>> (SeqFeatAnnotated.t).
>>>>
>>>> Let me know if there are any problems along the way!
>>>>
>>>> chris
>>>>
>>>> On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:
>>>>
>>>>> Hi Chris,
>>>>>
>>>>> I just wanted to let you know that I was out of town for a few
>>>>> days, but
>>>>> now I'm back and I'm doing testing of GMOD software based on the
>>>>> branch
>>>>> you are working on.  I'll let you know how it goes, but don't  
>>>>> let me
>>>>> stop you if you confident of your changes.  I'm sure whatever goes
>>>>> wrong, it will just point out holes in the FeatureIO tests (I'm  
>>>>> sure
>>>>> there are plenty) and will require hopefully minimal changes on my
>>>>> end.
>>>>>
>>>>> Thanks for your considerable efforts on this!  (Regardless of how
>>>>> much
>>>>> work it makes for me :-)
>>>>> Scott
>>>>>
>>>>>
>>>>> On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
>>>>>> I'm now wrapping up the Feature/Annotation rollback.  I will
>>>>>> probably
>>>>>> start merging back to the main branch in the next day or two., as
>>>>>> soon as interested parties (*cough*devs*cough*) look over the  
>>>>>> last
>>>>>> batch of changes.
>>>>>>
>>>>>> http://www.bioperl.org/wiki/
>>>>>> Feature_Annotation_rollback#Fourth_Round
>>>>>>
>>>>>> I have also added a small benchmark test which indicates a  
>>>>>> decrease
>>>>>> in parsing time in SeqIO::genbank with all tests passing.  I  
>>>>>> expect
>>>>>> this will translate over to any Bio::SeqFeature::Generic-using
>>>>>> class
>>>>>> (open mouth, prepare to insert foot....).
>>>>>>
>>>>>> It is also possible there are still some instances where
>>>>>> overloading
>>>>>> is expected lurking about in the ~1000 or so modules, so I'll  
>>>>>> leave
>>>>>> the exceptions I added to all Bio::AnnotationI; we can remove  
>>>>>> them
>>>>>> down the line, maybe prior to rel1.6, after more tests are  
>>>>>> added or
>>>>>> if they get particularly annoying.  My guess is I caught  
>>>>>> 99.99% of
>>>>>> them (prepare to insert other foot....).
>>>>>>
>>>>>> The key change in this last round is the addition of several  
>>>>>> class
>>>>>> *dbxref* methods to Bio::Ontology::Term and
>>>>>> Bio::Annotation::OntologyTerm, all of which are capable of  
>>>>>> working
>>>>>> with either DBLink instances or simple scalars.  This was  
>>>>>> primarily
>>>>>> done in order to clear up inconsistencies in the older *dblink*
>>>>>> methods, which were ambiguous (some indicates simple scalar
>>>>>> arguments, others DBLink objects); operator overloading was used
>>>>>> extensively in these cases, which led to several issues.  I have
>>>>>> added deprecation warnings to the older methods which now map to
>>>>>> using the newer methods.  All tests pass with the exception of a
>>>>>> few
>>>>>> already failing on the MAIN branch; the single test which needs
>>>>>> to be
>>>>>> fixed is a round-tripping error in swiss.t (now a TODO), which
>>>>>> can be
>>>>>> fixed after merging back.
>>>>>>
>>>>>> Please respond to this if there are any questions or if I need to
>>>>>> clarify the changes I made a bit more.
>>>>>>
>>>>>> chris
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>> -- 
>>>>> ------------------------------------------------------------------ 
>>>>> --
>>>>> --
>>>>> --
>>>>> Scott Cain, Ph. D.
>>>>> cain at cshl.edu
>>>>> GMOD Coordinator (http://www.gmod.org/)
>>>>> 216-392-3087
>>>>> Cold Spring Harbor Laboratory
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>> Christopher Fields
>>>> Postdoctoral Researcher
>>>> Lab of Dr. Robert Switzer
>>>> Dept of Biochemistry
>>>> University of Illinois Urbana-Champaign
>>>>
>>>>
>>>>
>>> -- 
>>> -------------------------------------------------------------------- 
>>> --
>>> --
>>> Scott Cain, Ph. D.
>>> cain.cshl at gmail.com
>>> GMOD Coordinator (http://www.gmod.org/)
>>> 216-392-3087
>>> Cold Spring Harbor Laboratory
>>>
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> -- 
> ---------------------------------------------------------------------- 
> --
> Scott Cain, Ph. D.                                    
> cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/)                      
> 216-392-3087
> Cold Spring Harbor Laboratory
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug 30 14:03:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 13:03:29 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <DF84C537-2860-48E1-9979-E1101C4D5826@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>
	<DF84C537-2860-48E1-9979-E1101C4D5826@gmx.net>
Message-ID: <D4E8E9D3-BB64-48C5-8273-5C6C04DC8DE9@uiuc.edu>


On Aug 30, 2007, at 11:20 AM, Hilmar Lapp wrote:

>> ...It describes one method, ontology_term(), which returns a  
>> Bio::Ontology::TermI.  This is similar to  
>> SeqFeature::Annotated::type(), which returns a  
>> Bio::Annotation::OntologyTerm (a Bio::Ontology::TermI).  My  
>> thought is to simply deprecate type() in favor of  
>> TypedSeqFeatureI::ontology_term().
>
> I think we'll want to think about that. type() gives me some  
> indication of what the returned value might represent, whereas  
> ontology_term() only tells me about the type of the returned object.
>
> You could make ontology_term() accept a context argument, such as
>
> 	my $feature_type = $typedFeat->ontology_term(-context => -type);
>
> Or you could name the method(s) more explicitly, such as
>
> 	my $feature_type = $typedFeat->type_term();
> 	my $feature_source = $typedFeat->source_term();
> 	my @annTerms = $typedFeat->get_Annotations('Gene Ontology');
>
> Am I making sense?
>
> 	-hilmar

I think so; I'll have to look at what is returned from type() in some  
more detail.

It appears that the two main culprits for passing strings off to  
Ontology::Term are the Bio::OntologyIO::obo and  
Bio::OntologyIO::dagflat parsers.  I can add some code in there to  
change those to DBLinks prior to creating Ontology::Term instances,  
which should clean that up.

chris


From cjfields at uiuc.edu  Thu Aug 30 20:57:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 19:57:15 -0500
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
	module/package name
In-Reply-To: <46CF27F4.8030608@arcor.de>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>	<46CEAD83.2050904@arcor.de>	<9824900.1187973171940.JavaMail.ngmail@webmail17>	<A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
	<BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>
	<46CF27F4.8030608@arcor.de>
Message-ID: <4ED2E2B0-8E36-4500-A4C9-B8C333E14614@uiuc.edu>


On Aug 24, 2007, at 1:48 PM, marian wrote:

> ...
> Bio::Microarray::Tools::MitoChip would be OK to me. I merely meant,  
> that it
> isnt an expression chip and you also wont/cant analyze expression  
> data with
> the tool I am talking about.
>
> Marian

Okay, I have everything working from bugzilla:

http://bugzilla.open-bio.org/show_bug.cgi?id=2332

I suppose what we need to do next is get a test script going.  I'll  
look at the script attached to see if we can get something going that  
is fairly quick.

chris


From avilella at gmail.com  Fri Aug 31 05:29:43 2007
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 31 Aug 2007 10:29:43 +0100
Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with
	exon boundaries
Message-ID: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com>

Hi,

Probably a bit of a long shot, but does anyone have code for
displaying protein or CDS multiple sequence alignments with the exon
boundaries of each gene in the alignment?

Something in the bioperl world without funky external dependencies. I
think it would be an awesome addition to the howtos.

Currently, the Bio::Graphics howto has cdna to genome mapping scripts or
blast output scripts, but I couldn't find code for dealing with multiple
sequence alignments.

Cheers,

    Albert.


From neetisomaiya at gmail.com  Fri Aug 31 05:41:51 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 31 Aug 2007 15:11:51 +0530
Subject: [Bioperl-l] need help
Message-ID: <764978cf0708310241i1baf6feeoc808c396125c078e@mail.gmail.com>

Hi,

I am trying to parse the compound
(ftp://ftp.genome.jp/pub/kegg/ligand/compound/compound) and glycan
(ftp://ftp.genome.jp/pub/kegg/ligand/glycan/glycan) files of KEGG using
bioperl. I just want the kegg id of the compound/glycan and its names and
synonyms, if any. Bio::SeqIO is giving me problems: I am not able to fetch
the id and name. Can someone help me with this?

Thanks.

-- 
-Neeti
Even my blood says, B positive


From cjfields at uiuc.edu  Fri Aug 31 10:51:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 31 Aug 2007 09:51:51 -0500
Subject: [Bioperl-l] need help
In-Reply-To: <764978cf0708310241i1baf6feeoc808c396125c078e@mail.gmail.com>
References: <764978cf0708310241i1baf6feeoc808c396125c078e@mail.gmail.com>
Message-ID: <BD54A833-D2D3-4AE5-8517-BB060F3C132E@uiuc.edu>

I don't believe Bio::SeqIO::kegg will parse those files (they aren't  
sequence files).  The format it recognizes is:

http://www.bioperl.org/wiki/KEGG_sequence_format

for the files found in the subdirectories here:

ftp://ftp.genome.ad.jp/pub/kegg/genes/organisms

I would just build a custom parser if all you're interested in is
id/names/synonyms.  It'll be much faster.
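
A minimal sketch of such a custom parser, assuming the usual KEGG
flat-file layout: keywords such as ENTRY and NAME start in the first
column, continuation lines are indented, and '///' closes each record.
The sub name parse_kegg_names is made up for illustration, and the two
abbreviated sample records only stand in for the real compound file.
Nothing here uses BioPerl.

```perl
use strict;
use warnings;

# Collect the entry id and names from KEGG-style flat-file records.
# Returns a list of hashrefs like { id => 'C00001', names => ['H2O', 'Water'] }.
sub parse_kegg_names {
    my ($fh) = @_;
    my (@records, $current, $field);
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\s+$//;                       # drop trailing whitespace / CR
        if ($line =~ m{^///}) {                  # end of record
            push @records, $current if $current;
            ($current, $field) = (undef, undef);
            next;
        }
        my $value;
        if ($line =~ /^([A-Z_]+)\s*(.*)$/) {     # keyword line, e.g. "NAME   H2O;"
            ($field, $value) = ($1, $2);
            if ($field eq 'ENTRY') {
                my ($id) = split ' ', $value;    # first token is the KEGG id
                $current = { id => $id, names => [] };
                next;
            }
        }
        elsif ($line =~ /^\s+(\S.*)$/) {         # continuation of the previous field
            $value = $1;
        }
        else {
            next;                                # blank or unrecognised line
        }
        next unless $current && defined $field && $field eq 'NAME';
        $value =~ s/;$//;                        # each name but the last ends in ';'
        push @{ $current->{names} }, $value if length $value;
    }
    push @records, $current if $current;         # tolerate a missing final '///'
    return @records;
}

# Abbreviated sample input (assumed layout), read from an in-memory filehandle.
my $sample = <<'END_SAMPLE';
ENTRY       C00001                      Compound
NAME        H2O;
            Water
FORMULA     H2O
///
ENTRY       C00002                      Compound
NAME        ATP;
            Adenosine 5'-triphosphate
FORMULA     C10H16N5O13P3
///
END_SAMPLE

open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
for my $rec (parse_kegg_names($fh)) {
    # One tab-separated line per record: id, then every name/synonym
    print join("\t", $rec->{id}, @{ $rec->{names} }), "\n";
}
close $fh;
```

To run it on the real data, open the downloaded compound or glycan file
and pass that filehandle to parse_kegg_names() instead of the sample.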

chris

On Aug 31, 2007, at 4:41 AM, neeti somaiya wrote:

> Hi,
>
> I am trying to parse the compound (
> ftp://ftp.genome.jp/pub/kegg/ligand/compound/compound) and glycan (
> ftp://ftp.genome.jp/pub/kegg/ligand/glycan/glycan) files of KEGG using
> bioperl.
> I just want the kegg id of the compound/glycan and its names and  
> synonyms if
> any.
> Bio::SeqIO is giving some problem, I am not able to fetch the id  
> and name.
> Can someone help me with this.
>
> Thanks.
>
> -- 
> -Neeti
> Even my blood says, B positive
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From shameer at ncbs.res.in  Wed Aug  1 01:45:45 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Wed, 1 Aug 2007 11:15:45 +0530 (IST)
Subject: [Bioperl-l] Perl 3D OpenGL
In-Reply-To: <04BCAD9E-CC25-4F0A-85B1-FBA91C64CE7D@uiuc.edu>
References: <152401c7d224$8e2455b0$6e4e7c0a@HPONE>
	<25A5F0A3-1CC3-46B5-8976-A24C451204E7@jays.net>
	<04BCAD9E-CC25-4F0A-85B1-FBA91C64CE7D@uiuc.edu>
Message-ID: <49637.192.168.1.1.1185947145.squirrel@mail.ncbs.res.in>

Hi,
Open-GL/3D contributions are always welcome !!!
What about a Perl-OpenGL/3D implementation of a web-based 3D viewer like Jmol?

 http://jmol.sourceforge.net/

(So we don't need to worry about Java installation and such :) develop it
and deploy it in Perl - eternal happiness !!!)
-- 
SK
>
> On Jul 31, 2007, at 7:00 AM, Jay Hannah wrote:
>
>> On Jul 29, 2007, at 4:08 PM, Grafman Productions wrote:
>>> If this posting is inappropriate, please let me know - my apologies.
>>
>> Not at all. AFAIK this is the perfect place to discuss any
>> contributions you're motivated to make to the BioPerl project.
>>
>>> I recently came across an article on BioPerl, and it occurred to me
>>> that
>>> there might be some need for 3D rendering within your BioPerl
>>> project.
>>>
>>> I released a number of new/updated Perl OpenGL (POGL) modules this
>>> year,
>>> along with benchmarks that demonstrate that it performs comparably
>>> to C.
>>>
>>> If there's a need for 3D features within BioPerl, and if I can be
>>> of any
>>> assistance in helping to add such features, I would enjoy the
>>> opportunity.
>>
>> I know nothing about 3D modeling in biology, nor do I hang out with
>> any protein structure folks, but 3D always sounds sexy. -grin-
>>
>> If you're new to bioinformatics (I certainly am) you might want to
>> read this:
>>
>>    http://en.wikipedia.org/wiki/Protein_structure
>>
>> Because that's probably where your 3D work would be used. Especially
>> note the "Software" section, where you'll find some of the
>> "competition".  :)
>>
>> There's some cool stuff out there. I don't know what all would or
>> wouldn't be time well spent in Perl / BioPerl.
>>
>> HTH,
>>
>> Jay Hannah
>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah
>
> I agree that protein structure is the best place for something like
> this.
>
> It's a wide open area as far as I'm concerned; in fact I would say
> that Bio::Structure is getting pretty dated, so if anyone wants to
> take it over, refactor the code, and so on I don't have a problem.
>
> chris
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From Alicia.Amadoz at uv.es  Wed Aug  1 03:13:11 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Wed, 1 Aug 2007 09:13:11 +0200 (CEST)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
Message-ID: <1664224328amadoz@uv.es>

Hi, I would like to save my hit sequences from a blast result in a fasta
file. I am trying some things but I have problems using Bio::SearchIO
and Bio::SeqIO. Hope anyone could help me with this. Here is my current
code:

# my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
"fasta");
my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
         my $hseq = $hsp->hit_string();
         # $seq_out->write_seq($hseq);
         $seq_out->write_result($hseq);
      }
   }
}

Here the error is,

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: ResultWriter not defined.

I couldn't find any kind of documentation about ResultWriter.
Thanks in advance,
Alicia


From xianranli78 at yahoo.com.cn  Wed Aug  1 04:11:53 2007
From: xianranli78 at yahoo.com.cn (Xianran Li)
Date: Wed, 1 Aug 2007 16:11:53 +0800
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
References: <1664224328amadoz@uv.es>
Message-ID: <001101c7d413$a0d79aa0$ed07a8c0@BGI.LOCAL>

$hsp->hit_string() returns the hit sequence as a plain string rather than a Bio::Seq object. So you should construct an object first; then you can use $seq_out->write_seq($hseq_obj) to write the sequence to a fasta file.

# my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>"fasta");
  my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
         my $hseq = $hsp->hit_string(); 
            $hseq =~ s/-//g; #### remove the gap within the aligment
         my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 
         # $seq_out->write_seq($hseq);
         $seq_out->write_result($hseq_obj);
      }
   }
}

Xianran
----- Original Message ----- 
From: "Alicia Amadoz" <Alicia.Amadoz at uv.es>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, August 01, 2007 3:13 PM
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file


> Hi, I would like to save my hit sequences from a blast result in a fasta
> file. I am trying some things but I have problems using Bio::SearchIO
> and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> code:
> 
> # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> "fasta");
> my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> => "fasta");
> while(my $result = $blast_report->next_result()) {
>    while(my $hit = $result->next_hit()) {
>       while(my $hsp = $hit->next_hsp()) {
>          my $hseq = $hsp->hit_string();
>          # $seq_out->write_seq($hseq);
>          $seq_out->write_result($hseq);
>       }
>    }
> }
> 
> Here the error is,
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: ResultWriter not defined.
> 
> I couldn't find any kind of documentation about ResultWriter.
> Thanks in advance,
> Alicia
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From Alicia.Amadoz at uv.es  Wed Aug  1 06:25:29 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Wed, 1 Aug 2007 12:25:29 +0200 (CEST)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
Message-ID: <5927683277amadoz@uv.es>

Hi, I have tried what you suggested and I get also some errors.
With this code,

my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
	my $hseq = $hsp->hit_string(); 
        $hseq =~ s/-//g; #### remove the gap within the aligment
        my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 
        $seq_out->write_seq($hseq_obj);
      }
   }				
}

I have the following error:

Can't locate object method "write_seq" via package "Bio::SearchIO::fasta"

And using the write_result method with this code,

my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
	my $hseq = $hsp->hit_string(); 
        $hseq =~ s/-//g; #### remove the gap within the aligment
        my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 
        $seq_out->write_result($hseq_obj);
      }
   }				
}

I have again this kind of error:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: ResultWriter not defined.
STACK: Error::throw

So, what else can I try?? Thanks in advance,
Alicia


From neetisomaiya at gmail.com  Wed Aug  1 07:28:40 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Wed, 1 Aug 2007 16:58:40 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
Message-ID: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>

I have downloaded the omim.txt file from NCBI ftp site and I am running my
attached parser on this file, the parser run stops in between with this :-

------------- EXCEPTION  -------------
MSG: a part/organism must be assigned
STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
STACK toplevel parse_omim_original.pl:47

--------------------------------------

What is the reason for this?
Can anyone guide me please.

-- 
-Neeti
Even my blood says, B positive


From jay at jays.net  Wed Aug  1 09:30:50 2007
From: jay at jays.net (Jay Hannah)
Date: Wed, 1 Aug 2007 09:30:50 -0400 (EDT)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <5927683277amadoz@uv.es>
References: <5927683277amadoz@uv.es>
Message-ID: <Pine.LNX.4.64.0708010926370.3555@ferret.jays.net>

On Wed, 1 Aug 2007, Alicia Amadoz wrote:
> Hi, I have tried what you suggested and I get also some errors.
> With this code,
>
> my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> => "fasta");
> while(my $result = $blast_report->next_result()) {
>   while(my $hit = $result->next_hit()) {
>      while(my $hsp = $hit->next_hsp()) {
> 	my $hseq = $hsp->hit_string();
>        $hseq =~ s/-//g; #### remove the gap within the aligment
>        my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
>        $seq_out->write_seq($hseq_obj);
>      }
>   }
> }
>
> I have the following error:
>
> Can't locate object method "write_seq" via package "Bio::SearchIO::fasta"

You don't want to write_seq() to a SearchIO, you want to write_seq() to a 
SeqIO. Try this:

my $seq_out = Bio::SeqIO->new(-file => ">$fasfilename", -format => "fasta");
while(my $result = $blast_report->next_result()) {
    while(my $hit = $result->next_hit()) {
       while(my $hsp = $hit->next_hsp()) {
 	my $hseq = $hsp->hit_string();
         $hseq =~ s/-//g; #### remove the gap within the aligment
         my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
         $seq_out->write_seq($hseq_obj);
       }
    }
}

(Untested.)

HTH,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From cjfields at uiuc.edu  Wed Aug  1 11:02:07 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 1 Aug 2007 10:02:07 -0500
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
Message-ID: <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>

Neeti,

Only post to one list email address, namely the one I'm responding to  
and the one shown here:

http://bioperl.org/mailman/listinfo/bioperl-l

The others are aliases so you essentially posted three times.  As for  
your question: there was no attached script or any additional  
information (bioperl version would have also been nice), so we can't  
help you until we have something more to work with.

chris

On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:

> I have downloaded the omim.txt file from NCBI ftp site and I am  
> running my
> attached parser on this file, the parser run stops in between with  
> this :-
>
> ------------- EXCEPTION  -------------
> MSG: a part/organism must be assigned
> STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> STACK toplevel parse_omim_original.pl:47
>
> --------------------------------------
>
> What is the reason for this?
> Can anyone guide me please.
>
> -- 
> -Neeti
> Even my blood says, B positive
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From torsten.seemann at infotech.monash.edu.au  Wed Aug  1 20:50:06 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Thu, 2 Aug 2007 10:50:06 +1000
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <1664224328amadoz@uv.es>
References: <1664224328amadoz@uv.es>
Message-ID: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>

Alicia,

> Hi, I would like to save my hit sequences from a blast result in a fasta
> file. I am trying some things but I have problems using Bio::SearchIO
> and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> code:
> # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> "fasta");
> my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> => "fasta");
> ...
>        my $hseq = $hsp->hit_string();
>          # $seq_out->write_seq($hseq);
>          $seq_out->write_result($hseq);

You have encountered two common problems for BioPerl beginners:

1. "fasta" means two different things! In SearchIO it refers to the
output format of the "fasta" sequence alignment software. In SeqIO it
refers to a file format that stores just sequences. Confusing, I know.
You need SeqIO and write_seq, not SearchIO and write_result.

2. $hseq is a STRING which has the raw sequence letters in it.
However, the write_seq() method needs a Bio::Seq object (which has
extra details like the name and ID) not a raw string.

The example code Jay Hannah supplied in his reply looks pretty good,
you should try it.

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University


From Alicia.Amadoz at uv.es  Thu Aug  2 03:06:54 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Thu, 2 Aug 2007 09:06:54 +0200 (CEST)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
Message-ID: <3579584634amadoz@uv.es>

Hi, thanks for your help and suggestions. I have tried the example code
of Jay Hannah and it works perfectly. But what I need to save in fasta
format is the whole sequence in the database that is similar to my query
sequence. I don't understand very well the difference between
hit_string() and query_string(): are they the whole sequence that is
similar (for hit_string), a part of the whole sequence, or just the
part that is aligned to my query string?

With the previous code what I have are different sequences in length
with the same id as my query string, so I am not sure that I am doing
what I need to do. Any light on this point?

Thank you very much for your help.
Alicia

> Alicia,
> 
> > Hi, I would like to save my hit sequences from a blast result in a fasta
> > file. I am trying some things but I have problems using Bio::SearchIO
> > and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> > code:
> > # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> > "fasta");
> > my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> > => "fasta");
> > ...
> >        my $hseq = $hsp->hit_string();
> >          # $seq_out->write_seq($hseq);
> >          $seq_out->write_result($hseq);
> 
> You have encountered two common problems for BioPerl beginners:
> 
> 1. "fasta" means two different things! In SearchIO it refers to the
> output format of the "fasta" sequence alignment software. In SeqIO it
> refers to a file format that stores just sequences. Confusing, I know.
> You need SeqIO and write_seq, not SearchIO and write_result.
> 
> 2. $hseq is a STRING which has the raw sequence letters in it.
> However, the write_seq() method needs a Bio::Seq object (which has
> extra details like the name and ID) not a raw string.
> 
> The example code Jay Hannah supplied in his reply looks pretty good,
> you should try it.
> 
> -- 
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> 
> 


From xianranli78 at yahoo.com.cn  Thu Aug  2 04:56:04 2007
From: xianranli78 at yahoo.com.cn (Xianran Li)
Date: Thu, 2 Aug 2007 16:56:04 +0800
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
	<3579584634amadoz@uv.es>
Message-ID: <003701c7d4e2$f7a34bc0$ed07a8c0@BGI.LOCAL>

----- Original Message ----- 
From: "Alicia Amadoz" <Alicia.Amadoz at uv.es>
To: "Torsten Seemann" <torsten.seemann at infotech.monash.edu.au>; <bioperl-l at bioperl.org>
Cc: <jay at jays.net>
Sent: Thursday, August 02, 2007 3:06 PM
Subject: Re: [Bioperl-l] trying to save blast hit sequences to fasta file


> Hi, thanks for your help and suggestions. I have tried the example code
> of Jay Hannah and it works perfectly. But what I need to save in fasta
> format is the whole sequence in the database that is similar to my query
> sequence. I don't understand very well the difference between
> hit_string() and query_string(), are they the whole sequence that is
> similiar (about hit_string), a part of the whole sequence or just the
> part that is aligned to my query string? 

hit_string() returns the aligned portion of the subject sequence in your database, and query_string() returns the aligned portion of the query. The two will be identical unless there are mismatches and/or gaps within the alignment.
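
A toy illustration of that difference, using two hypothetical aligned
strings of the kind query_string() and hit_string() return (the
sequences and the sub names are made up, and no BioPerl is needed to
run it). Both strings have one character per alignment column, so they
can be compared position by position; removing the '-' characters
gives back only the aligned stretch, not the full database sequence.

```perl
use strict;
use warnings;

# Hypothetical aligned strings for one HSP (same length, '-' marks a gap)
my $query_aln = 'ATGCC-GTAA';
my $hit_aln   = 'ATGACTGT-A';

# List every alignment column where query and hit differ.
sub alignment_differences {
    my ($q, $h) = @_;
    die "aligned strings must have equal length\n"
        unless length($q) == length($h);
    my @diffs;
    for my $i (0 .. length($q) - 1) {
        my ($qc, $hc) = (substr($q, $i, 1), substr($h, $i, 1));
        next if $qc eq $hc;
        my $kind = ($qc eq '-' || $hc eq '-') ? 'gap' : 'mismatch';
        push @diffs, { column => $i + 1, query => $qc, hit => $hc, kind => $kind };
    }
    return @diffs;
}

# Remove gap characters to recover the ungapped aligned residues.
sub strip_gaps {
    my ($aligned) = @_;
    (my $ungapped = $aligned) =~ tr/-//d;
    return $ungapped;
}

for my $d (alignment_differences($query_aln, $hit_aln)) {
    printf "column %d: query=%s hit=%s (%s)\n",
        $d->{column}, $d->{query}, $d->{hit}, $d->{kind};
}
print "ungapped hit: ", strip_gaps($hit_aln), "\n";
```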

> 
> With the previous code what I have are different sequences in length
> with the same id as my query string, so I am not sure that I am doing
> what I need to do. Any light on this point?

Did you specify the $id before 
  
my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 

If you didn't, then all the sequences retrieved will get the same id. The following is a simple way to avoid this problem.

my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" => "fasta");
my $i = 0;
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
         $i++;
         my $hseq = $hsp->hit_string();
         $hseq =~ s/-//g; #### remove the gaps within the alignment
         my $id = $i; ###### specify a unique id for each HSP
         my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
         # write_seq() is the Bio::SeqIO method; write_result() belongs to Bio::SearchIO
         $seq_out->write_seq($hseq_obj);
      }
   }
}
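
A running counter gives unique ids but no longer says which subject
each sequence came from. One possible alternative, sketched here in
plain Perl so it runs without BioPerl, is to build the id from the
subject name plus the hit coordinates. The sub name hsp_fasta_record
and the example values are hypothetical; inside the loop above the
arguments would presumably come from $hit->name, $hsp->start('hit')
and $hsp->end('hit').

```perl
use strict;
use warnings;

# Build one FASTA record for an HSP: id is "<subject>/<start>-<end>",
# gaps are removed, and the sequence is wrapped at $width characters.
sub hsp_fasta_record {
    my ($hit_name, $start, $end, $aligned, $width) = @_;
    $width ||= 60;
    (my $seq = $aligned) =~ tr/-//d;               # drop alignment gaps
    my $id   = sprintf '%s/%d-%d', $hit_name, $start, $end;
    my $body = join "\n", unpack("(A$width)*", $seq);
    return ">$id\n$body\n";
}

# Example values are invented for illustration only
print hsp_fasta_record('subject_1', 101, 109, 'ATGACTGT-A');
```

With BioPerl, the same id string could instead be passed as -display_id
to Bio::Seq->new and the object written with write_seq().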


Xianran 

> 
> Thank you very much for your help.
> Alicia
> 
> > Alicia,
> > 
> > > Hi, I would like to save my hit sequences from a blast result in a fasta
> > > file. I am trying some things but I have problems using Bio::SearchIO
> > > and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> > > code:
> > > # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> > > "fasta");
> > > my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> > > => "fasta");
> > > ...
> > >        my $hseq = $hsp->hit_string();
> > >          # $seq_out->write_seq($hseq);
> > >          $seq_out->write_result($hseq);
> > 
> > You have encountered two common problems for BioPerl beginners:
> > 
> > 1. "fasta" means two different things! In SearchIO it refers to the
> > output format of the "fasta" sequence alignment software. In SeqIO it
> > refers to a file format that stores just sequences. Confusing, I know.
> > You need SeqIO and write_seq, not SearchIO and write_result.
> > 
> > 2. $hseq is a STRING which has the raw sequence letters in it.
> > However, the write_seq() method needs a Bio::Seq object (which has
> > extra details like the name and ID) not a raw string.
> > 
> > The example code Jay Hannah supplied in his reply looks pretty good,
> > you should try it.
> > 
> > -- 
> > --Torsten Seemann
> > --Victorian Bioinformatics Consortium, Monash University
> > 
> > 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From neetisomaiya at gmail.com  Thu Aug  2 02:20:33 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 2 Aug 2007 11:50:33 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
Message-ID: <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>

Hi,

The script is attached with this mail.
I am using bioperl-1.4.

Regards,
Neeti.

On 8/1/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> Neeti,
>
> Only post to one list email address, namely the one I'm responding to
> and the one shown here:
>
> http://bioperl.org/mailman/listinfo/bioperl-l
>
> The others are aliases so you essentially posted three times.  As for
> your question: there was no attached script or any additional
> information (bioperl version would have also been nice), so we can't
> help you until we have something more to work with.
>
> chris
>
> On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:
>
> > I have downloaded the omim.txt file from NCBI ftp site and I am
> > running my
> > attached parser on this file, the parser run stops in between with
> > this :-
> >
> > ------------- EXCEPTION  -------------
> > MSG: a part/organism must be assigned
> > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> > STACK toplevel parse_omim_original.pl:47
> >
> > --------------------------------------
> >
> > What is the reason for this?
> > Can anyone guide me please.
> >
> > --
> > -Neeti
> > Even my blood says, B positive
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


-- 
-Neeti
Even my blood says, B positive
-------------- next part --------------
A non-text attachment was scrubbed...
Name: parse_omim_original.pl
Type: application/x-perl
Size: 5998 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070802/fbbee8db/attachment-0003.bin>

From neetisomaiya at gmail.com  Thu Aug  2 09:00:33 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 2 Aug 2007 18:30:33 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
Message-ID: <764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com>

Also,
According to the following documentation, data can be fetched from the
genemap file as well:
http://search.cpan.org/~birney/bioperl-1.2.3/Bio/Phenotype/OMIM/OMIMparser.pm

But when I do so exactly as described at the above link, I get no
genemap data. Some OMIM ids are present in both the omim.txt and genemap
files, and for those entries parsing should return data from both files,
but it does not.

For example, when running the attached script for OMIM id 100790, I get
all the data from omim.txt, but the cytoposition, gene symbol, etc. from
genemap are missing, even though they are present in the genemap file.

Please help me find what could be going wrong.



-- 
-Neeti
Even my blood says, B positive
-------------- next part --------------
A non-text attachment was scrubbed...
Name: parse_omim_original.pl
Type: application/x-perl
Size: 8750 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070802/6bdb009c/attachment-0003.bin>

From cjfields at uiuc.edu  Thu Aug  2 13:05:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 2 Aug 2007 12:05:55 -0500
Subject: [Bioperl-l] Fwd: nonstop repeated output from Remote_blast with xml
References: <38B65B2C-A36D-41FB-83C9-7D7B55156CCD@uiuc.edu>
Message-ID: <EF284983-9A37-4F0F-BF92-04C7804275A0@uiuc.edu>

For archiving purposes; of course I forgot to cc the list!

-c

Begin forwarded message:

> From: Chris Fields <cjfields at uiuc.edu>
> Date: August 2, 2007 12:04:59 PM CDT
> To: gyang at plantbio.uga.edu
> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
> with xml
>
> Guojun,
>
> Make sure to keep this on the mail list for archiving purposes.
>
> It could be that the RID is not being removed properly (if it isn't  
> removed then you will repeatedly retrieve your BLAST report).  The  
> new error you are seeing may be coming from whatever XML::SAX  
> backend parser is being used (XML::SAX::ExpatXS, XML::SAX::Expat,  
> etc); it doesn't look bioperl-related and there is an eval which  
> catches this stuff in SearchIO::blastxml.  Does text parsing work?
>
> Could you directly send me your script or add it to a new bug  
> report as an attachment?
>
> http://www.bioperl.org/wiki/Bugs
>
> chris
>
> On Aug 2, 2007, at 11:07 AM, Guojun Yang wrote:
>
>> Hi, Chris,
>> I installed the latest version of bioperl. In addition to the  
>> repeated output problem, there are new problems with parsing:
>>
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>  No close tag marker [Ln: 4126, Col: 0]
>>
>> ---------------------------------------------------
>>
>> Would you please kindly give me a hint on this,
>> Thanks a lot,
>> Guojun
>>
>>
>> ----- Original Message -----
>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>> To: gyang at plantbio.uga.edu
>> Cc: bioperl-l List [mailto:bioperl-l at lists.open-bio.org]
>> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
>> with xml
>>
>>
>>> Make sure to keep responses on the mail list.
>>>> You might want to run a full install, just in case.  If I remember
>>> correctly Sendu made some changes a while back in the BLAST-related
>>> modules which may be related to this.  At the very least install/
>>> upgrade all modules in Bio::Tools::Run.
>>>> chris
>>>> On Jul 31, 2007, at 9:40 AM, Guojun Yang wrote:
>>>>> Thanks, Chris,
>>>> But when I replaced the old RemoteBlast.pm with the new one, I got
>>>> "can't locate the object method "retrieve_parameter"". Does this
>>>> mean I need to install something else?
>>>> Guojun
>>>>
>>>> ----- Original Message -----
>>>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>>>> To: gyang at plantbio.uga.edu
>>>> Cc: bioperl-l at bioperl.org
>>>> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast
>>>> with xml
>>>>
>>>>
>>>>>> On Jul 30, 2007, at 3:58 PM, Guojun Yang wrote:
>>>>>>> I am running remoteblast and using readmethod "xml", I  
>>>>>>> noticed that
>>>>>> it is printing the output repeatedly nonstop. It's like in a  
>>>>>> loop.
>>>>>> Did anybody notice this before? Can anybody help me getting  
>>>>>> out of
>>>>>> this?
>>>>>> Thanks a lot,
>>>>>>
>>>>>>
>>>>>> Guojun Yang
>>>>>> University of Georgia
>>>>>> Not seeing that using bioperl-live; you may need to update
>>>>> RemoteBlast.pm as this sounds similar to an issue that popped up
>>>>> earlier in the spring.
>>>>>> chris
>>>>>
>>>> Christopher Fields
>>> Postdoctoral Researcher
>>> Lab of Dr. Robert Switzer
>>> Dept of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>>>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
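
A minimal sketch of the retrieval loop the forwarded message refers to,
assuming a factory object that behaves like Bio::Tools::Run::RemoteBlast
(each_rid, retrieve_blast, remove_rid). The helper name drain_rids, the
$max_polls cap and the $on_result callback are illustrative assumptions
and not part of BioPerl. The point it tries to show: each RID is removed
once its report has been handled or has failed, because a RID that stays
registered is returned again by each_rid and its report is retrieved again.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): poll a RemoteBlast-like
# factory until no RIDs are left or $max_polls passes have been made.
# $factory must provide each_rid(), retrieve_blast($rid), remove_rid($rid).
sub drain_rids {
    my ($factory, $on_result, $max_polls, $wait) = @_;
    $max_polls = 100 unless defined $max_polls;   # assumed safety cap
    $wait      = 5   unless defined $wait;        # seconds between passes
    my $handled = 0;

    for my $pass (1 .. $max_polls) {
        my @rids = $factory->each_rid;
        last unless @rids;                        # nothing left to fetch

        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if (ref $rc) {
                # A report object came back: process it, then drop the RID
                # so it is not retrieved again on the next pass.
                $on_result->($rid, $rc);
                $factory->remove_rid($rid);
                $handled++;
            }
            elsif ($rc < 0) {
                # Negative return value: retrieval failed, give up on it.
                $factory->remove_rid($rid);
            }
            # Otherwise ($rc == 0) the report is not ready yet; keep the RID.
        }

        my @left = $factory->each_rid;
        sleep $wait if $wait && @left;
    }
    return $handled;
}
```

With BioPerl installed, $factory would be the RemoteBlast object after
submit_blast(), and $on_result would typically call next_result() on the
object it is given.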


From cjfields at uiuc.edu  Thu Aug  2 13:51:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 2 Aug 2007 12:51:27 -0500
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
	<764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com>
Message-ID: <921F31D6-3CA9-483A-8AFF-B3555E9768C4@uiuc.edu>

Neeti,

The genemap wasn't being loaded in all cases. I don't know what the  
reasoning for that was, but it is fixed in CVS now  
(Bio::Phenotype::OMIM::OMIMparser, specifically).  I would recommend  
that you do a full upgrade to at least bioperl 1.5.2 before  
using this fix; I can't guarantee it will work with bioperl 1.4.

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug  2 14:16:56 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 2 Aug 2007 13:16:56 -0500
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
	<764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com>
Message-ID: <9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu>

Neeti,

Keep this on the list please.  I am unable to reproduce this using  
your script with or without using the optional genemap file.  You  
really should upgrade bioperl to 1.5.2 and try the fix first; this is  
something that may have been fixed post-bioperl 1.4.

chris

On Aug 2, 2007, at 12:57 PM, neeti somaiya wrote:

> Waiting for your reply on the exception I had mentioned in my first  
> mail.
>
> Thanks.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From torsten.seemann at infotech.monash.edu.au  Thu Aug  2 21:03:36 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Fri, 3 Aug 2007 11:03:36 +1000
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <3579584634amadoz@uv.es>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
	<3579584634amadoz@uv.es>
Message-ID: <a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>

Alicia,

> Hi, thanks for your help and suggestions. I have tried the example code
> of Jay Hannah and it works perfectly. But what I need to save in fasta
> format is the whole sequence in the database that is similar to my query
> sequence.

Unfortunately the hit_string is only that part of the sequence in the
database that was similar enough to your query sequence. The BLAST
report does not have the whole hit sequence in it, only the locally
aligned part. SearchIO can only give you what it can get from the
BLAST report.

You will need to record the IDs of the database sequences you are
interested in, and write extra code to retrieve the WHOLE hit sequence
from your database.
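
A rough sketch of the first half of that suggestion (recording the hit
IDs), assuming a result object that behaves like the ones returned by
Bio::SearchIO, i.e. next_hit() yields hit objects with a name() method.
The helper name collect_hit_ids is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): collect the unique hit names
# from one result, in the order they first appear.
# $result must provide next_hit(); each hit must provide name().
sub collect_hit_ids {
    my ($result) = @_;
    my (%seen, @ids);
    while (my $hit = $result->next_hit) {
        my $id = $hit->name;
        next unless defined $id && length $id;
        push @ids, $id unless $seen{$id}++;   # skip IDs already recorded
    }
    return @ids;
}

# Possible use with BioPerl installed (file name is an assumption):
#   my $in = Bio::SearchIO->new(-format => 'blast', -file => 'report.bls');
#   while (my $result = $in->next_result) {
#       push @all_ids, collect_hit_ids($result);
#   }
```

The IDs collected this way are only the lookup keys; the whole sequences
still have to be fetched from the database in a separate step.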

--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University


From neetisomaiya at gmail.com  Fri Aug  3 01:46:32 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 3 Aug 2007 11:16:32 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
	<764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com>
	<9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu>
Message-ID: <764978cf0708022246v98abed6ue41233f6b27c5674@mail.gmail.com>

Hi,

Thanks a lot.
The exception no longer occurs after upgrading to bioperl-1.5.2.
But the genemap data is still a problem.

You mentioned that I should take Bio::Phenotype::OMIM::OMIMparser,
specifically, from CVS. Where exactly can I get it?

Thanks,
Neeti.



-- 
-Neeti
Even my blood says, B positive


From jay at jays.net  Fri Aug  3 10:23:11 2007
From: jay at jays.net (Jay Hannah)
Date: Fri, 03 Aug 2007 09:23:11 -0500
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>	<3579584634amadoz@uv.es>
	<a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>
Message-ID: <46B33A4F.2010403@jays.net>

Torsten Seemann wrote:
>> Hi, thanks for your help and suggestions. I have tried the example code
>> of Jay Hannah and it works perfectly. But what I need to save in fasta
>> format is the whole sequence in the database that is similar to my query
>> sequence.
>>     
>
> Unfortunately the hit_string is only that part of the sequence in the
> database that was similar enough to your query sequence. The BLAST
> report does not have the whole hit sequence in it, only the locally
> aligned part. SearchIO can only give you what it can get from the
> BLAST report.
>
> You will need to record the IDs of the database sequences you are
> interested in, and write extra code to retrieve the WHOLE hit sequence
> from your database.
>   
This probably won't help, but my (extremely poorly documented) 
"SeqLab.net" project

   http://seqlab.net

is a framework that sits on top of BioPerl. The current cross_blast() 
stuff (http://seqlab.net/pods2html/tutorial.html) does this:

   GenBank -> FASTA -> formatdb -> "stand alone" NCBI BLAST -> reports

When the reports run they have simultaneous access to both the original 
Bio::Seq objects from the GenBank file and the Bio::SearchIO objects 
from the BLAST results, so it can kick out reports that understand the 
relationships between (and details of) the original sequences and HSPs 
simultaneously...

If you get stuck trying to do what Torsten suggests and have questions 
about SeqLab.net you could open a ticket with my group

   http://clab.ist.unomaha.edu/CLAB/index.php/RT

and I'll try to help.

Cheers,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From mbasu at mail.nih.gov  Fri Aug  3 14:55:57 2007
From: mbasu at mail.nih.gov (Malay)
Date: Fri, 03 Aug 2007 14:55:57 -0400
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <46B33A4F.2010403@jays.net>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>	<3579584634amadoz@uv.es>	<a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>
	<46B33A4F.2010403@jays.net>
Message-ID: <46B37A3D.4070606@mail.nih.gov>

Jay Hannah wrote:
> Torsten Seemann wrote:
>>> Hi, thanks for your help and suggestions. I have tried the example code
>>> of Jay Hannah and it works perfectly. But what I need to save in fasta
>>> format is the whole sequence in the database that is similar to my query
>>> sequence.
>>>     
>> Unfortunately the hit_string is only that part of the sequence in the
>> database that was similar enough to your query sequence. The BLAST
>> report does not have the whole hit sequence in it, only the locally
>> aligned part. SearchIO can only give you what it can get from the
>> BLAST report.
>>
>> You will need to record the IDs of the database sequences you are
>> interested in, and write extra code to retrieve the WHOLE hit sequence
>> from your database.

I am not sure whether it has already been suggested, but you can 
retrieve the full sequence from any BLAST database using "fastacmd", 
which is part of the NCBI toolbox. Parse the "description" string from 
the BLAST report and run:

fastacmd -d <database file> -s <description>

where the argument of -s can be any string that uniquely identifies a
sequence in the database.
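
A small sketch of how the command above might be driven from Perl for a
list of hit IDs. The helper names fastacmd_args and fetch_full_hits are
made up for illustration; only the -d and -s options shown above are
used, and fastacmd is assumed to be on the PATH.

```perl
use strict;
use warnings;

# Hypothetical helper: build the fastacmd argument list for one hit ID.
sub fastacmd_args {
    my ($db, $id) = @_;
    die "database name required\n" unless defined $db && length $db;
    die "sequence ID required\n"   unless defined $id && length $id;
    return ('fastacmd', '-d', $db, '-s', $id);
}

# Run fastacmd once per ID and copy its FASTA output to $out_fh.
# The list form of open avoids the shell, so IDs containing '|'
# (e.g. gi|123|gb|...) are passed through unchanged.
sub fetch_full_hits {
    my ($db, $out_fh, @ids) = @_;
    for my $id (@ids) {
        my @cmd = fastacmd_args($db, $id);
        open(my $pipe, '-|', @cmd) or die "cannot run fastacmd: $!\n";
        print {$out_fh} $_ while <$pipe>;
        close $pipe or warn "fastacmd failed for $id\n";
    }
    return;
}
```

Calling fetch_full_hits('mydb', \*STDOUT, @ids) would then print one
FASTA record per ID, provided the database was built so that those IDs
can be looked up.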

-Malay


From cjfields at uiuc.edu  Mon Aug  6 13:49:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 6 Aug 2007 12:49:08 -0500
Subject: [Bioperl-l] Fwd: nonstop repeated output from Remote_blast with xml
References: <1FE846F1-CB20-41FD-929D-8D14E5695B59@uiuc.edu>
Message-ID: <B97BD1F9-05FE-4225-810F-5EA10AB2728B@uiuc.edu>

Wasn't paying attention! Forwarding this to the mail list in case  
anyone wanted the answer...

chris

Begin forwarded message:

> From: Chris Fields <cjfields at uiuc.edu>
> Date: August 6, 2007 12:10:37 PM CDT
> To: gyang at plantbio.uga.edu
> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
> with xml
>
> Guojun,
>
> Sorry about the long wait on this.  At this time RemoteBlast  
> doesn't automatically set the retrieval header to return XML when  
> setting the -reporttype parameter to 'xml' or 'blastxml'.  The  
> default is text output, so you are retrieving regular text BLAST  
> reports instead of XML, hence the reported XML parser failure (BTW,  
> you can see the plain text being returned in the debugging  
> output).  I'll look into a fix for that.
>
> In the meantime, you can do this manually by setting the following  
> key prior to submitting the BLAST run:
>
> $Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML';
>
> When I run your example with the above line added it works fine.   
> As an additional note, the CVS version of Bio::SearchIO::blastxml  
> now supports newer versions of XML::SAX::Expat; the problem there  
> was a bug in XML::SAX::Expat that killed parsing.
>
> Additional rant before I go back to work (you can skip this if  
> needed):  RemoteBlast is one of the most used modules in BioPerl,  
> but it is also the most problematic as NCBI keeps changing things  
> on their end (BLAST text output, prompts when returning RIDs,  
> etc).  It frankly isn't as well-maintained as we would like; this  
> is partly due to plans we have (but unfortunately haven't acted  
> upon) to merge RemoteBlast/StandAloneBlast so they have a similar  
> API and can be used for any BLAST program, including netblast.  If  
> someone wants to take this on at some point then they are more than  
> welcome!
>
> chris
>
> On Aug 3, 2007, at 10:08 AM, Guojun Yang wrote:
>
>> Thanks, Chris,
>> Attached are my script and the query file. I suspected that we  
>> need to add a "remove RID" step to the code. I tried removing the  
>> RID at the end of the parsing code, but it seemed to be removed  
>> even before the output was processed. I installed  
>> XML::SAX::Expat, and the error became "XML::SAX::Expat is no longer  
>> supported...", so I installed ExpatXS, and the error message became:
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>  no element found at line 4126, column 1, byte 186628 at /usr/lib/ 
>> perl5/site_perl/5.8.3/Bio/SearchIO/blastxml.pm line 304
>>
>>
>> Would you please try the script with the query file and the  
>> following input parameters, to see what happens on your machine (I  
>> want to make sure there is no installation problem on my machine)?  
>> The search subroutine is where the BLAST is performed; I did not  
>> include a remove RID call there. Thanks again!
>>
>> master:/home/guojun # perl makcgi07.txt
>> Query file name:
>> kiddo.txt
>> Select a function: 1.member;2.RES; 3, long; 4.Anchor; 5.Associator.
>> 1
>> Type in the name of an organism, e.g. Oryza sativa.
>> Oryza sativa
>> Type in the organism to search for RES:
>> Your E_value:
>> 0.001
>> Size limit for ancestor element:
>> 4000
>> Flanking size for retrieved members:
>> 50
>> Tolerance for end mismatch:
>> 0
>>
>> <makcgi07.txt>
>> <kiddo.txt>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Alicia.Amadoz at uv.es  Tue Aug  7 04:20:12 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Tue, 7 Aug 2007 10:20:12 +0200 (CEST)
Subject: [Bioperl-l] error using standaloneblast through webserver, part II
Message-ID: <1387114447amadoz@uv.es>

Hi again, I'm trying to run a BioPerl script on Linux with
StandAloneBlast from a web server, but I now get another error. It is the
following:

[blastall] WARNING: Unable to open outfile_allseq.nin
[blastall] WARNING: 101: Unable to open outfile_allseq.nin

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: blastall call crashed: 256 /usr/local/blast-2.2.16/bin/blastall -d
 "/outfile_allseq"  -e  10  -i 
/tmp//alicia_2007_07_20/result_search_alicia_12_03_40.fasta  -o 
/tmp//alicia_2007_08_07/101_result_Local_Blast_alicia_09_56_47.out  -p 
blastn

My perl code is the following:

my $blastdatadir = $ARGV[9]; # here the value of the variable is ok

BEGIN {
	$ENV{PATH} .= ':/usr/local/blast-2.2.16/bin'; # path where blastall bin is located
	$ENV{BLASTDIR} = '/usr/local/blast-2.2.16/bin'; # path where blastall bin is located
	$ENV{BLASTDATADIR} = $blastdatadir; # path where formatted local databases are located; here the value is empty
}

I have also tried without BEGIN { }, so that the $ENV variable gets the correct
value of $blastdatadir, but I get the same error. I have checked that formatdb
was run and that all the files are correct.

Any idea or help to solve this problem? 

Thanks in advance. Regards,
Alicia
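
The empty value inside BEGIN { } is consistent with how Perl orders
compilation and execution: a BEGIN block runs while the file is still
being compiled, before the assignment on the earlier line has executed.
The following is only a minimal sketch of that ordering; the string
'set at run time' is a stand-in for $ARGV[9], and @order is a
hypothetical helper used to record what each phase sees.

```perl
use strict;
use warnings;

our @order;    # hypothetical log of what each phase sees

my $blastdatadir = 'set at run time';    # stand-in for $ARGV[9]

BEGIN {
    # Runs during compilation: the variable is declared by now,
    # but its assignment has not been executed yet.
    push @order,
        'BEGIN sees: ' . (defined $blastdatadir ? $blastdatadir : 'undef');
}

# Runs at run time, after the assignment above.
push @order, 'run time sees: ' . $blastdatadir;

print "$_\n" for @order;
```

Under that assumption, either doing the assignment inside the BEGIN
block or setting %ENV at run time (before the BLAST factory is created)
should give $ENV{BLASTDATADIR} the intended value.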


From mheusel at gmail.com  Tue Aug  7 04:45:33 2007
From: mheusel at gmail.com (Martin Heusel)
Date: Tue, 7 Aug 2007 10:45:33 +0200
Subject: [Bioperl-l] error using standaloneblast through webserver,
	part II
In-Reply-To: <1387114447amadoz@uv.es>
References: <1387114447amadoz@uv.es>
Message-ID: <6127fc200708070145keb750acycce8a43edd0f724d@mail.gmail.com>

> MSG: blastall call crashed: 256 /usr/local/blast-2.2.16/bin/blastall -d
>  "/outfile_allseq"  -e  10  -i

I'm not familiar with all this, but it seems your script tries to
write in the system's root directory /

-d "/outfile_allseq"

that is normally not writable for normal users

is this the problem?

cu

Martin

-- 
+ openid: http://mhe.myopenid.com/
+ gpg   : http://user.cs.tu-berlin.de/~mhe/pub/martin.gpg
+ gpg fp: 4844 71B5 B4E4 3892 69CA  6EA5 6598 61BE 0021 94A2


From Alicia.Amadoz at uv.es  Tue Aug  7 07:08:12 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Tue, 7 Aug 2007 13:08:12 +0200 (CEST)
Subject: [Bioperl-l] error using standaloneblast through webserver,
	part II
In-Reply-To: <1387114447amadoz@uv.es>
References: <1387114447amadoz@uv.es>
Message-ID: <5825345446amadoz@uv.es>

Hi, I thought that setting $ENV{BLASTDATADIR} was enough for
StandAloneBlast to find the database. I have changed it, setting the
-database option of the params to path_to_database+name_of_database, and it
works OK.

Thanks for your help. Regards,
Alicia
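
The fix described above (passing the full path as the database name)
could be sketched roughly as follows. The helper names blast_db_path and
missing_index_files are hypothetical, and the default extension list
assumes a nucleotide database built by formatdb (.nhr, .nin, .nsq); the
warning in the original report mentioned the .nin file.

```perl
use strict;
use warnings;
use File::Spec;

# Join the database directory and the database name into one path.
sub blast_db_path {
    my ($dir, $name) = @_;
    return File::Spec->catfile($dir, $name);
}

# Return the index-file extensions that are missing for a database path.
# Defaults to the nucleotide extensions (an assumption; pass others if needed).
sub missing_index_files {
    my ($db_path, @extensions) = @_;
    @extensions = qw(nhr nin nsq) unless @extensions;
    return grep { !-e "$db_path.$_" } @extensions;
}

# Usage idea (not executed here):
#   my $db = blast_db_path($blastdatadir, 'outfile_allseq');
#   die "missing: @{[ missing_index_files($db) ]}" if missing_index_files($db);
#   ... then pass  -database => $db  when creating the StandAloneBlast factory.
```

Checking for the index files first gives a clearer message than the
"Unable to open ... .nin" warning from blastall.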


From jason at bioperl.org  Wed Aug  8 15:16:07 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 8 Aug 2007 14:16:07 -0500
Subject: [Bioperl-l] Fwd: Question regarding Bio::GenBank module
References: <7a93dad10708081148w74dfede3sd05799a651ebcb80@mail.gmail.com>
Message-ID: <24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>

Young -
I'm forwarding to the list for more help.

Begin forwarded message:

> From: "Young Song" <youngcsong at gmail.com>
> Date: August 8, 2007 1:48:29 PM CDT
> To: jason at bioperl.org
> Subject: Question regarding Bio::GenBank module
>
> Hello,
>
>    I am currently located in Vancouver, Canada, and I have a question
> about the Bio::GenBank module for BioPerl.  I read in the online
> documentation for the module
> (http://search.cpan.org/dist/bioperl/Bio/DB/GenBank.pm) that we are not
> supposed to spam NCBI with multiple requests, which led me to think
> about the script that I wrote.  I am trying to extract some information
> based on the FASTA protein files located in NCBI's database.  The
> script reads each '.faa' (FASTA protein) file, takes the 'gi' ID
> for each sequence, and extracts several pieces of information, which
> look like the following output (please note that there are a lot more
> gi's than I am showing you here):
>
> 10954456
> accesstion number: NP_047185.1
> dbsource: GenBank: NC_001911.1
> NP_047185.1
> starting pos. at genomic seq: 1488
> ending pos. at genomic seq: 1991
> strand: +
> description: putative membrane-associated protein
> organism: Buchnera aphidicola
> MERIIEKAIYASRWLMFPVYVGLSFGFILLTLKFFQQIVFIIPDILAMSESGLVLVVLSLIDIALVGGLL 
> VMVMFLGYENFISKMDIQDNEKRLGWMGTMDVNSIKNKVASSIVAISSVHLLRLFMEAEKILDDKIMLCV 
> IIHLTFVLSAFGMAYIDKMSKKKHVLH
> ************************************************
> 10954457
> accesstion number: NP_047186.1
> dbsource: GenBank: NC_001911.1
> NP_047186.1
> starting pos. at genomic seq: 2158
> ending pos. at genomic seq: 2913
> strand: +
> description: putative replication-associated protein
> organism: Buchnera aphidicola
> MPRKNYIYNPKPVFNPPKNKRKISTFICYAMKKASEIDVARSNLNYTLLLIDPKTGNILPRFRRLNEHRA 
> CAMRAIVLAMLYYFDIHSNLVEASIEKLADECGLSTFSDSGNKSITRVSRLINDFLEPMGFVRCKKIKRK 
> FVSNYIPKKIFLTPMFFMLFNISQSKINRYLFKSKKMSQNLKITEKKIFISFSDIKVMSRLDEKSIRKKI 
> LNALINYYTASELTKIGPKGLKKRIDIEYNNLCKLFKKIKK
>
>
>
>   Because I am dealing with a lot of sequences here, I am a little bit
> worried that I may be causing harm to the NCBI server.  I just need to
> know whether this is the right approach to take, or whether there is
> another solution (I am a little bit confused about what you mean by
> "multiple requests" in the documentation).
> Your reply would be very much appreciated.  Thank you in advance.
>
>   Sincerely,
>
>      Young C. Song

--
Jason Stajich
jason at bioperl.org


From cjfields at uiuc.edu  Wed Aug  8 15:41:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 8 Aug 2007 14:41:34 -0500
Subject: [Bioperl-l] Fwd: Question regarding Bio::GenBank module
In-Reply-To: <24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>
References: <7a93dad10708081148w74dfede3sd05799a651ebcb80@mail.gmail.com>
	<24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>
Message-ID: <FD7D1694-604A-4C8B-AC47-B31F306EA5B0@uiuc.edu>

NCBI eUtils (which Bio::DB::GenBank uses to get sequence data) has a
list of user requirements:

http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements

The most important one is the 3-second delay between requests, but
the module already implements that policy, so there isn't a real issue
unless you deliberately mess with that setting.  NCBI has been known
to block IPs that don't follow that particular rule.  Also, if you
are planning to make hundreds of requests, you should consider running
the script during low-traffic times, as indicated in the above link.
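
For code that talks to eUtils without going through Bio::DB::GenBank, the
delay has to be enforced by hand. The following is only a sketch of one
way to do that, not how the module itself implements it; make_throttle
and its injectable clock/sleep arguments are hypothetical names added
here so the logic can be checked without really waiting.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Build a closure that, when called before each request, pauses until at
# least $min_interval seconds have passed since the previous call.
# $clock and $nap are optional and exist only to make testing possible.
sub make_throttle {
    my ($min_interval, $clock, $nap) = @_;
    $clock ||= sub { time() };
    $nap   ||= sub { sleep($_[0]) };
    my $last;    # time of the previous request, undef before the first one

    return sub {
        my $waited = 0;
        if (defined $last) {
            my $remaining = $min_interval - ($clock->() - $last);
            if ($remaining > 0) {
                $nap->($remaining);
                $waited = $remaining;
            }
        }
        $last = $clock->();
        return $waited;    # seconds spent waiting, for logging
    };
}

# Usage idea (not executed here):
#   my $wait = make_throttle(3);
#   for my $gi (@gi_list) { $wait->(); ... fetch one record ... }
```

The first call returns immediately; every later call waits only for
whatever part of the interval has not already elapsed.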

chris

On Aug 8, 2007, at 2:16 PM, Jason Stajich wrote:

> Young -
> I'm forwarding to the list for more help.
>
> Begin forwarded message:
>
>> From: "Young Song" <youngcsong at gmail.com>
>> Date: August 8, 2007 1:48:29 PM CDT
>> To: jason at bioperl.org
>> Subject: Question regarding Bio::GenBank module
>>
>> Hello,
>>
>>    I am currently located in Vancouver, Canada, and I have a question
>> about the Bio::GenBank module for BioPerl.  I read in the online
>> documentation for the module
>> (http://search.cpan.org/dist/bioperl/Bio/DB/GenBank.pm) that we are not
>> supposed to spam NCBI with multiple requests, which led me to think
>> about the script that I wrote.  I am trying to extract some information
>> based on the FASTA protein files located in NCBI's database.  The
>> script reads each '.faa' (FASTA protein) file, takes the 'gi' ID
>> for each sequence, and extracts several pieces of information, which
>> look like the following output (please note that there are a lot more
>> gi's than I am showing you here):
>>
>> 10954456
>> accesstion number: NP_047185.1
>> dbsource: GenBank: NC_001911.1
>> NP_047185.1
>> starting pos. at genomic seq: 1488
>> ending pos. at genomic seq: 1991
>> strand: +
>> description: putative membrane-associated protein
>> organism: Buchnera aphidicola
>> MERIIEKAIYASRWLMFPVYVGLSFGFILLTLKFFQQIVFIIPDILAMSESGLVLVVLSLIDIALVGGLL
>> VMVMFLGYENFISKMDIQDNEKRLGWMGTMDVNSIKNKVASSIVAISSVHLLRLFMEAEKILDDKIMLCV
>> IIHLTFVLSAFGMAYIDKMSKKKHVLH
>> ************************************************
>> 10954457
>> accesstion number: NP_047186.1
>> dbsource: GenBank: NC_001911.1
>> NP_047186.1
>> starting pos. at genomic seq: 2158
>> ending pos. at genomic seq: 2913
>> strand: +
>> description: putative replication-associated protein
>> organism: Buchnera aphidicola
>> MPRKNYIYNPKPVFNPPKNKRKISTFICYAMKKASEIDVARSNLNYTLLLIDPKTGNILPRFRRLNEHRA
>> CAMRAIVLAMLYYFDIHSNLVEASIEKLADECGLSTFSDSGNKSITRVSRLINDFLEPMGFVRCKKIKRK
>> FVSNYIPKKIFLTPMFFMLFNISQSKINRYLFKSKKMSQNLKITEKKIFISFSDIKVMSRLDEKSIRKKI
>> LNALINYYTASELTKIGPKGLKKRIDIEYNNLCKLFKKIKK
>>
>>
>>
>>   Because I am dealing with a lot of sequences here, I am a little bit
>> worried that I may be causing harm to the NCBI server.  I just need to
>> know whether this is the right approach to take, or whether there is
>> another solution (I am a little bit confused about what you mean by
>> "multiple requests" in the documentation).
>> Your reply would be very much appreciated.  Thank you in advance.
>>
>>   Sincerely,
>>
>>      Young C. Song
>
> --
> Jason Stajich
> jason at bioperl.org
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From gyang at plantbio.uga.edu  Thu Aug  9 15:03:21 2007
From: gyang at plantbio.uga.edu (Guojun Yang)
Date: Thu, 09 Aug 2007 15:03:21 -0400
Subject: [Bioperl-l] standalone blastall call crashed, please help
In-Reply-To: 1FE846F1-CB20-41FD-929D-8D14E5695B59@uiuc.edu
Message-ID: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu>

Hi, Chris,
Thanks a lot for your efforts. With your help, I am gaining more
confidence to fix the CGI code. While the RemoteBlast problem is fixed
now, I am caught in a local BLAST problem (see the error message and
subroutine below). The line starting with * is line 593 in the error
message. Running blastall from the command line works fine with the same
sequence and database. I set the permissions on all the BLAST folders and
files, but it did not help much. I used the seq object ref $query as the
query, and the error message shows "-i /tmp/..."; does this look like an
input problem? The subroutine was working before early 2006 (on a
different machine), so I am wondering whether this is due to changes in
StandAloneBlast.pm?

Best,
Guojun
   
I set the blast env variables:  
   
BEGIN {$ENV{BLASTDIR} = '/usr/blast-2.2.10/bin'; }
BEGIN {$ENV{BLASTDB}='/usr/blast-2.2.10/data';}
BEGIN {$ENV{BLASTMAT}='/usr/blast-2.2.10/data';}
$PROGRAMDIR = $ENV{'BLASTDIR'} || '';
......  
   
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: blastall call crashed: -1 /usr/blast-2.2.10/bin/blastall -d  "/usr/blast-2.2.10/data/swissprot"  -e  0.001  -i  /tmp/3cjvQyodxg  -o  /tmp/4qSSO16EZP  -p  blastx   
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.3/Bio/Root/Root.pm:359
STACK: Bio::Tools::Run::StandAloneBlast::_runblast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:813
STACK: Bio::Tools::Run::StandAloneBlast::_generic_local_blast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:760
STACK: Bio::Tools::Run::StandAloneBlast::blastall /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:570
STACK: main::ancestor makcgi07.txt:593
STACK: makcgi07.txt:208
  

sub ancestor {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  

my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"test");
print $query->seq();
my $len=$query->length();
my $long_name=$_[1];
my $long_start=$_[2];
my $long_end=$_[3];
@db=('swissprot');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
                                                        -database => "$db",
                                                        -e => 1e-3,
                                                        );
*    my $blast_report = $factory->blastall($query);
    while (my $result = $blast_report->next_result) {
            while( my $hit = $result->next_hit()) {
                $hit_name=$hit->name;
                $hit_name =~ /\S+[|](\S+)[.]\d+[|].*/;
                $name=$1;
                $desc = $hit->description();
                if ($desc =~ /.*{|\btransposon\b|\btransposase\b|}.*/i){
                     $AN=0;
                     $replica=0;
                     while ($ancestor_name[$AN]) {
                        $replica=1 if (($ancestor_name[$AN] eq $long_name) && ($hitname[$AN] eq $name));
                         $AN+=1;
                     }
                        if ($replica==0) {
                        push @ancestor_name, $long_name;
                        push @ancestor_start, $long_start;
                        push @ancestor_end, $long_end;
                        push @desc, $desc;
                        push @hitname,$name;
                        }
                }
               }
              }}
return @ancestor_name,@ancestor_start,@ancestor_end,@desc;
}
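
The number printed after "call crashed:" in these reports (-1 here, 256
in the earlier webserver thread) appears to be Perl's $? after the
external command was run, so it can be decoded the way perlfunc
describes for system(). The helper below, describe_child_status, is a
hypothetical name and only a sketch of that decoding.

```perl
use strict;
use warnings;

# Turn the value of $? (and optionally the text of $!) into a
# human-readable description of how the child process ended.
sub describe_child_status {
    my ($status, $os_error) = @_;

    # -1: the program could not be started, or waiting for it failed.
    if ($status == -1) {
        return 'failed to execute or wait: '
            . (defined $os_error ? $os_error : 'unknown error');
    }

    # Low 7 bits: signal number; bit 8: core dump flag.
    if ($status & 127) {
        return sprintf 'died with signal %d%s',
            $status & 127, ($status & 128) ? ' (core dumped)' : '';
    }

    # High byte: the exit value of the program.
    return sprintf 'exited with value %d', $status >> 8;
}

# Usage idea (not executed here):
#   system($commandstring);
#   print describe_child_status($?, "$!"), "\n";
```

Read this way, 256 means blastall ran and exited with value 1, while -1
means Perl could not start the program or could not wait for it. One
thing that may be worth checking in a CGI script (an assumption, not
something confirmed in this thread) is whether $SIG{CHLD} is set to
'IGNORE' or to a handler, since that can make the wait fail even though
the child ran.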


From harijay at gmail.com  Thu Aug  9 17:47:50 2007
From: harijay at gmail.com (hari jayaram)
Date: Thu, 9 Aug 2007 17:47:50 -0400
Subject: [Bioperl-l] newbie wants install help
Message-ID: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>

Hi, I am trying to install BioPerl as a non-root user since I don't have root
access on the machine.

I was following the instructions given on the wiki at
http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix
I started from scratch using perl v5.8.5 and used cpan to install
the BioPerl prerequisites bundle, Bundle::BioPerl, since I thought it
was needed. Everything worked just fine.
I could use cpan as a non-root user by following the instructions given at
http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html

But when I try to install BioPerl using the instructions for non-root users,
I get an error when building Module::Build because I am not root.
I get the same Module::Build error when I try to install without CPAN, running
perl Build.PL with the --install_base option as given on the
wiki.

Is there a way out?

Thanks for your help in advance
harijay
Brandeis University


Installing /usr/share/man/man3/Module::Build::Platform::VMS.3pm
Installing /usr/share/man/man3/Module::Build::Base.3pm
Installing /usr/share/man/man3/Module::Build::Authoring.3pm
Installing /usr/share/man/man3/Module::Build::Compat.3pm
mkdir /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi/auto/Module:
Permission denied at /usr/lib/perl5/5.8.5/ExtUtils/Install.pm line 207
Installing /usr/bin/config_data
make: *** [install] Error 255
  /usr/bin/make install  -- NOT OK
    You may have to su to root to install the package
Couldn't install Module::Build, giving up.
make: *** No targets specified and no makefile found.  Stop.
  /usr/bin/make  -- NOT OK
Running make test
  Can't test without successful make
Running make install
  make had returned bad status, install seems impossible


From bix at sendu.me.uk  Thu Aug  9 18:23:24 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 09 Aug 2007 23:23:24 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
Message-ID: <46BB93DC.9010608@sendu.me.uk>

hari jayaram wrote:
> Hi, I am trying to install BioPerl as a non-root user since I don't have root
> access on the machine.
> 
> I was following the instructions given on the wiki at
> http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix
> I started from scratch using perl v5.8.5 and used cpan to install
> the BioPerl prerequisites bundle, Bundle::BioPerl, since I thought it
> was needed. Everything worked just fine.
> I could use cpan as a non-root user by following the instructions given at
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html
> 
> But when I try to install BioPerl using the instructions for non-root users,
> I get an error when building Module::Build because I am not root.
> I get the same Module::Build error when I try to install without CPAN, running
> perl Build.PL with the --install_base option as given on the
> wiki.

Follow the cpan instructions you found to install as non-root:

Bundle::CPAN

Failing that, you require at least:
Module::Build

Failing that:
http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#INSTALLING_BIOPERL_MODULES_THE_HARD_WAY
(it's actually the easiest way, go figure)
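
As a rough illustration only, and assuming the "hard way" comes down to
keeping the unpacked modules in a directory you own and telling Perl
where they are, a script can add that directory to its search path
itself. The directory name perl5lib and the helper private_lib_dir are
hypothetical.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical location: a directory you own that holds the unpacked
# modules (for BioPerl, the directory that contains the Bio/ folder).
sub private_lib_dir {
    my $home = defined $ENV{HOME} ? $ENV{HOME} : File::Spec->curdir;
    return File::Spec->catdir($home, 'perl5lib');
}

# "use lib" runs at compile time and adds the directory to the front of
# @INC, so later "use Bio::..." statements are looked up there first.
use lib private_lib_dir();

print "Perl will search these directories, in order:\n";
print "  $_\n" for @INC;
```

Setting the PERL5LIB environment variable to the same directory has a
similar effect without editing each script.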


From bix at sendu.me.uk  Fri Aug 10 03:41:29 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 10 Aug 2007 08:41:29 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>	
	<46BB93DC.9010608@sendu.me.uk>
	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
Message-ID: <46BC16A9.7090709@sendu.me.uk>

hari jayaram wrote:
> Hi Sendu ,

Hi, please post back to the list as well, so others can benefit.


> Well, after a few attempts at installing Bundle::CPAN I
> gave up.
> It always had weird timeout issues, and it kept re-installing everything
> on restarting the CPAN shell.
> After a while I thought it did complete, since it returned me to the shell.
> 
> I tried the CPAN install of BioPerl at that point.
> 
> And bingo, I got booted out at the exact same point, when the BioPerl
> install tried to re-install(?) Module::Build, which failed as non-root.

Did you follow steps 7 and 8 of 
http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html ?

If you managed to install Bundle::CPAN, when you now run 'cpan' it
should start up and tell you its version number, which should be v1.9102
or higher. If it's lower, you didn't manage to install the latest CPAN,
or you haven't managed to tell Perl where your newly installed modules are.


> I guess for all future modules I will adopt option 3 that you detailed,
> i.e. just have the modules sitting somewhere and use them from there.
> 
> But I am still interested in getting it done right via CPAN.


From n.haigh at sheffield.ac.uk  Fri Aug 10 06:14:06 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 10 Aug 2007 11:14:06 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <46BC16A9.7090709@sendu.me.uk>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>		<46BB93DC.9010608@sendu.me.uk>	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
	<46BC16A9.7090709@sendu.me.uk>
Message-ID: <46BC3A6E.80302@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> hari jayaram wrote:
>> Hi Sendu ,
> 
> Hi, please post back to the list as well, so others can benefit.
> 
> 
>> Well, after a few attempts at installing Bundle::CPAN I
>> gave up.
>> It always had weird timeout issues, and it kept re-installing everything
>> on restarting the CPAN shell.
>> After a while I thought it did complete, since it returned me to the shell.
>>
>> I tried the CPAN install of BioPerl at that point.
>>
>> And bingo, I got booted out at the exact same point, when the BioPerl
>> install tried to re-install(?) Module::Build, which failed as non-root.
> 
> Did you follow steps 7 and 8 of 
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html ?
> 
> If you managed to install Bundle::CPAN, when you now run 'cpan' it
> should start up and tell you its version number, which should be v1.9102
> or higher. If it's lower, you didn't manage to install the latest CPAN,
> or you haven't managed to tell Perl where your newly installed modules are.
> 
> 
>> I guess for all future modules I will adopt option 3 that you detailed,
>> i.e. just have the modules sitting somewhere and use them from there.
>>
>> But I am still interested in getting it done right via CPAN.

It will probably also help if you post the commands you have run and
any output (truncated if it's really long); then we can follow what you
have tried and make some better suggestions.

Cheers
Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGvDpuczuW2jkwy2gRAjFjAJ0eG90cMfHrrIh7LbKWx1JN94kbXgCdGSbi
tMjQrZ/8EPc0wLiNAhYTr4Y=
=kXZ2
-----END PGP SIGNATURE-----


From mbasu at mail.nih.gov  Fri Aug 10 11:25:35 2007
From: mbasu at mail.nih.gov (Malay)
Date: Fri, 10 Aug 2007 11:25:35 -0400
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
Message-ID: <46BC836F.7010906@mail.nih.gov>

hari jayaram wrote:
> Hi, I am trying to install BioPerl as a non-root user since I don't have root
> access on the machine.
> 
> I was following the instructions given on the wiki at
> http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix
> I started from scratch using perl v5.8.5 and used cpan to install
> the BioPerl prerequisites bundle, Bundle::BioPerl, since I thought it
> was needed. Everything worked just fine.
> I could use cpan as a non-root user by following the instructions given at
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html
> 
> But when I try to install BioPerl using the instructions for non-root users,
> I get an error when building Module::Build because I am not root.
> I get the same Module::Build error when I try to install without CPAN, running
> perl Build.PL with the --install_base option as given on the
> wiki.
> 
> Is there a way out?
> 
> Thanks for your help in advance
> harijay
> Brandeis University
> 

This relates to your situation and applies broadly to all Perl users
working without root access. I can tell from my own experience that the
best way to handle it is to use your own Perl, if you are a dedicated
Perl developer. Just compile and install your own Perl in
any directory of your choice, put its "bin" directory at the front of your
PATH, and off you go. The advantages are severalfold. First, you get a
well-optimized, fast perl; the sysadmin might have just installed a
run-of-the-mill binary perl. Second, you get the freedom to
install the very latest updates of all the modules; the sysadmins may
be too busy to update perl frequently. Third, a very common problem
with production machines is that they strictly follow the perl
installation instructions and avoid threaded perl, which clips your wings,
particularly when almost all machines contain multiple processors.

The drawbacks are related to scripts that expect "/usr/bin/perl" in the
shebang line. If you follow the Perl way of installing a script, it will
take care of that. When you develop, use the more portable form:

#!/usr/bin/env perl
BEGIN {$^W =1 } # Use it to switch on compile-time warnings (-w)
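
A quick way to confirm that a script started through env really picks up
the privately installed perl is to have it report its own interpreter.
This is only a small sketch; perl_report is a hypothetical helper, and
"use warnings" is shown as a lexically scoped alternative to setting $^W.

```perl
#!/usr/bin/env perl
use strict;
use warnings;    # lexically scoped alternative to -w / $^W

# Report the version and the path of the interpreter running this script.
sub perl_report {
    return sprintf 'perl %vd at %s', $^V, $^X;
}

print perl_report(), "\n";
```

If the path printed is still the system perl, the private "bin"
directory is probably not at the front of PATH.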

All the best,

Malay


-- 
Malay K Basu
www.malaybasu.net


From gyang at plantbio.uga.edu  Fri Aug 10 11:23:36 2007
From: gyang at plantbio.uga.edu (Guojun Yang)
Date: Fri, 10 Aug 2007 11:23:36 -0400
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
 from StandAloneBlast
In-Reply-To: 20070809190321.191d0d4a@dogwood.plantbio.uga.edu
Message-ID: <20070810152336.898c3979@dogwood.plantbio.uga.edu>

Hi, Chris,  
Interestingly, I found the message in bioperl-l from Matthew Laird 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast DOES run.  If one comments out this line in StandAloneBlast.pm, the execution succeeds perfectly fine". It seemed to be mysterious when I uncommented the " $self->throw("$executable call crashed: $? $! $commandstring\n") unless ($status==0) ;" line, the blastall runs. The only difference from what Matthew saw is that, when I did not uncomment the line, blastall DID NOT run.
Thanks,  
Guojun  
       _____  

  From: Guojun Yang [mailto:gyang at plantbio.uga.edu]
To: Chris Fields [mailto:cjfields at uiuc.edu]
Cc: bioperl-l at lists.open-bio.org
Sent: Thu, 09 Aug 2007 15:03:21 -0400
Subject: standalone blastall call crashed, please help

  
Hi, Chris,  
Thanks a lot for your efforts. With your help, I am gaining more confidence to fix the cgi code. While the remoteblast problem is fixed now, I am caught in a local blast problem (see the error message and subroutine). The line starting with * is line 593 in the error message. I tried command line blastall, it works fine. I set the permission to all the blast folders and files, it did not help much. The same sequence and database works OK if I use command line blastall. I used the seq object ref $query as query, the error message gives "-i /tmp/...", does this look like an input problem? The subroutine was working before early 2006 (on a different machine), I am wondering whether this is due to changes in the StandAloneBlast.pm?  Best, Guojun  
   
I set the blast env variables:  
   
BEGIN {$ENV{BLASTDIR} = '/usr/blast-2.2.10/bin'; }
BEGIN {$ENV{BLASTDB}='/usr/blast-2.2.10/data';}
BEGIN {$ENV{BLASTMAT}='/usr/blast-2.2.10/data';}
$PROGRAMDIR = $ENV{'BLASTDIR'} || '';
......  
   
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: blastall call crashed: -1 /usr/blast-2.2.10/bin/blastall -d  "/usr/blast-2.2.10/data/swissprot"  -e  0.001  -i  /tmp/3cjvQyodxg  -o  /tmp/4qSSO16EZP  -p  blastx   
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.3/Bio/Root/Root.pm:359
STACK: Bio::Tools::Run::StandAloneBlast::_runblast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:813
STACK: Bio::Tools::Run::StandAloneBlast::_generic_local_blast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:760
STACK: Bio::Tools::Run::StandAloneBlast::blastall /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:570
STACK: main::ancestor makcgi07.txt:593
STACK: makcgi07.txt:208
  

sub ancestor {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  

my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"test");
print $query->seq();
my $len=$query->length();
my $long_name=$_[1];
my $long_start=$_[2];
my $long_end=$_[3];
@db=('swissprot');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
                                                        -database => "$db",
                                                        -e => 1e-3,
                                                        );
*    my $blast_report = $factory->blastall($query);
    while (my $result = $blast_report->next_result) {
            while( my $hit = $result->next_hit()) {
                $hit_name=$hit->name;
                $hit_name =~ /\S+[|](\S+)[.]\d+[|].*/;
                $name=$1;
                $desc = $hit->description();
                if ($desc =~ /.*{|\btransposon\b|\btransposase\b|}.*/i){
                     $AN=0;
                     $replica=0;
                     while ($ancestor_name[$AN]) {
                        $replica=1 if (($ancestor_name[$AN] eq $long_name) && ($hitname[$AN] eq $name));
                         $AN+=1;
                     }
                        if ($replica==0) {
                        push @ancestor_name, $long_name;
                        push @ancestor_start, $long_start;
                        push @ancestor_end, $long_end;
                        push @desc, $desc;
                        push @hitname,$name;
                        }
                }
               }
              }}
return @ancestor_name,@ancestor_start,@ancestor_end,@desc;
}


From cjfields at uiuc.edu  Fri Aug 10 12:17:38 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 10 Aug 2007 11:17:38 -0500
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
	from StandAloneBlast
In-Reply-To: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
References: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
Message-ID: <56186844-3CB9-4968-B16F-FD5EE72865A2@uiuc.edu>

This should be filed as a bug if possible; could you do that?

http://www.bioperl.org/wiki/Bugs

Suggestions have been made many times previously that  
StandAloneBlast, RemoteBlast, etc be combined to use a common API,  
incorporate other BLAST implementations (i.e. WU-BLAST, NCBI's  
netblast, etc), and maybe utilize other cross-platform compatible  
means of running programs and passing off reports to parsers.  In  
fact, Jason, Roger Hall, Torsten, and I discussed tentative plans for  
plugin-able BLAST wrappers:

http://www.bioperl.org/wiki/Module:Bio::Tools::Run::RemoteBlast

Though they have never been acted upon.  If I get time towards the  
end of fall and manage to finish up some other projects I may try  
taking this on, maybe using the wiki to track progress.

chris

On Aug 10, 2007, at 10:23 AM, Guojun Yang wrote:

> Hi, Chris,
> Interestingly, I found the message in bioperl-l from Matthew Laird  
> 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast DOES  
> run.  If one comments out this line in StandAloneBlast.pm, the  
> execution succeeds perfectly fine". It seemed to be mysterious when  
> I uncommented the " $self->throw("$executable call crashed: $? $!  
> $commandstring\n") unless ($status==0) ;" line, the blastall runs.  
> The only difference from what Matthew saw is that, when I did not  
> uncomment the line, blastall DID NOT run.
> Thanks,
> Guojun
>
> From: Guojun Yang [mailto:gyang at plantbio.uga.edu]
> To: Chris Fields [mailto:cjfields at uiuc.edu]
> Cc: bioperl-l at lists.open-bio.org
> Sent: Thu, 09 Aug 2007 15:03:21 -0400
> Subject: standalone blastall call crashed, please help
>
> Hi, Chris,
> Thanks a lot for your efforts. With your help, I am gaining more  
> confidence to fix the cgi code. While the remoteblast problem is  
> fixed now, I am caught in a local blast problem (see the error  
> message and subroutine). The line starting with * is line 593 in  
> the error message. I tried command line blastall, it works fine. I  
> set the permission to all the blast folders and files, it did not  
> help much. The same sequence and database works OK if I use command  
> line blastall. I used the seq object ref $query as query, the error  
> message gives "-i /tmp/...", does this look like an input problem?  
> The subroutine was working before early 2006 (on a different  
> machine), I am wondering whether this is due to changes in the  
> StandAloneBlast.pm?  Best, Guojun
>
> I set the blast env variables:
>
> BEGIN {$ENV{BLASTDIR} = '/usr/blast-2.2.10/bin'; }
> BEGIN {$ENV{BLASTDB}='/usr/blast-2.2.10/data';}
> BEGIN {$ENV{BLASTMAT}='/usr/blast-2.2.10/data';}
> $PROGRAMDIR = $ENV{'BLASTDIR'} || '';
> ......
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: blastall call crashed: -1 /usr/blast-2.2.10/bin/blastall -d  "/usr/blast-2.2.10/data/swissprot"  -e  0.001  -i  /tmp/3cjvQyodxg  -o  /tmp/4qSSO16EZP  -p  blastx
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.3/Bio/Root/Root.pm:359
> STACK: Bio::Tools::Run::StandAloneBlast::_runblast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:813
> STACK: Bio::Tools::Run::StandAloneBlast::_generic_local_blast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:760
> STACK: Bio::Tools::Run::StandAloneBlast::blastall /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:570
> STACK: main::ancestor makcgi07.txt:593
> STACK: makcgi07.txt:208
> sub ancestor {
>     use Bio::Tools::Run::StandAloneBlast;
>     use Bio::SearchIO::blast;
>
> my $query = Bio::Seq -> new ( -seq=>"$_[0]",
>                               -id=>"test");
> print $query->seq();
> my $len=$query->length();
> my $long_name=$_[1];
> my $long_start=$_[2];
> my $long_end=$_[3];
> @db=('swissprot');
> foreach my $db (@db) {
>     my $factory = Bio::Tools::Run::StandAloneBlast->new(
>         -program  => "blastx",
>         -database => "$db",
>         -e        => 1e-3,
>     );
> *    my $blast_report = $factory->blastall($query);
>     while (my $result = $blast_report->next_result) {
>             while( my $hit = $result->next_hit()) {
>                 $hit_name=$hit->name;
>                 $hit_name =~ /\S+[|](\S+)[.]\d+[|].*/;
>                 $name=$1;
>                 $desc = $hit->description();
>                 if ($desc =~ /.*{|\btransposon\b|\btransposase 
> \b|}.*/i){
>                      $AN=0;
>                      $replica=0;
>                      while ($ancestor_name[$AN]) {
>                         $replica=1 if (($ancestor_name[$AN] eq $long_name) && ($hitname[$AN] eq $name));
>                          $AN+=1;
>                      }
>                         if ($replica==0) {
>                         push @ancestor_name, $long_name;
>                         push @ancestor_start, $long_start;
>                         push @ancestor_end, $long_end;
>                         push @desc, $desc;
>                         push @hitname,$name;
>                         }
>                 }
>                }
>               }}
> return @ancestor_name,@ancestor_start,@ancestor_end,@desc;
> }
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From harijay at gmail.com  Fri Aug 10 13:09:32 2007
From: harijay at gmail.com (hari jayaram)
Date: Fri, 10 Aug 2007 13:09:32 -0400
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <46BC16A9.7090709@sendu.me.uk>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
	<46BB93DC.9010608@sendu.me.uk>
	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
	<46BC16A9.7090709@sendu.me.uk>
Message-ID: <aad3caa30708101009k4734fe45i1dcd29a5e20af834@mail.gmail.com>

Hey all ,
Thanks for your help. It's working really well now.

Turns out I had not set my PERL5LIB environment variable correctly, so Perl
was not finding the installed modules (thanks Sendu).

So the steps I followed were:
1) Installed CPAN as myself, as detailed at
http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html
The important part is the line that tells CPAN what prefix to use for all
module installs:
PREFIX=~/perl5lib/ LIB=~/perl5lib/lib INSTALLMAN1DIR=~/perl5lib/man1
INSTALLMAN3DIR=~/perl5lib/man3

2) Set PERL5LIB to /home/hari/perl5lib/lib (and not just /home/hari/perl5lib)
in the shell. I use csh, so I edited .cshrc:
setenv PERL5LIB /home/hari/perl5lib/lib
setenv MANPATH ${MANPATH}:/home/hari/perl5lib

3) Updated the system CPAN to the latest version - this worked very well once
the perl5lib was set up, only it took a while and sometimes stalled with
messages like "done 31/34". But a CTRL-C got it going again.

4) Made sure I was using the new CPAN v1.9102

5) Installed Bioperl with the command
install S/SE/SENDU/bioperl-1.5.2_102.tar.gz

AND I was good to go..
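
For anyone retracing step 2, one quick way to confirm that Perl actually
sees the private library directory is to look for a module file in @INC.
This is only a sketch using core Perl; the helper name find_module is made
up for illustration, and Bio::Seq is used simply because it ships with
Bioperl.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first path in @INC that holds the
# module's .pm file, or nothing if no directory in @INC has it.
sub find_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Bio::Seq -> Bio/Seq.pm
    for my $dir (@INC) {
        next if ref $dir;                      # skip code-ref hooks in @INC
        return "$dir/$file" if -e "$dir/$file";
    }
    return;
}

my $perl5lib = defined $ENV{PERL5LIB} ? $ENV{PERL5LIB} : '(not set)';
print "PERL5LIB = $perl5lib\n";

# File::Spec is a core module; Bio::Seq is only found if Bioperl is visible.
for my $module (qw(File::Spec Bio::Seq)) {
    my $path = find_module($module);
    print "$module: ", ( defined $path ? $path : 'not found in @INC' ), "\n";
}
```

If Bio::Seq shows up as not found while the files do exist under
~/perl5lib/lib, PERL5LIB is probably pointing at the wrong directory.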

I am thinking I will screencast this process for everyone's benefit and put
it up on bioscreencast.com, if that would be useful for others.
Thanks to everyone on the group. Now the journey begins.

Hari Jayaram


On 8/10/07, Sendu Bala <bix at sendu.me.uk> wrote:
> hari jayaram wrote:
> > Hi Sendu ,
>
> Hi, please post back to the list as well, so others can benefit.
>
>
> > Well after going through a few attempts at installing Bundle::CPAN I
> > gave up.
> > It always had weird timeout issues, and kept re-installing everything
> > on restarting the CPAN shell.
> > After a while I thought it did complete - since it returned me to the
> > shell
> >
> > I tried the CPAN install of bioperl at that point
> >
> > And bingo I got booted out at the exact same point when the Bioperl
> > install tried to re-install(?) Module::Build which failed as non root
>
> Did you follow steps 7 and 8 of
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html ?
>
> If you managed to install Bundle::CPAN, when you now run 'cpan' it
> should start up and tell you its version number, which should be v1.9102
> or higher. If it's lower, you didn't manage to install the latest CPAN,
> or you haven't managed to tell Perl where your newly installed modules
> are.
>
>
> > I guess for all future modules I will adopt the option 3 you detailed ,
> > i.e just have the modules sitting somewhere and use them from there
> >
> > But I am still interested in getting it done right via CPAN.
>


From torsten.seemann at infotech.monash.edu.au  Fri Aug 10 17:48:56 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sat, 11 Aug 2007 07:48:56 +1000
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
	from StandAloneBlast
In-Reply-To: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
References: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu>
	<20070810152336.898c3979@dogwood.plantbio.uga.edu>
Message-ID: <a79f6a4b0708101448x421736c1m6f3f5ff6d851a68c@mail.gmail.com>

> Interestingly, I found the message in bioperl-l from Matthew Laird 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast DOES run.  If one comments out this line in StandAloneBlast.pm, the execution succeeds perfectly fine". It seemed to be mysterious when I uncommented the " $self->throw("$executable call crashed: $? $! $commandstring\n") unless ($status==0) ;" line, the blastall runs. The only difference from what Matthew saw is that, when I did not uncomment the line, blastall DID NOT run.

Yes, Matthew is one of the authors of PSORTB and I spent a bit of time
last year trying to fix this problem (unsuccessfully). The PSORTB docs
http://www.psort.org/downloads/index.html
explain how to get around this problem just as Guojun describes. I use
a custom BioPerl installation just for PSORTB!

 I was under the impression it was already filed as a bug, but my
searching indicates this is not so.

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University


From cjfields at uiuc.edu  Fri Aug 10 18:04:20 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 10 Aug 2007 17:04:20 -0500
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
	from StandAloneBlast
In-Reply-To: <a79f6a4b0708101448x421736c1m6f3f5ff6d851a68c@mail.gmail.com>
References: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu>
	<20070810152336.898c3979@dogwood.plantbio.uga.edu>
	<a79f6a4b0708101448x421736c1m6f3f5ff6d851a68c@mail.gmail.com>
Message-ID: <41A08079-6EEC-4B62-8104-C41E70C03083@uiuc.edu>


On Aug 10, 2007, at 4:48 PM, Torsten Seemann wrote:

>> Interestingly, I found the message in bioperl-l from Matthew Laird  
>> 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast  
>> DOES run.  If one comments out this line in StandAloneBlast.pm,  
>> the execution succeeds perfectly fine". It seemed to be mysterious  
>> when I uncommented the " $self->throw("$executable call crashed:  
>> $? $! $commandstring\n") unless ($status==0) ;" line, the blastall  
>> runs. The only difference from what Matthew saw is that, when I  
>> did not uncomment the line, blastall DID NOT run.
>
> Yes, Matthew is one of the authors of PSORTB and I spent a bit of time
> last year trying to fix this problem (unsuccessfully). The PSORTB docs
> http://www.psort.org/downloads/index.html
> explain how to get around this problem just as Guojun describes. I use
> a custom BioPerl installation just for PSORTB!
>
>  I was under the impression it was already filed as a bug, but my
> searching indicates this is not so.
>
> -- 
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Might be wise to go ahead and add it to bugzilla so we can track it,  
along with the workaround.
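
For whoever files the report, a minimal sketch of how the status in that
error message can be decoded may help. Per perlfunc, system() leaves the
wait status in $?, and a value of -1 means the program could not be started
or the wait(2) call itself failed. That might explain a "crashed: -1"
message in a case where blastall did run, but this is a guess, not a
diagnosis. The helper name describe_status is made up for illustration and
is not part of StandAloneBlast.pm; the child command here is just the
running perl ($^X) exiting with code 3.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a wait status (as left in $?) into a readable
# message, following the pattern documented under "system" in perlfunc.
sub describe_status {
    my ($status, $errno) = @_;
    return "failed to start or wait: $errno"
        if $status == -1;
    return sprintf( "died with signal %d", $status & 127 )
        if $status & 127;
    return sprintf( "exited with code %d", $status >> 8 );
}

# Run a harmless child process: the current perl, exiting with code 3.
my @cmd = ( $^X, '-e', 'exit 3' );
system(@cmd);
print describe_status( $?, "$!" ), "\n";
```

Checking for -1 separately from a non-zero exit code would at least show
whether blastall itself failed or whether the status could not be collected.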

chris


From neetisomaiya at gmail.com  Mon Aug 13 06:29:39 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 13 Aug 2007 15:59:39 +0530
Subject: [Bioperl-l] Homologene parser?
Message-ID: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>

Hi,

Does anyone know of any Homologene parser, if available?
Please let me know.

Thanks and Regards,
Neeti.


-- 
-Neeti
Even my blood says, B positive


From shameer at ncbs.res.in  Mon Aug 13 07:07:45 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 16:37:45 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and add
 direction to SeqFeature
In-Reply-To: <6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
Message-ID: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>

Dear All,

I am generating images from transcription factor binding site data
using the Bio::Graphics module.
I created my images using the program "version-2"
[http://stein.cshl.org/genome_informatics/BioGraphics/] (courtesy of L.
Stein). I am attaching one of the images to this mail.

I need to make 3 changes to this image

1. to color the 'scale'
Color the scale in two different colors, i.e. from start 1.0k - color blue,
from 101 - till end of the scale green (I thoroughly checked the
Bio::Graphics documentation but couldn't find an option to do this)

2. to sort the Transcription factors based on the z_score

3. to give forward/reverse [> or < ]direction for the black boxes

I would appreciate it if anyone could give me some clues/links to accomplish
this :).
Thanks in advance,
Shameer

-- 
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in
-------------- next part --------------
A non-text attachment was scrubbed...
Name: TF_top3.png
Type: image/png
Size: 2188 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070813/6a4423bd/attachment-0003.png>

From bix at sendu.me.uk  Mon Aug 13 09:11:50 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 14:11:50 +0100
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>	<1178028249.2644.13.camel@localhost.localdomain>	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
Message-ID: <46C05896.1010002@sendu.me.uk>

Shameer Khadar wrote:
> Dear All,
> 
> I am generating images from transcription factor binding site data
> using the Bio::Graphics module.
> I created my images using the program "version-2"
> [http://stein.cshl.org/genome_informatics/BioGraphics/] (courtesy of L.
> Stein). I am attaching one of the images to this mail.
> 
> I need to make 3 changes to this image
> 
> 1. to color the 'scale'
> Color the scale in two different colors, i.e. from start 1.0k - color blue,
> from 101 - till end of the scale green (I thoroughly checked the
> Bio::Graphics documentation but couldn't find an option to do this)

The scale is just a scale and shouldn't need colouring. You can do what 
you want by having a blue 'upstream' feature and a green 'gene' feature 
in the first row.


> 2. to sort the Transcription factors based on the z_score

I don't know Bio::Graphics well enough, but am interested in the answer...


> 3. to give forward/reverse [> or < ]direction for the black boxes

Presumably you just change the glyph type of your binding sites to 
something that shows direction, like 'processed_transcript'. Someone 
else may have a more appropriate suggestion.

However, do your binding sites really have a direction? That is, do you 
really know which strand your transcription factor bound to?


From cjfields at uiuc.edu  Mon Aug 13 10:39:11 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 09:39:11 -0500
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
	add direction to SeqFeature
In-Reply-To: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
Message-ID: <871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>


On Aug 13, 2007, at 6:07 AM, Shameer Khadar wrote:

> Dear All,
>
> I am generating images from transcription factor binding site data
> using the Bio::Graphics module.
> I created my images using the program "version-2"
> [http://stein.cshl.org/genome_informatics/BioGraphics/] (courtesy of L.
> Stein). I am attaching one of the images to this mail.
>
> I need to make 3 changes to this image
>
> 1. to color the 'scale'
> Color the scale in two different colors, i.e. from start 1.0k - color blue,
> from 101 - till end of the scale green (I thoroughly checked the
> Bio::Graphics documentation but couldn't find an option to do this)

Much of the documentation you need is available via 'perldoc  
Bio::Graphics::Panel' and the various Bio::Graphics::Glyph classes.   
The above may be possible using two seqfeatures instead of one or  
maybe a split location with a callback (not sure, haven't tried  
either, mileage may vary, batteries not included, warranty void if  
packaging is opened, etc).  Might be worth checking out the POD for  
the arrow glyph to see what's possible.

> 2. to sort the Transcription factors based on the z_score

In Bio::Graphics::Panel POD under 'Glyph Options', there is  
documentation for 'sort_order' which accepts callbacks.  According to  
the docs you would basically do something like the following (the  
prototype is required; note the score):

   -sort_order => sub ($$) {
     my ($glyph1,$glyph2) = @_;
     my $a = $glyph1->feature;
     my $b = $glyph2->feature;
     ( $b->score/log($b->length)
           <=>
       $a->score/log($a->length) )
           ||
     ( $a->start <=> $b->start )
   }

Again, haven't tried.
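
As a sanity check of the comparator idea, here is a self-contained sketch
that runs without Bio::Graphics. MockFeature and MockGlyph are hypothetical
stand-ins that only provide the accessors the callback touches, and the
names and scores are invented. It sorts by score, highest first, and breaks
ties by start position. The same sub body may be usable as the -sort_order
callback, but that is untested against real glyphs.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence feature (not a Bioperl class).
package MockFeature;
sub new   { my ( $class, %args ) = @_; return bless {%args}, $class }
sub name  { $_[0]{name} }
sub score { $_[0]{score} }
sub start { $_[0]{start} }

# Hypothetical stand-in for a glyph that wraps one feature.
package MockGlyph;
sub new     { my ( $class, $feature ) = @_; return bless { feature => $feature }, $class }
sub feature { $_[0]{feature} }

package main;

# The ($$) prototype makes sort pass the two items in @_ instead of $a/$b.
sub by_score_then_start ($$) {
    my ( $glyph1, $glyph2 ) = @_;
    my $f1 = $glyph1->feature;
    my $f2 = $glyph2->feature;
    return ( $f2->score <=> $f1->score )     # higher score first
        || ( $f1->start <=> $f2->start );    # ties: leftmost first
}

my @glyphs;
for my $args (
    { name => 'TF_A', score => 2.1, start => 300 },
    { name => 'TF_B', score => 4.7, start => 120 },
    { name => 'TF_C', score => 4.7, start => 50 },
  )
{
    push @glyphs, MockGlyph->new( MockFeature->new(%$args) );
}

my @sorted_names = map { $_->feature->name } sort by_score_then_start @glyphs;
print join( ',', @sorted_names ), "\n";
```

For z-scores stored somewhere other than the feature's score, only the two
accessor calls inside the comparator would need to change.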

> 3. to give forward/reverse [> or < ]direction for the black boxes

I think you first need to ensure the glyph will accept strandedness,  
though I think most do.  Then you would set either the 'strand_arrow'  
or 'stranded' option to 1 (they are synonyms).  Again, see  
Bio::Graphics::Panel POD under Glyph Options, specifically the  
parameter 'stranded' or 'strand_arrow'.

> I would appreciate it if anyone could give me some clues/links to
> accomplish
> this :).
> Thanks in advance,
> Shameer

No problem!

chris

> -- 
> Shameer Khadar
> Lab (# 25) The Computational Biology Group
> National Centre for Biological Sciences (TIFR)
> GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
> T - 91-080-23666001 EXT - 6251
> W - http://www.ncbs.res.in
> <TF_top3.png>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From shameer at ncbs.res.in  Mon Aug 13 10:47:35 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 20:17:35 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <46C05896.1010002@sendu.me.uk>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
Message-ID: <59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>

Dear Sendu,

Thanks for your reply.

>> I need to make 3 changes to this image
>>
>> 1. to color the 'scale'
>> Color the scale in two different colors ie, from start 1.0k - color blue
>> from 101 - till end of the scale green (I thoroghly checked the
>> Bio::Graphics document, I couldnt find an option to do this )
>
> The scale is just a scale and shouldn't need colouring. You can do what
> you want by having a blue 'upstream' feature and a green 'gene' feature
> in the first row.
Thanks for the point: 'The scale is just a scale...'.
But my idea is to split the scale into three parts to differentiate
between the 100bp upstream region, the UTR and the gene start site. From
the starting point of the scale till 0k is the 100bp upstream region. From
0k till the end of the current_scale is the UTR, and the gene starts at the
end of the scale. Since this is a bit tough to distinguish, we thought of
this coloring option. Adding an extra track is an alternative (I tried to
convince our experimental team by adding an extra track, but they want it
this way :(..)

>
>> 2. to sort the Transcription factors based on the z_score
> I don't know Bio::Graphics well enough, but am interested in the answer...
>
It should be possible, since a sort_order option is available. I tried it a
couple of times but it is not working.

>
>> 3. to give forward/reverse [> or < ]direction for the black boxes
>
> Presumably you just change the glyph type of your binding sites to
> something that shows direction, like 'processed_transcript'. Someone
> else may have a more appropriate suggestion.
Thanks, I will look in to it.

>
> However, do your binding sites really have a direction? That is, do you
> really know which strand your transcription factor bound to?
Yes, we collated this information from various experimental datasets.

-- 
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From bix at sendu.me.uk  Mon Aug 13 11:01:43 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 16:01:43 +0100
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
Message-ID: <46C07257.1000308@sendu.me.uk>

Shameer Khadar wrote:
>> However, do your binding sites really have a direction? That is, do you
>> really know which strand your transcription factor bound to?
 >
> Yes, these info we collated from various experimental datasets.

Well, those datasets I'd like to see... What I was getting at is the 
strand probably isn't known at the experimental level, but to describe 
the site a strand has to be arbitrarily picked so you can write the 
sequence of the site down as a single string. It's probably the case that 
the strand information you have is just the way it happened to be 
reported in the literature and has no biological meaning.


From shameer at ncbs.res.in  Mon Aug 13 11:16:33 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 20:46:33 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>
Message-ID: <42833.192.168.1.1.1187018193.squirrel@mail.ncbs.res.in>

Chris,

Thanks for your detailed reply.
I will read up on the docs and try different options using your code
snippets as a starting point. I will get back to the list with my results.

Thanks
-- 
Shameer

>
> On Aug 13, 2007, at 6:07 AM, Shameer Khadar wrote:
>
>> Dear All,
>>
>> I am generating images from transcription factor binding site data
>> using the Bio::Graphics module.
>> I created my images using the program "version-2"
>> [http://stein.cshl.org/genome_informatics/BioGraphics/] (courtesy of L.
>> Stein). I am attaching one of the images to this mail.
>>
>> I need to make 3 changes to this image
>>
>> 1. to color the 'scale'
>> Color the scale in two different colors, i.e. from start 1.0k - color blue,
>> from 101 - till end of the scale green (I thoroughly checked the
>> Bio::Graphics documentation but couldn't find an option to do this)
>
> Much of the documentation you need is available via 'perldoc
> Bio::Graphics::Panel' and the various Bio::Graphics::Glyph classes.
> The above may be possible using two seqfeatures instead of one or
> maybe a split location with a callback (not sure, haven't tried
> either, mileage may vary, batteries not included, warranty void if
> packaging is opened, etc).  Might be worth checking out the POD for
> the arrow glyph to see what's possible.
>
>> 2. to sort the Transcription factors based on the z_score
>
> In Bio::Graphics::Panel POD under 'Glyph Options', there is
> documentation for 'sort_order' which accepts callbacks.  According to
> the docs you would basically do something like the following (the
> prototype is required; note the score):
>
>    -sort_order => sub ($$) {
>      my ($glyph1,$glyph2) = @_;
>      my $a = $glyph1->feature;
>      my $b = $glyph2->feature;
>      ( $b->score/log($b->length)
>            <=>
>        $a->score/log($a->length) )
>            ||
>      ( $a->start <=> $b->start )
>    }
>
> Again, haven't tried.
>
>> 3. to give forward/reverse [> or < ]direction for the black boxes
>
> I think you first need to ensure the glyph will accept strandedness,
> though I think most do.  Then you would set either the 'strand_arrow'
> or 'stranded' option to 1 (they are synonyms).  Again, see
> Bio::Graphics::Panel POD under Glyph Options, specifically the
> parameter 'stranded' or 'strand_arrow'.
>


-- 
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From bix at sendu.me.uk  Mon Aug 13 11:47:10 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 16:47:10 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708101009k4734fe45i1dcd29a5e20af834@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>	
	<46BB93DC.9010608@sendu.me.uk>	
	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>	
	<46BC16A9.7090709@sendu.me.uk>
	<aad3caa30708101009k4734fe45i1dcd29a5e20af834@mail.gmail.com>
Message-ID: <46C07CFE.7020105@sendu.me.uk>

hari jayaram wrote:
> Hey all ,
> Thanks for your help. It's working really well now.
[snip]
> I am thinking I will screencast this process for everyone's benefit and 
> put it up on bioscreencast.com <http://bioscreencast.com> . If that will 
> be useful for others.

I'm certain it will. That's a very interesting website. Thanks for 
taking the time, and I hope you find Bioperl useful.


From cjfields at uiuc.edu  Mon Aug 13 12:24:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 11:24:15 -0500
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
	add direction to SeqFeature
In-Reply-To: <46C07257.1000308@sendu.me.uk>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
Message-ID: <A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>


On Aug 13, 2007, at 10:01 AM, Sendu Bala wrote:

> Shameer Khadar wrote:
>>> However, do your binding sites really have a direction? That is,  
>>> do you
>>> really know which strand your transcription factor bound to?
>>
>> Yes, these info we collated from various experimental datasets.
>
> Well, those datasets I'd like to see... What I was getting at is the
> strand probably isn't known at the experimental level, but to describe
> the site a strand has to be arbitrarily picked so you can write the
> sequence of the site down as a single string. It's probably the case  
> that
> the strand information you have is just the way it happened to be
> reported in the literature and has no biological meaning.

It's subjective.  I can think of several cases where strandedness  
does matter and has meaning.  If the motif is related to how the gene  
is transcribed or post-transcriptionally regulated, for instance;  
elements which indicate start of transcription (-10/-35 or any sigma- 
factor-related promoter element in prokaryotes), end of transcription  
(poly-A signal, transcription terminators), modulation of translation  
(SECIS, IRES), or conserved DNA motifs which are transcribed prior to  
regulation (RNA-binding proteins like IRE).

chris


From amacgregor at ccg.murdoch.edu.au  Mon Aug 13 20:52:10 2007
From: amacgregor at ccg.murdoch.edu.au (Andrew Macgregor)
Date: Tue, 14 Aug 2007 08:52:10 +0800
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
Message-ID: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>

On 13/08/2007, at 6:29 PM, neeti somaiya wrote:

> Hi,
>
> Does anyone know of any Homologene parser, if available?
> Please let me know.
>
> Thanks and Regards,
> Neeti.

Hi Neeti,

Quite a long time ago now I wrote a Homologene parser and posted it  
to the mailing list:

<http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>

I don't know if this still works but you could use it as a starting  
point. There may also be something newer out there too, I don't know.  
If you search the mailing list archives you'll get a few messages  
around the topic.

Cheers, Andrew.


Andrew Macgregor
Centre for Comparative Genomics, Murdoch University
Email: amacgregor at ccg.murdoch.edu.au
Tel: (08) 9360 2961


From cjfields at uiuc.edu  Mon Aug 13 23:21:54 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 22:21:54 -0500
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
Message-ID: <4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>

It looks like Heikki responded and thought a good place for it would  
be Bio::SeqIO, but it didn't go anywhere I suppose.  I see that a few  
other posts suggest it could be placed in Bio::Cluster as well which  
I'm not familiar with.  We could add it in if you were still  
interested, just need to find a good place for it; might be nice to  
have a Parse::RecDescent-based parser.

chris

On Aug 13, 2007, at 7:52 PM, Andrew Macgregor wrote:

> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>
>> Hi,
>>
>> Does anyone know of any Homologene parser, if available?
>> Please let me know.
>>
>> Thanks and Regards,
>> Neeti.
>
> Hi Neeti,
>
> Quite a long time ago now I wrote a Homologene parser and posted it
> to the mailing list:
>
> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
>
> I don't know if this still works but you could use it as a starting
> point. There may also be something newer out there too, I don't know.
> If you search the mailing list archives you'll get a few messages
> around the topic.
>
> Cheers, Andrew.
>
>
> Andrew Macgregor
> Centre for Comparative Genomics, Murdoch University
> Email: amacgregor at ccg.murdoch.edu.au
> Tel: (08) 9360 2961
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Tue Aug 14 03:46:19 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 14 Aug 2007 08:46:19 +0100
Subject: [Bioperl-l] Warnings/errors generated by Eclipse
Message-ID: <46C15DCB.80603@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I've just been setting up Eclipse with the EPIC plugin, and it's
generating some errors and warnings about bioperl-live that I'd like to
pass by you.

I think most of the errors are along the lines of:
"Can't find 'build_params' in _build in
/usr/local/share/perl/5.8.8/Module/Build/Base.pm line 1011"

This occurs with files like:
t/Biblio_biofetch.t
t/seqread_fail.t

I think it's to do with the parameters passed to test_begin() or it
could be my setup of Eclipse?

Other highlighted problems are some of the scripts in the examples dir.
Some require modules that reside in the bioperl-run package. Would it be
wise to move these to the bioperl-run examples dir?

There may also be some problems with XML files in t/data e.g.
t/data/interpro_ebi.xml
There appears to be a typo on line 2. However, I'm not sure this is
up-to-date? I can comment on the others later if required.

Cheers
Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGwV3KczuW2jkwy2gRApM/AJ9abWl02CAJqDK2sEXEUEg8nGRC4ACdHcAb
nZmh+1dmtc1W9mThkUVKitw=
=5eXZ
-----END PGP SIGNATURE-----


From amacgregor at ccg.murdoch.edu.au  Tue Aug 14 01:14:58 2007
From: amacgregor at ccg.murdoch.edu.au (Andrew Macgregor)
Date: Tue, 14 Aug 2007 13:14:58 +0800
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
Message-ID: <C762C291-D3D2-4CBC-B5EC-6B6E4935A004@ccg.murdoch.edu.au>

On 14/08/2007, at 11:21 AM, Chris Fields wrote:

> It looks like Heikki responded and thought a good place for it  
> would be Bio::SeqIO, but it didn't go anywhere I suppose.  I see  
> that a few other posts suggest it could be placed in Bio::Cluster  
> as well which I'm not familiar with.  We could add it in if you  
> were still interested, just need to find a good place for it; might  
> be nice to have a Parse::RecDescent-based parser.
>
> chris
>

Hi Chris,

I was also parsing UniGene at the time, but found RecDescent too slow
and went back to regexes. That code found its way into Bio::Cluster.
Occasionally I see a message from someone looking for a Homologene
parser, but not very often, so I'm not sure it is worth the effort of
moving the code into BioPerl.

Cheers, Andrew.


From neetisomaiya at gmail.com  Tue Aug 14 09:24:07 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Tue, 14 Aug 2007 18:54:07 +0530
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
Message-ID: <764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>

Hi Andrew,

I think the Homologene data files on the FTP site have changed since
you wrote your parser.
They are now homologene.data and homologene.xml.
I tried your parser, but because it was written for the file
hmlg.trip.ftp, it doesn't work anymore.

I came across this parser:
http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
I am checking whether it works for me; I'm not sure it will.

~Neeti.

On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
>
> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>
> > Hi,
> >
> > Does anyone know of any Homologene parser, if available?
> > Please let me know.
> >
> > Thanks and Regards,
> > Neeti.
>
> Hi Neeti,
>
> Quite a long time ago now I wrote an Homologene parser and posted it
> to the mailing list:
>
> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
>
> I don't know if this still works but you could use it as a starting
> point. There may also be something newer out there too, I don't know.
> If you search the mailing list archives you'll get a few messages
> around the topic.
>
> Cheers, Andrew.
>
>
> Andrew Macgregor
> Centre for Comparative Genomics, Murdoch University
> Email: amacgregor at ccg.murdoch.edu.au
> Tel: (08) 9360 2961
>
>
>
>


-- 
-Neeti
Even my blood says, B positive


From bix at sendu.me.uk  Tue Aug 14 10:57:29 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 14 Aug 2007 15:57:29 +0100
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
	columns?
Message-ID: <46C1C2D9.6050409@sendu.me.uk>

I'm looking at what looks like a pretty major bug in Bio::SimpleAlign, 
but before I commit the fix I wanted to check my sanity/understanding.

My understanding is that an alignment may be built from just sub-parts 
of a number of sequences. So you give each sequence in the alignment a 
start and stop so you can later map back the aligned region to the 
original sequence. So, for example, the following should all pass:

diff -r1.56 SimpleAlign.t
459a460,540
 >
 >
 > # is _remove_col really working correctly?
 > my $a = Bio::LocatableSeq->new(-id => 'a', -seq => 'atcgatcgatcgatcg', -start => 5, -end => 20);
 > my $b = Bio::LocatableSeq->new(-id => 'b', -seq => '-tcgatc-atcgatcg', -start => 30, -end => 43);
 > my $c = Bio::LocatableSeq->new(-id => 'c', -seq => 'atcgatcgatc-atc-', -start => 50, -end => 63);
 > my $d = Bio::LocatableSeq->new(-id => 'd', -seq => '--cgatcgatcgat--', -start => 80, -end => 91);
 > my $e = Bio::LocatableSeq->new(-id => 'e', -seq => '-t-gatcgatcga-c-', -start => 100, -end => 111);
 > $aln = Bio::SimpleAlign->new();
 > $aln->add_seq($a);
 > $aln->add_seq($b);
 > $aln->add_seq($c);
 >
 > my $gapless = $aln->remove_gaps();
 > foreach my $seq ($gapless->each_seq) {
 >       if ($seq->id eq 'a') {
 >               is $seq->start, 6;
 >               is $seq->end, 19;
 >               is $seq->seq, 'tcgatcatcatc';
 >       }
 >       elsif ($seq->id eq 'b') {
 >               is $seq->start, 30;
 >               is $seq->end, 42;
 >               is $seq->seq, 'tcgatcatcatc';
 >       }
 >       elsif ($seq->id eq 'c') {
 >               is $seq->start, 51;
 >               is $seq->end, 63;
 >               is $seq->seq, 'tcgatcatcatc';
 >       }
 > }
 >
 > $aln->add_seq($d);
 > $aln->add_seq($e);
 > $gapless = $aln->remove_gaps();
 > foreach my $seq ($gapless->each_seq) {
 >       if ($seq->id eq 'a') {
 >               is $seq->start, 8;
 >               is $seq->end, 17;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'b') {
 >               is $seq->start, 32;
 >               is $seq->end, 40;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'c') {
 >               is $seq->start, 53;
 >               is $seq->end, 61;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'd') {
 >               is $seq->start, 81;
 >               is $seq->end, 90;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'e') {
 >               is $seq->start, 101;
 >               is $seq->end, 110;
 >               is $seq->seq, 'gatcatca';
 >       }
 > }
 >
 > my $f = Bio::LocatableSeq->new(-id => 'f', -seq => 'a-cgatcgatcgat-g', -start => 30, -end => 43);
 > $aln = Bio::SimpleAlign->new();
 > $aln->add_seq($a);
 > $aln->add_seq($f);
 >
 > $gapless = $aln->remove_gaps();
 > foreach my $seq ($gapless->each_seq) {
 >       if ($seq->id eq 'a') {
 >               is $seq->start, 5;
 >               is $seq->end, 20;
 >               is $seq->seq, 'acgatcgatcgatg';
 >       }
 >       elsif ($seq->id eq 'f') {
 >               is $seq->start, 30;
 >               is $seq->end, 43;
 >               is $seq->seq, 'acgatcgatcgatg';
 >       }
 > }


But they don't. Once you remove certain columns the start and stop of 
the sequences in the alignment are no longer correct coordinates for the 
sub-sequence in the original sequence.
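
To make the expected numbers concrete, here is a minimal standalone sketch (plain Perl, no BioPerl) of the mapping the tests above assume. It is an illustration only, not the Bio::SimpleAlign implementation: the helper names gap_columns and strip_columns are hypothetical, coordinates are taken to be 1-based and inclusive, and only the outer start/end of each row are tracked.

```perl
use strict;
use warnings;

# Hypothetical helper: 0-based indices of every column that holds the gap
# character in at least one row.
sub gap_columns {
    my ($gap, @rows) = @_;
    my %cols;
    for my $row (@rows) {
        for my $i (0 .. length($row) - 1) {
            $cols{$i} = 1 if substr($row, $i, 1) eq $gap;
        }
    }
    my @sorted = sort { $a <=> $b } keys %cols;
    return @sorted;
}

# Hypothetical helper: drop the given columns from one aligned row and
# return (new row, new start, new end), where start/end are the original
# coordinates of the first and last residue that survive.
sub strip_columns {
    my ($row, $start, $gap, @remove) = @_;
    my %drop  = map { $_ => 1 } @remove;
    my $coord = $start - 1;            # coordinate of the last residue seen
    my ($kept, $new_start, $new_end) = ('');
    for my $i (0 .. length($row) - 1) {
        my $char = substr($row, $i, 1);
        $coord++ if $char ne $gap;     # gaps do not consume a coordinate
        next if $drop{$i};
        $kept .= $char;
        next if $char eq $gap;
        $new_start = $coord unless defined $new_start;
        $new_end   = $coord;
    }
    return ($kept, $new_start, $new_end);
}

# Rows a, b and c from the test above: [aligned string, start].
my %rows = (
    a => ['atcgatcgatcgatcg', 5],
    b => ['-tcgatc-atcgatcg', 30],
    c => ['atcgatcgatc-atc-', 50],
);
my @remove = gap_columns('-', map { $_->[0] } values %rows);
for my $id (sort keys %rows) {
    my ($seq, $start, $end) = strip_columns(@{ $rows{$id} }, '-', @remove);
    print "$id/$start-$end\t$seq\n";
}
```

For these three rows the sketch prints a/6-19, b/30-42 and c/51-63, each with the sequence tcgatcatcatc, which are the values the first block of tests expects.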

I propose the following patch to resolve this issue:

diff -r1.136 SimpleAlign.pm
1116c1116,1118
<
---
 >
 >     my $gap = $self->gap_char;
 >
1129,1137c1131,1147
<             my $spliced;
<             $spliced .= $start > 0 ? substr($sequence,0,$start) : '';
<             $spliced .= substr($sequence,$end+1,$seq->length-$end+1);
<             $sequence = $spliced;
<             if ($start == 1) {
<               $new_seq->start($end);
<             }
<             else {
<               $new_seq->start( $seq->start);
---
 >             my $orig = $sequence;
 >             my $head =  $start > 0 ? substr($sequence, 0, $start) : '';
 >             my $tail = ($end + 1) >= length($sequence) ? '' : substr($sequence, $end + 1);
 >             $sequence = $head.$tail;
 >             # start
 >             unless (defined $new_seq->start) {
 >                 if ($start == 0) {
 >                     my $start_adjust = () = substr($orig, 0, $end + 1) =~ /$gap/g;
 >                     $new_seq->start($seq->start + $end + 1 - $start_adjust);
 >                 }
 >                 else {
 >                     my $start_adjust = $orig =~ /$gap+/;
 >                     if ($start_adjust) {
 >                         $start_adjust = $+[0] - 1 < $start;
 >                     }
 >                     $new_seq->start($seq->start + $start_adjust);
 >                 }
1140,1141c1150,1152
<             if($end >= $seq->end){
<              $new_seq->end( $start);
---
 >             if (($end + 1) >= length($orig)) {
 >                 my $end_adjust = () = substr($orig, $start) =~ /$gap/g;
 >                 $new_seq->end($seq->end - (length($orig) - $start) + $end_adjust);
1144c1155
<              $new_seq->end($seq->end);
---
 >                 $new_seq->end($seq->end);
1148c1159
<                 push @new, $new_seq;
---
 >               push @new, $new_seq;
1207,1209c1218,1234
<       # sort the positions to remove columns at the end 1st
<       @$positions = sort { $b->[0] <=> $a->[0] } @$positions;
<       $aln = $self->_remove_col($aln,$positions);
---
 >       # sort the positions
 >       @$positions = sort { $a->[0] <=> $b->[0] } @$positions;
 >
 >     my @remove;
 >     my $length = 0;
 >     foreach my $pos (@{$positions}) {
 >         my ($start, $end) = @{$pos};
 >
 >         #have to offset the start and end for subsequent removes
 >         $start-=$length;
 >         $end  -=$length;
 >         $length += ($end-$start+1);
 >         push @remove, [$start,$end];
 >     }
 >
 >     #remove the segments
 >     $aln = $#remove >= 0 ? $self->_remove_col($aln,\@remove) : $self;
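
The offsetting step in the last hunk can be checked in isolation. The sketch below is a standalone illustration rather than the patched Bio::SimpleAlign code: offset_ranges and remove_ranges are hypothetical names, ranges are 0-based inclusive [start, end] pairs, and they are assumed not to overlap.

```perl
use strict;
use warnings;

# Sort the ranges left to right and shift each one by the number of
# columns already removed in front of it.
sub offset_ranges {
    my @sorted = sort { $a->[0] <=> $b->[0] } @_;
    my @shifted;
    my $removed = 0;
    for my $range (@sorted) {
        my ($start, $end) = @$range;
        push @shifted, [ $start - $removed, $end - $removed ];
        $removed += $end - $start + 1;
    }
    return @shifted;
}

# Remove the ranges from a string one after another, front to back,
# using the shifted coordinates.
sub remove_ranges {
    my ($sequence, @ranges) = @_;
    for my $range (offset_ranges(@ranges)) {
        my ($start, $end) = @$range;
        substr($sequence, $start, $end - $start + 1, '');
    }
    return $sequence;
}

# Columns 0, 7, 11 and 15 are the gap columns of the a/b/c example.
print remove_ranges('atcgatcgatcgatcg', [0, 0], [7, 7], [11, 11], [15, 15]), "\n";
```

With the four single-column ranges shown, the shifted ranges are 0-0, 6-6, 9-9 and 12-12, and the printed result is tcgatcatcatc. Removing from the end first, as the old code did, avoids the offsetting but makes it harder to track the start coordinate while columns are still being removed in front of it.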


This breaks 2 tests in SimpleAlign.t, but as far as I can tell, those 
tests expect the wrong answer. Changed to expect the correct answer, 
SimpleAlign.t and all other tests in the test suite pass.

diff -r1.56 SimpleAlign.t
214,215c214,215
<       "P84139/1-33              NEGEHQIKLDELFEKLLRARLIFKNKDVLRRC\n".
<       "P814153/1-33             NEGMHQIKLDVLFEKLLRARLIFKNKDVLRRC\n".
---
 >       "P84139/2-33              NEGEHQIKLDELFEKLLRARLIFKNKDVLRRC\n".
 >       "P814153/2-33             NEGMHQIKLDVLFEKLLRARLIFKNKDVLRRC\n".
229c229
<       "gb|443893|124775/1-32    -RFRIKVPPAVEGARPALLIFKSRPELGC\n",
---
 >       "gb|443893|124775/2-32    -RFRIKVPPAVEGARPALLIFKSRPELGC\n",


Can someone triple-check my thinking and report back please?

Cheers,
Sendu.


From basu at pharm.sunysb.edu  Tue Aug 14 11:02:06 2007
From: basu at pharm.sunysb.edu (Siddhartha Basu)
Date: Tue, 14 Aug 2007 11:02:06 -0400
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
Message-ID: <46C1C3EE.4030006@pharm.sunysb.edu>

neeti somaiya wrote:
> Hi Andrew,
> 
> I think the homologene data files have changed now on the ftp, from what you
> had used.
> It is now homologene.data and homologene.xml.
> I tried using your parser, but because it was written on the file
> hmlg.trip.ftp, it doesnt work anymore.
> 
> I came across a parser
> http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
> .
> I am looking at it to see if it works for me. NOt sure if it will.
> 
> ~Neeti.

Hi Neeti,
I recently wrote a parser for Homologene XML data, tailored to my own
needs. I am not sure whether it will suit yours, but it could be
extended for general-purpose parsing, so I am putting it forward.
Here is how it works:

* It parses only a single Homologene entry (<HG-Entry>.....</HG-Entry>).
* It does SAX-based parsing (currently using XML::SAX::ExpatXS).
* It returns a graph object (using Perl's Graph module) in which each
node is a homologue entry with its corresponding Entrez Gene ID. Each
node also carries the following attributes:
	* RefSeq protein ID
	* Protein ID (pid)
	* NCBI taxon ID
* The edge attribute records whether the relationship between two nodes
is orthologous (true/false).
* The remaining tags are not currently extracted, but parsing them
should not be very difficult.

Generally I get the Homologene XML stream from an 'efetch' through
Bio::DB::EUtilities, feed it to the parser, get back a 'Graph' object,
and then work on it.

So, to make it more generic and able to work on a local file:

* We need another class that reads each chunk between
<HG-Entry>.....</HG-Entry> and sends it to the parser.
* Add support for most of the tags.
* Massage the data into a BioPerl-compatible object.
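
The first item, reading one <HG-Entry> chunk at a time from a local file, could be sketched roughly as follows. This is plain Perl under stated assumptions, not the parser described above: each_hg_entry is a hypothetical name, the sample XML is invented for the demonstration, and the sketch assumes <HG-Entry> elements are not nested and carry no attributes.

```perl
use strict;
use warnings;

# Hypothetical helper: read a filehandle one </HG-Entry>-terminated record
# at a time and pass each complete <HG-Entry>...</HG-Entry> chunk to a
# callback. Returns the number of entries seen.
sub each_hg_entry {
    my ($fh, $callback) = @_;
    local $/ = '</HG-Entry>';          # record separator: the closing tag
    my $count = 0;
    while (my $chunk = <$fh>) {
        # Skip leading/trailing material that holds no complete entry.
        next unless $chunk =~ m{(<HG-Entry>.*</HG-Entry>)}s;
        my $entry = $1;
        $callback->($entry);
        $count++;
    }
    return $count;
}

# Invented sample data, read through an in-memory filehandle.
my $xml = <<'XML';
<HG-EntrySet>
  <HG-Entry><id>1</id></HG-Entry>
  <HG-Entry><id>2</id></HG-Entry>
</HG-EntrySet>
XML

open my $fh, '<', \$xml or die "Cannot open in-memory file: $!";
my @entries;
my $n = each_hg_entry($fh, sub { push @entries, shift });
close $fh;
print "$n entries\n";
```

In real use the callback would hand each chunk to the SAX parser instead of collecting it in an array. Reading by record separator keeps memory use bounded by the size of one entry, which matters for a file the size of homologene.xml.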

I could work out the first two myself; for the last one I still have
to figure out which BioPerl object would be suitable (such as
Bio::Cluster or Bio::Network::Node/Edge).

Let me know if it sounds interesting and I will send you the code.

-siddhartha


> 
> On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
>> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>>
>>> Hi,
>>>
>>> Does anyone know of any Homologene parser, if available?
>>> Please let me know.
>>>
>>> Thanks and Regards,
>>> Neeti.
>> Hi Neeti,
>>
>> Quite a long time ago now I wrote an Homologene parser and posted it
>> to the mailing list:
>>
>> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
>>
>> I don't know if this still works but you could use it as a starting
>> point. There may also be something newer out there too, I don't know.
>> If you search the mailing list archives you'll get a few messages
>> around the topic.
>>
>> Cheers, Andrew.
>>
>>
>> Andrew Macgregor
>> Centre for Comparative Genomics, Murdoch University
>> Email: amacgregor at ccg.murdoch.edu.au
>> Tel: (08) 9360 2961
>>
>>
>>
>>
> 
> 


From cjfields at uiuc.edu  Tue Aug 14 12:33:31 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 14 Aug 2007 11:33:31 -0500
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
	columns?
In-Reply-To: <46C1C2D9.6050409@sendu.me.uk>
References: <46C1C2D9.6050409@sendu.me.uk>
Message-ID: <B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>

Could you attach the scripts and patches to a bug report for tracking  
so anyone interested can double-check?  Having them in an email is  
problematic as the text in some clients wraps.

 From what I'm seeing I think we're in general agreement, though I'll  
reason through it to see if I'm following correctly.  The data in the  
SimpleAlign example you give is this:

a/5-20            atcgatcgatcgatcg
b/30-43           -tcgatc-atcgatcg
c/50-63           atcgatcgatc-atc-
                   ****** *** ***

Removing the gaps gives:

a/5-20            tcgatcatcatc
b/30-43           tcgatcatcatc
c/50-63           tcgatcatcatc
                  ************

The start/end is wrong, as you state.  Adjusting to map simple
start/ends to the original sequence won't work as we're removing gaps and  
residues in the LocatableSeqs along with it (ends and internal  
residues).  I guess if we want to map back to the original sequence  
accurately we would have to use split locations (not currently  
implemented with LocatableSeq) or maybe a cigar-like syntax against  
consensus (ugh), otherwise we wouldn't know where to map the relevant  
internal gaps (now missing from the alignment) w/o running a local  
alignment against the original sequence:

a/6-11;13-15;17-19   tcgatcatcatc
b/30-38;40-42        tcgatcatcatc
c/51-56;58-63        tcgatcatcatc
                     ************

That could get really hairy for long alignments.  We could also  
return multiple SimpleAligns which map correctly (ugh), but what we  
really want (and the API specifies) is a new single SimpleAlign.
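
As a rough illustration of what split locations would involve, the sketch below derives the surviving coordinate segments for one aligned row. It is plain Perl and purely hypothetical (LocatableSeq has no such feature, as noted above): kept_ranges is an invented name, coordinates are 1-based inclusive, and removed columns are given as 0-based indices.

```perl
use strict;
use warnings;

# Hypothetical helper: return the original-sequence segments that survive
# after the given alignment columns are removed, as "s1-e1;s2-e2;...".
sub kept_ranges {
    my ($row, $start, $gap, @remove) = @_;
    my %drop  = map { $_ => 1 } @remove;
    my $coord = $start - 1;            # coordinate of the last residue seen
    my @ranges;
    for my $i (0 .. length($row) - 1) {
        next if substr($row, $i, 1) eq $gap;   # gaps carry no coordinate
        $coord++;
        next if $drop{$i};                     # residue removed with its column
        if (@ranges && $ranges[-1][1] == $coord - 1) {
            $ranges[-1][1] = $coord;           # extend the current segment
        }
        else {
            push @ranges, [ $coord, $coord ];  # start a new segment
        }
    }
    return join ';', map { "$_->[0]-$_->[1]" } @ranges;
}

# Gap columns of the a/b/c example: 0, 7, 11 and 15.
my @remove = (0, 7, 11, 15);
print 'a/', kept_ranges('atcgatcgatcgatcg', 5,  '-', @remove), "\n";
print 'b/', kept_ranges('-tcgatc-atcgatcg', 30, '-', @remove), "\n";
print 'c/', kept_ranges('atcgatcgatc-atc-', 50, '-', @remove), "\n";
```

For the example this gives a/6-11;13-15;17-19, b/30-38;40-42 and c/51-56;58-63. Sequence a ends up in three segments because its residues at original positions 12 and 16 sit in removed columns; the number of segments per row grows with the number of removed column runs, which is why this gets unwieldy for long alignments.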

It may come down to simply stating that it 'voids the warranty' (so to
speak) when modifications are made to alignments which remove/insert
residues from LocatableSeqs via remove_gaps/remove_columns or
similar, and then either leaving things as is with relevant warnings or
readjusting start/end appropriately when LocatableSeq residues change.

gapless_a/1-12    tcgatcatcatc
gapless_b/1-12    tcgatcatcatc
gapless_c/1-12    tcgatcatcatc
                  ************

Not sure which is the best approach but anything would be better than  
giving an unexpectedly incorrect answer.

chris

On Aug 14, 2007, at 9:57 AM, Sendu Bala wrote:

> I'm looking at what looks like a pretty major bug in Bio::SimpleAlign,
> but before I commit the fix I wanted to check my sanity/understanding.
>
> My understanding is that an alignment may be built from just sub-parts
> of a number of sequences. So you give each sequence in the alignment a
> start and stop so you can later map back the aligned region to the
> original sequence. So, for example, the following should all pass:
>
> diff -r1.56 SimpleAlign.t
> 459a460,540
>>
>>
>> # is _remove_col really working correctly?
>> my $a = Bio::LocatableSeq->new(-id => 'a', -seq =>
> 'atcgatcgatcgatcg', -start => 5, -end => 20);
>> my $b = Bio::LocatableSeq->new(-id => 'b', -seq =>
> '-tcgatc-atcgatcg', -start => 30, -end => 43);
>> my $c = Bio::LocatableSeq->new(-id => 'c', -seq =>
> 'atcgatcgatc-atc-', -start => 50, -end => 63);
>> my $d = Bio::LocatableSeq->new(-id => 'd', -seq =>
> '--cgatcgatcgat--', -start => 80, -end => 91);
>> my $e = Bio::LocatableSeq->new(-id => 'e', -seq =>
> '-t-gatcgatcga-c-', -start => 100, -end => 111);
>> $aln = Bio::SimpleAlign->new();
>> $aln->add_seq($a);
>> $aln->add_seq($b);
>> $aln->add_seq($c);
>>
>> my $gapless = $aln->remove_gaps();
>> foreach my $seq ($gapless->each_seq) {
>>       if ($seq->id eq 'a') {
>>               is $seq->start, 6;
>>               is $seq->end, 19;
>>               is $seq->seq, 'tcgatcatcatc';
>>       }
>>       elsif ($seq->id eq 'b') {
>>               is $seq->start, 30;
>>               is $seq->end, 42;
>>               is $seq->seq, 'tcgatcatcatc';
>>       }
>>       elsif ($seq->id eq 'c') {
>>               is $seq->start, 51;
>>               is $seq->end, 63;
>>               is $seq->seq, 'tcgatcatcatc';
>>       }
>> }
>>
>> $aln->add_seq($d);
>> $aln->add_seq($e);
>> $gapless = $aln->remove_gaps();
>> foreach my $seq ($gapless->each_seq) {
>>       if ($seq->id eq 'a') {
>>               is $seq->start, 8;
>>               is $seq->end, 17;
>>               is $seq->seq, 'gatcatca';
>>       }
>>       elsif ($seq->id eq 'b') {
>>               is $seq->start, 32;
>>               is $seq->end, 40;
>>               is $seq->seq, 'gatcatca';
>>       }
>>       elsif ($seq->id eq 'c') {
>>               is $seq->start, 53;
>>               is $seq->end, 61;
>>               is $seq->seq, 'gatcatca';
>>       }
>>       elsif ($seq->id eq 'd') {
>>               is $seq->start, 81;
>>               is $seq->end, 90;
>>               is $seq->seq, 'gatcatca';
>>       }
>>       elsif ($seq->id eq 'e') {
>>               is $seq->start, 101;
>>               is $seq->end, 110;
>>               is $seq->seq, 'gatcatca';
>>       }
>> }
>>
>> my $f = Bio::LocatableSeq->new(-id => 'f', -seq =>
> 'a-cgatcgatcgat-g', -start => 30, -end => 43);
>> $aln = Bio::SimpleAlign->new();
>> $aln->add_seq($a);
>> $aln->add_seq($f);
>>
>> $gapless = $aln->remove_gaps();
>> foreach my $seq ($gapless->each_seq) {
>>       if ($seq->id eq 'a') {
>>               is $seq->start, 5;
>>               is $seq->end, 20;
>>               is $seq->seq, 'acgatcgatcgatg';
>>       }
>>       elsif ($seq->id eq 'f') {
>>               is $seq->start, 30;
>>               is $seq->end, 43;
>>               is $seq->seq, 'acgatcgatcgatg';
>>       }
>> }
>
>
> But they don't. Once you remove certain columns the start and stop of
> the sequences in the alignment are no longer correct coordinates  
> for the
> sub-sequence in the original sequence.
>
> I propose the following patch to resolve this issue:
>
> diff -r1.136 SimpleAlign.pm
> 1116c1116,1118
> <
> ---
>>
>>     my $gap = $self->gap_char;
>>
> 1129,1137c1131,1147
> <             my $spliced;
> <             $spliced .= $start > 0 ? substr($sequence,0,$start) :  
> '';
> <             $spliced .= substr($sequence,$end+1,$seq->length-$end 
> +1);
> <             $sequence = $spliced;
> <             if ($start == 1) {
> <               $new_seq->start($end);
> <             }
> <             else {
> <               $new_seq->start( $seq->start);
> ---
>>             my $orig = $sequence;
>>             my $head =  $start > 0 ? substr($sequence, 0,  
>> $start) : '';
>>             my $tail = ($end + 1) >= length($sequence) ? '' :
> substr($sequence, $end + 1);
>>             $sequence = $head.$tail;
>>             # start
>>             unless (defined $new_seq->start) {
>>                 if ($start == 0) {
>>                     my $start_adjust = () = substr($orig, 0, $end +
> 1) =~ /$gap/g;
>>                     $new_seq->start($seq->start + $end + 1 -
> $start_adjust);
>>                 }
>>                 else {
>>                     my $start_adjust = $orig =~ /$gap+/;
>>                     if ($start_adjust) {
>>                         $start_adjust = $+[0] - 1 < $start;
>>                     }
>>                     $new_seq->start($seq->start + $start_adjust);
>>                 }
> 1140,1141c1150,1152
> <             if($end >= $seq->end){
> <              $new_seq->end( $start);
> ---
>>             if (($end + 1) >= length($orig)) {
>>                 my $end_adjust = () = substr($orig, $start) =~ / 
>> $gap/g;
>>                 $new_seq->end($seq->end - (length($orig) - $start) +
> $end_adjust);
> 1144c1155
> <              $new_seq->end($seq->end);
> ---
>>                 $new_seq->end($seq->end);
> 1148c1159
> <                 push @new, $new_seq;
> ---
>>               push @new, $new_seq;
> 1207,1209c1218,1234
> <       # sort the positions to remove columns at the end 1st
> <       @$positions = sort { $b->[0] <=> $a->[0] } @$positions;
> <       $aln = $self->_remove_col($aln,$positions);
> ---
>>       # sort the positions
>>       @$positions = sort { $a->[0] <=> $b->[0] } @$positions;
>>
>>     my @remove;
>>     my $length = 0;
>>     foreach my $pos (@{$positions}) {
>>         my ($start, $end) = @{$pos};
>>
>>         #have to offset the start and end for subsequent removes
>>         $start-=$length;
>>         $end  -=$length;
>>         $length += ($end-$start+1);
>>         push @remove, [$start,$end];
>>     }
>>
>>     #remove the segments
>>     $aln = $#remove >= 0 ? $self->_remove_col($aln,\@remove) : $self;
>
>
> This breaks 2 tests in SimpleAlign.t, but as far as I can tell, those
> tests expect the wrong answer. Changed to expect the correct answer,
> SimpleAlign.t and all other tests in the test suite pass.
>
> diff -r1.56 SimpleAlign.t
> 214,215c214,215
> <       "P84139/1-33              NEGEHQIKLDELFEKLLRARLIFKNKDVLRRC\n".
> <       "P814153/1-33             NEGMHQIKLDVLFEKLLRARLIFKNKDVLRRC\n".
> ---
>>       "P84139/2-33              NEGEHQIKLDELFEKLLRARLIFKNKDVLRRC\n".
>>       "P814153/2-33             NEGMHQIKLDVLFEKLLRARLIFKNKDVLRRC\n".
> 229c229
> <       "gb|443893|124775/1-32    -RFRIKVPPAVEGARPALLIFKSRPELGC\n",
> ---
>>       "gb|443893|124775/2-32    -RFRIKVPPAVEGARPALLIFKSRPELGC\n",
>
>
> Can someone triple-check my thinking and report back please?
>
> Cheers,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Aug 14 13:13:30 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 14 Aug 2007 18:13:30 +0100
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
 columns?
In-Reply-To: <B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>
References: <46C1C2D9.6050409@sendu.me.uk>
	<B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>
Message-ID: <46C1E2BA.8060606@sendu.me.uk>

Chris Fields wrote:
> Could you attach the scripts and patches to a bug report for tracking
> so anyone interested can double-check?  Having them in an email is 
> problematic as the text in some clients wraps.

http://bugzilla.open-bio.org/show_bug.cgi?id=2344


> From what I'm seeing I think we're in general agreement, though I'll
>  reason through it to see if I'm following correctly.  The data in
> the SimpleAlign example you give is this:
> 
> a/5-20            atcgatcgatcgatcg
> b/30-43           -tcgatc-atcgatcg
> c/50-63           atcgatcgatc-atc-
>                    ****** *** ***
> 
> Removing the gaps gives:
> 
> a/5-20            tcgatcatcatc
> b/30-43           tcgatcatcatc
> c/50-63           tcgatcatcatc
>                   ************
> 
> The start/end is wrong, as you state.

Yes. For extra clarity, my thinking is that the correct answer is:

a/6-19            tcgatcatcatc
b/30-42           tcgatcatcatc
c/51-63           tcgatcatcatc
                  ************


> Adjusting to map simple start/ends to the original sequence won't
> work as we're removing gaps and residues in the LocatableSeqs along
> with it (ends and internal residues).  I guess if we want to map back
> to the original sequence accurately [snip]

What you say in the rest of your discussion is valid and deserves some 
thought/discussion, but for now just getting the start and end correct, 
ignoring any issues with internal residues, seems like a no-brainer.

For my own purposes that is all I need; having removed gaps I only need 
the start and end so I can take that region from each sequence and do a 
new alignment (for example).
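
A minimal sketch of that use case, assuming 1-based inclusive coordinates and that the full original sequence is available as a plain string. The name extract_region and the padded example sequence are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: cut the region start..end (1-based, inclusive)
# out of the full original sequence.
sub extract_region {
    my ($original, $start, $end) = @_;
    die "bad coordinates $start-$end\n"
        if $start < 1 || $end < $start || $end > length($original);
    return substr($original, $start - 1, $end - $start + 1);
}

# Invented original for sequence a: the aligned part occupies 5-20.
my $original_a = 'nnnn' . 'atcgatcgatcgatcg' . 'nn';
print extract_region($original_a, 6, 19), "\n";
```

This prints tcgatcgatcgatc: the region 6-19 of the original, which includes the residues at positions 12 and 16 that were dropped from the gapless alignment. That is the intended behaviour when the region is going to be realigned anyway.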


BTW. Either my patch isn't quite perfect or there's another related bug 
I'm still tracking down. I'll commit when I've solved that, unless 
someone points out any mistakes in my thinking.


From cjfields at uiuc.edu  Tue Aug 14 13:19:59 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 14 Aug 2007 12:19:59 -0500
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
	columns?
In-Reply-To: <46C1E2BA.8060606@sendu.me.uk>
References: <46C1C2D9.6050409@sendu.me.uk>
	<B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>
	<46C1E2BA.8060606@sendu.me.uk>
Message-ID: <EE515FDC-2223-4D03-B819-3EA909539A61@uiuc.edu>


On Aug 14, 2007, at 12:13 PM, Sendu Bala wrote:
...

>
> Yes. For extra clarity, my thinking is that the correct answer is:
>
> a/6-19            tcgatcatcatc
> b/30-42           tcgatcatcatc
> c/51-63           tcgatcatcatc
>  ...
> What you say in the rest of your discussion is valid and deserves  
> some thought/discussion, but for now just getting the start and end  
> correct, ignoring any issues with internal residues, seems like a  
> no-brainer.
>
> For my own purposes that is all I need; having removed gaps I only  
> need the start and end so I can take that region from each sequence  
> and do a new alignment (for example).

It might be worth addressing the split location issue in the bug  
report before it gets lost in the ether.  Or maybe start a new one as  
an enhancement request.

> BTW. Either my patch isn't quite perfect or there's another related  
> bug I'm still tracking down. I'll commit when I've solved that,  
> unless someone points out any mistakes in my thinking.

Sounds fine by me.

chris


From gyang at plantbio.uga.edu  Tue Aug 14 15:01:07 2007
From: gyang at plantbio.uga.edu (Guojun Yang)
Date: Tue, 14 Aug 2007 15:01:07 -0400
Subject: [Bioperl-l] the most weird thing  I've seen, help please
In-Reply-To: 41A08079-6EEC-4B62-8104-C41E70C03083@uiuc.edu
Message-ID: <20070814190107.4834b14b@dogwood.plantbio.uga.edu>

Hi all,

I have two subroutines in my code: one ran a remote BLAST and the other
a local BLAST, and both worked well. Since I changed the remote BLAST
subroutine to run a local BLAST, it always fails with the error "program
name was not given an argument", although I clearly pass a program name.
I downloaded the preformatted nt database from NCBI, and the searches
work fine when I run blastall -p blastn.... from the command line. I
tried changing the database name to 'nt' and 'nt.00', but the same error
message was returned. Can anybody help me? The code for the two
subroutines is very similar:
   
sub search {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  
my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"query");
my $len=$query->length();
@db=('nt.nal');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new( -program =>"blastn",
                                                         -database =>"$db",
                                                         -e =>"$_[1]");
    my $rc = $factory->blastall($query);  
......  
   
   
sub ancestor {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  
my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"test");
my $len=$query->length();
my $long_name=$_[1];
my $long_start=$_[2];
my $long_end=$_[3];
@db=('TNDB');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
                                                        -database => "$db",
                                                        -e => 1e-3,
                                                        );
    my $blast_report = $factory->blastall($query);

  
Thanks a lot!  
Guojun Yang  
Department of Plant Biology  
University of Georgia


From zhaodj at ioz.ac.cn  Wed Aug 15 04:05:36 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Wed, 15 Aug 2007 16:05:36 +0800 (CST)
Subject: [Bioperl-l] the most weird thing  I've seen, help please
In-Reply-To: <20070814190107.4834b14b@dogwood.plantbio.uga.edu>
References: <20070814190107.4834b14b@dogwood.plantbio.uga.edu>
Message-ID: <52820.159.226.67.49.1187165136.squirrel@mail.ioz.ac.cn>

Hi Guojun Yang,

I tested your code after modifying part of it, but I did not
encounter the error. The modified code follows (see below and the
attachment). It runs without any error on my Windows XP machine and
generates a file named lclblastResult.txt.

In the code I use the NCBI ecoli.nt database instead. I changed some
parameters, but that does not affect what the code does.

I think the error may come from another part of your code; more
details are needed.

-------code starts-------
#sub search {
use Bio::Tools::Run::StandAloneBlast;
use Bio::SearchIO::blast;

#my $query = Bio::Seq -> new ( -seq=>"$_[0]",
#                              -id=>"query");
my $query=Bio::Seq->new(-seq=>"ctgtattctgggatgca");
my $len=$query->length();

#@db=('nt.nal');
#foreach my $db (@db) {
my $factory = Bio::Tools::Run::StandAloneBlast->new( -program  => "blastn",
                                                     -database => 'D:/blast/bin/ecoli.nt',
                                                     -e        => 1,
                                                     -o        => 'lclblastResult.txt');
my $rc = $factory->blastall($query);
-----code ends--------



-- 
De-Jian Zhao
Institute of Zoology,Chinese Academy of Sciences
+86-10-64807217
zhaodj at ioz.ac.cn


-------------- next part --------------
A non-text attachment was scrubbed...
Name: lclblast.pl
Type: application/octet-stream
Size: 644 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070815/f40b2950/attachment-0003.obj>

From tania.oh at brasenose.oxford.ac.uk  Wed Aug 15 12:05:15 2007
From: tania.oh at brasenose.oxford.ac.uk (Tania Oh)
Date: Wed, 15 Aug 2007 17:05:15 +0100
Subject: [Bioperl-l] exonerate parser in bioperl-live fails when protein2dna
	comparison is performed
Message-ID: <AA5E6FAF-A635-4F6C-99CF-82F6589C677B@bnc.ox.ac.uk>

Dear All,

I was trying to use the Bio::SearchIO::Alignment::Exonerate module to
run and parse my exonerate output. But I've noticed that the parser,
which is actually Bio::SearchIO::exonerate, works if the model used in
Exonerate is --model est2genome. When I ran exonerate with
--model protein2dna, the parser was unable to parse the HSPs.


Below is the simple piece of code I used for testing the output from exonerate:

use Bio::SearchIO;
use strict;

my $searchio = Bio::SearchIO->new(-file   => 'test_data/exonerate.output.dontwork',
                                  -format => 'exonerate');

   while( my $r = $searchio->next_result ) {
           while(my $hit = $r->next_hit){
                   while(my $hsp = $hit->next_hsp){
                           print $hsp->start. "\t". $hsp->end. "\n";
                   }
           }

     print $r->query_name, "\n";
   }

-------------- next part --------------
A non-text attachment was scrubbed...
Name: exonerate.output.works
Type: application/octet-stream
Size: 6056 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070815/e4e43d75/attachment-0006.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: exonerate.output.dontwork
Type: application/octet-stream
Size: 3283 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070815/e4e43d75/attachment-0007.obj>


There are 2 files attached to show the examples of using either the  
est2genome or protein2dna model:
1. exonerate.output.works  - produced from the command line:
exonerate -q exonerate_cdna.fa -t exonerate_genomic.fa --model  
est2genome --bestn 1 > exonerate.output.works

2. exonerate.output.dontwork - produced from the command line:
exonerate -q test_aa.fa -t test_cds.fa --model protein2dna >  
exonerate.output.dontwork


Line 239 in Bio::SearchIO::exonerate (cut and pasted below)

elsif(  s/^vulgar:\s+(\S+)\s+         # query sequence id
                  (\d+)\s+(\d+)\s+([\-\+])\s+   # query start-end-strand
                  (\S+)\s+                      # target sequence id
                  (\d+)\s+(\d+)\s+([\-\+])\s+   # target start-end-strand
                  (\d+)\s+                      # score
                  //ox ) {

parses the vulgar line of an --model est2genome exonerate output  
well. An example of the (complex) vulgar line which I've truncated  
for readability is:
vulgar: MUSSPSYN 3 1279 + 4.143962167-143965267 28 3074 + 6137 M 8 8  
G 0 1 M 231 231 5 0 2 I 0 253 3 0

whereas the vulgar line I've obtained from a --model protein2dna  
exonerate output is much simpler and the parser fails to pick it up:
vulgar: SJCHGC00851 0 204 . SJCHGC00851 2 614 + 1059 M 204 612
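
Comparing the two vulgar lines against the quoted pattern, the visible
difference is the query strand: the protein2dna line has "." where the
pattern only accepts "+" or "-". The sketch below is plain Perl, not
the BioPerl parser itself. It runs the quoted header pattern and a
relaxed variant against both lines. The relaxed pattern and the
parse_header helper are assumptions made for illustration; whether the
rest of Bio::SearchIO::exonerate copes with a "." strand is not
checked here.

```perl
use strict;
use warnings;

# The two vulgar lines quoted in this message (the est2genome one is
# truncated, as in the post).
my %vulgar = (
    est2genome  => 'vulgar: MUSSPSYN 3 1279 + 4.143962167-143965267 28 3074 + 6137 M 8 8 G 0 1 M 231 231 5 0 2 I 0 253 3 0',
    protein2dna => 'vulgar: SJCHGC00851 0 204 . SJCHGC00851 2 614 + 1059 M 204 612',
);

# Header pattern as quoted from the parser: strand must be + or -.
my $strict = qr/^vulgar:\s+(\S+)\s+
                (\d+)\s+(\d+)\s+([\-\+])\s+
                (\S+)\s+
                (\d+)\s+(\d+)\s+([\-\+])\s+
                (\d+)\s+/x;

# Hypothetical relaxed pattern: also accept "." as a strand.
my $relaxed = qr/^vulgar:\s+(\S+)\s+
                 (\d+)\s+(\d+)\s+([\-\+.])\s+
                 (\S+)\s+
                 (\d+)\s+(\d+)\s+([\-\+.])\s+
                 (\d+)\s+/x;

# Return a hash ref of the header fields, or nothing if there is no match.
sub parse_header {
    my ($line, $re) = @_;
    my @f = $line =~ $re or return;
    my %h;
    @h{qw(q_id q_start q_end q_strand t_id t_start t_end t_strand score)} = @f;
    return \%h;
}

for my $model (sort keys %vulgar) {
    for my $pair ([strict => $strict], [relaxed => $relaxed]) {
        my ($name, $re) = @$pair;
        my $h = parse_header($vulgar{$model}, $re);
        my $result = $h
            ? sprintf('query %d-%d strand %s', @{$h}{qw(q_start q_end q_strand)})
            : 'no match';
        printf "%-12s %-8s %s\n", $model, $name, $result;
    }
}
```

With the strict pattern only the est2genome line matches; with the
relaxed one both do. A real patch would also need to decide what a "."
strand should mean for the resulting HSP object.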

Has anyone encountered this situation before? I've not changed the
parser, as exonerate is widely used for its est2genome model, and
thought I'd run it past the list to see if there is a workaround.

many thanks in advance,
tania


From johnsonmar at mail.nih.gov  Wed Aug 15 12:47:10 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 12:47:10 -0400
Subject: [Bioperl-l] Need assistance with make error
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>

I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
Enterprise Linux 4, and the other running RHEL3.  I'm getting the
following 'make Error 255' when running make test.  I'm not sure what
this error indicates, and whether I should continue with a force
install?  Could you please advise.

 
Failed Test        Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/BioFetch_DB.t                  27    1   3.70%  8
t/EMBL_DB.t                      15    3  20.00%  6 13-14
t/Ontology.t          9  2304    50  100 200.00%  1-50
t/TreeIO.t                       41    1   2.44%  42
t/Variation_IO.t                 25    3  12.00%  15 20 25
t/simpleGOparser.t    9  2304    98  196 200.00%  1-98
120 subtests skipped.
Failed 6/179 test scripts, 96.65% okay. 154/8268 subtests failed, 98.14% okay.

make: *** [test_dynamic] Error 255
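
A note on reading the summary above: the Wstat column is the raw wait
status of a test script, in the same encoding as Perl's $? variable,
and Stat is the exit code extracted from it. A minimal sketch of the
decoding follows; decode_wstat is a hypothetical helper written for
this note, not part of Test::Harness.

```perl
use strict;
use warnings;

# Split a wait status (same encoding as Perl's $?) into its parts.
sub decode_wstat {
    my ($wstat) = @_;
    return {
        exit_code => $wstat >> 8,              # high byte: value passed to exit()
        signal    => $wstat & 127,             # low 7 bits: terminating signal, if any
        core_dump => ($wstat & 128) ? 1 : 0,   # bit 7: core dump flag
    };
}

my $s = decode_wstat(2304);
printf "exit=%d signal=%d core=%d\n", @{$s}{qw(exit_code signal core_dump)};
# prints "exit=9 signal=0 core=0"
```

So a Wstat of 2304 corresponds to the Stat value of 9 shown for
t/Ontology.t and t/simpleGOparser.t: those two scripts exited with a
non-zero status rather than being killed by a signal.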

 
Thanks,

 
Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 
From arareko at campus.iztacala.unam.mx  Wed Aug 15 13:45:39 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 15 Aug 2007 12:45:39 -0500
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
Message-ID: <46C33BC3.9000409@campus.iztacala.unam.mx>

Which version of BioPerl are you trying to install?

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From mbasu at mail.nih.gov  Wed Aug 15 13:55:50 2007
From: mbasu at mail.nih.gov (Malay)
Date: Wed, 15 Aug 2007 13:55:50 -0400
Subject: [Bioperl-l] Developer docs
Message-ID: <46C33E26.2050004@mail.nih.gov>

Hello All:

I apologize for not searching thoroughly, but I'd appreciate it if
someone could point me to a location where I can find any BioPerl coding
conventions that I need to follow for code contributions to BioPerl.

-Malay

-- 
Malay K Basu
www.malaybasu.net


From arareko at campus.iztacala.unam.mx  Wed Aug 15 14:39:29 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 15 Aug 2007 13:39:29 -0500
Subject: [Bioperl-l] Developer docs
In-Reply-To: <46C33E26.2050004@mail.nih.gov>
References: <46C33E26.2050004@mail.nih.gov>
Message-ID: <46C34861.8090400@campus.iztacala.unam.mx>

You may want to bookmark this one:

http://bioperl.org/wiki/Developer_Information#BioPerl_Code

Mauricio.

Malay wrote:
> Hello All:
> 
> I apologize for not searching throughly. But I'd appreciate if someone 
> point to a location where I can find any bioperl coding convention that 
> I need follow for any code contribution to Bioperl.
> 
> -Malay
> 

-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From johnsonmar at mail.nih.gov  Wed Aug 15 15:01:23 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 15:01:23 -0400
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <46C33BC3.9000409@campus.iztacala.unam.mx>
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805713@NIHCESMLBX11.nih.gov>

This is version 1.4.

Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 


From cjfields at uiuc.edu  Wed Aug 15 16:25:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 15 Aug 2007 15:25:30 -0500
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805713@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805713@NIHCESMLBX11.nih.gov>
Message-ID: <DA0EFC65-4A35-48FA-9280-447654BAFF7F@uiuc.edu>

You'll definitely want to update to the latest (v 1.5.2).  We hope to  
get a new stable release out sometime soon and possibly move to a  
more regular release cycle.

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnsonmar at mail.nih.gov  Wed Aug 15 16:32:43 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 16:32:43 -0400
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <DA0EFC65-4A35-48FA-9280-447654BAFF7F@uiuc.edu>
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>

I saw the 1.5.2 version, but it stated that this was a developer release and that 1.4 was the latest stable version, so I went with 1.4.  I'll give 1.5.2 a try.

Thanks,


Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 


From cjfields at uiuc.edu  Wed Aug 15 16:40:32 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 15 Aug 2007 15:40:32 -0500
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
Message-ID: <E16950D3-9F60-4862-9325-57CA26107649@uiuc.edu>

The term 'stable' is relative in this case; tons of bug fixes were
incorporated in the 1.5.2 release.  There are a few dev-specific
issues we'll need to resolve prior to a new release; once those are
out of the way we'll try to get a new 'stable' out.

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Kevin.M.Brown at asu.edu  Wed Aug 15 16:54:04 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 15 Aug 2007 13:54:04 -0700
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
References: <DA0EFC65-4A35-48FA-9280-447654BAFF7F@uiuc.edu>
	<EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
Message-ID: <1A4207F8295607498283FE9E93B775B40386D612@EX02.asurite.ad.asu.edu>

It technically is a developer release, but given the age of the 1.4 release it is the better choice: it includes fixes for things like doing webblasts, along with other improvements. I've found it reliable in the results that come out of the various objects I've had to use in my current projects.

> I saw the 1.5.2 version, but it stated that this was a 
> developer release and that 1.4 was the latest stable version, 
> so I went with 1.4.  I'll give 1.5.2 a try.
> 


From bix at sendu.me.uk  Wed Aug 15 16:50:02 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 15 Aug 2007 21:50:02 +0100
Subject: [Bioperl-l] Developer docs
In-Reply-To: <46C34861.8090400@campus.iztacala.unam.mx>
References: <46C33E26.2050004@mail.nih.gov>
	<46C34861.8090400@campus.iztacala.unam.mx>
Message-ID: <46C366FA.40609@sendu.me.uk>

Mauricio Herrera Cuadra wrote:
> You may want to bookmark this one:
> 
> http://bioperl.org/wiki/Developer_Information#BioPerl_Code

Yup. The important one is http://bioperl.org/wiki/Bioperl_Best_Practices 
, which I've just updated with the latest info on writing test scripts.


From bix at sendu.me.uk  Wed Aug 15 16:54:45 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 15 Aug 2007 21:54:45 +0100
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
Message-ID: <46C36815.5010908@sendu.me.uk>

Johnson, Mary (NIH/NCI) [C] wrote:
> I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
> Enterprise Linux 4, and the other running RHEL3.  I'm getting the
> following 'make Error 255' when running make test.  I'm not sure what
> this error indicates, and whether I should continue with a force
> install?  Could you please advise.

Unless you know you really must install Bioperl 1.4, install 1.5.2 instead.

http://www.bioperl.org/wiki/Release_1.5.2

If you use the Build.PL installation, at the very least you certainly 
won't get a make error.

http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#PRELIMINARY_PREPARATION


From cjfields at uiuc.edu  Wed Aug 15 17:16:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 15 Aug 2007 16:16:27 -0500
Subject: [Bioperl-l] exonerate parser in bioperl-live fails when
	protein2dna comparison is performed
In-Reply-To: <AA5E6FAF-A635-4F6C-99CF-82F6589C677B@bnc.ox.ac.uk>
References: <AA5E6FAF-A635-4F6C-99CF-82F6589C677B@bnc.ox.ac.uk>
Message-ID: <F853DDF2-3165-4F88-A087-744D60682104@uiuc.edu>

I can confirm this with bioperl-live.  The Bio::SearchIO::exonerate docs
indicate that protein2genome and est2genome model output is supported, but
don't specifically say that it can parse any other output.
You can add an enhancement request to Bugzilla noting this
deficiency or, if you are inclined, add the functionality yourself
and donate the code.
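
For anyone who wants to try, here is a minimal sketch of the matching step
only, not a patch to the module. Comparing the two vulgar lines quoted
below, the protein2dna line reports the query strand as '.', which the
[\-\+] class in the existing pattern cannot match, so that looks like the
likely reason the HSPs are never picked up. The sub name parse_vulgar_head
and the hash keys are made up for illustration, and the rest of the parser
may well need further changes.

```perl
use strict;
use warnings;

# Parse the fixed leading fields of an exonerate "vulgar:" line.
# Assumption: the relevant difference is that a protein query reports
# its strand as '.', so both strand classes accept '.' here.
sub parse_vulgar_head {
    my ($line) = @_;
    return unless $line =~ /^vulgar:\s+(\S+)\s+    # query sequence id
                 (\d+)\s+(\d+)\s+([\-\+\.])\s+     # query start, end, strand
                 (\S+)\s+                          # target sequence id
                 (\d+)\s+(\d+)\s+([\-\+\.])\s+     # target start, end, strand
                 (\d+)\s*                          # raw score
                 (.*)$/x;
    return {
        query  => $1, q_start => $2, q_end => $3, q_strand => $4,
        target => $5, t_start => $6, t_end => $7, t_strand => $8,
        score  => $9,
        rest   => $10,    # remaining label, query length, target length triplets
    };
}

# The two example lines from this thread (the first one truncated further).
for my $line (
    'vulgar: MUSSPSYN 3 1279 + 4.143962167-143965267 28 3074 + 6137 M 8 8 G 0 1',
    'vulgar: SJCHGC00851 0 204 . SJCHGC00851 2 614 + 1059 M 204 612',
) {
    my $v = parse_vulgar_head($line) or next;
    print join("\t", @{$v}{qw(query q_start q_end q_strand
                              target t_start t_end t_strand score)}), "\n";
}
```

Both lines get through this version; how a '.' strand should then be
stored on the HSP object is a separate decision.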

chris

On Aug 15, 2007, at 11:05 AM, Tania Oh wrote:

> Dear All,
>
> I was trying to use the Bio::SearchIO::Alignment::Exonerate module  
> to run and parse my exonerate output. But I've noticed that the  
> parser which is actually Bio::SearchIO::Exonerate works if the  
> model used in Exonerate is --model est2genome. I used exonerate  
> with the model --model protein2dna and the parser was unable to  
> parse the hsps.
>
>
> Below is a sample of the code I used for testing the output from
> exonerate:
>
> use Bio::SearchIO;
> use strict;
>
> my $searchio = Bio::SearchIO->new(-file => 'test_data/exonerate.output.dontwork',
>                                    -format => 'exonerate');
>
>   while( my $r = $searchio->next_result ) {
>           while(my $hit = $r->next_hit){
>                   while(my $hsp = $hit->next_hsp){
>                           print $hsp->start. "\t". $hsp->end. "\n";
>                   }
>           }
>
>     print $r->query_name, "\n";
>   }
>
>
> There are 2 files attached to show the examples of using either the  
> est2genome or protein2dna model:
> 1. exonerate.output.works  - produced from the command line:
> exonerate -q exonerate_cdna.fa -t exonerate_genomic.fa --model  
> est2genome --bestn 1 > exonerate.output.works
>
> 2. exonerate.output.dontwork - produced from the command line:
> exonerate -q test_aa.fa -t test_cds.fa --model protein2dna >  
> exonerate.output.dontwork
>
>
> Line 239 in Bio::SearchIO::exonerate (cut and pasted below)
>
> elsif(  s/^vulgar:\s+(\S+)\s+         # query sequence id
>                  (\d+)\s+(\d+)\s+([\-\+])\s+   # query start-end-strand
>                  (\S+)\s+                      # target sequence id
>                  (\d+)\s+(\d+)\s+([\-\+])\s+   # target start-end-strand
>                  (\d+)\s+                      # score
>                  //ox ) {
>
> parses the vulgar line of an --model est2genome exonerate output  
> well. An example of the (complex) vulgar line which I've truncated  
> for readability is:
> vulgar: MUSSPSYN 3 1279 + 4.143962167-143965267 28 3074 + 6137 M 8  
> 8 G 0 1 M 231 231 5 0 2 I 0 253 3 0
>
> whereas the vulgar line I've obtained from a --model protein2dna  
> exonerate output is much simpler and the parser fails to pick it up:
> vulgar: SJCHGC00851 0 204 . SJCHGC00851 2 614 + 1059 M 204 612
>
> Has anyone encountered this situation before? I've not changed the
> parser as exonerate is widely used for its est2genome model, and
> thought I'd run it past the list to see if there is a workaround.
>
> many thanks in advance,
> tania
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnsonmar at mail.nih.gov  Wed Aug 15 17:45:36 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 17:45:36 -0400
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <E16950D3-9F60-4862-9325-57CA26107649@uiuc.edu>
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805716@NIHCESMLBX11.nih.gov>

Version 1.5.2 worked fine!  Thanks to all of you for your quick response.  I wish all of our vendors were that quick in getting back to me:)


Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 
-----Original Message-----
From: Chris Fields [mailto:cjfields at uiuc.edu] 
Sent: Wednesday, August 15, 2007 4:41 PM
To: Johnson, Mary (NIH/NCI) [C]
Cc: Mauricio Herrera Cuadra; bioperl-l at bioperl.org
Subject: Re: [Bioperl-l] Need assistance with make error

The term 'stable' is relative in this case; tons of bug fixes were
incorporated in the 1.5.2 release.  There are a few dev-specific
issues we'll need to resolve prior to a new release; once those are
out of the way we'll try to get a new 'stable' out.

chris

On Aug 15, 2007, at 3:32 PM, Johnson, Mary (NIH/NCI) [C] wrote:

> I saw the 1.5.2 version, but it stated that this was a developer  
> release and that 1.4 was the latest stable version, so I went with  
> 1.4.  I'll give 1.5.2 a try.
>
> Thanks,
>
>
> Mary Johnson
>
> Sr. Network Engineer
>
> National Cancer Institute Center for Bioinformatics
> Contractor, TerpSys
> http://www.terpsys.com/
>
>
>
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu]
> Sent: Wednesday, August 15, 2007 4:26 PM
> To: Johnson, Mary (NIH/NCI) [C]
> Cc: Mauricio Herrera Cuadra; bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] Need assistance with make error
>
> You'll definitely want to update to the latest (v 1.5.2).  We hope to
> get a new stable release out sometime soon and possibly move to a
> more regular release cycle.
>
> chris
>
> On Aug 15, 2007, at 2:01 PM, Johnson, Mary (NIH/NCI) [C] wrote:
>
>> This is version 1.4.
>>
>> Mary Johnson
>>
>> Sr. Network Engineer
>>
>> National Cancer Institute Center for Bioinformatics
>> Contractor, TerpSys
>> http://www.terpsys.com/
>>
>>
>>
>> -----Original Message-----
>> From: Mauricio Herrera Cuadra  
>> [mailto:arareko at campus.iztacala.unam.mx]
>> Sent: Wednesday, August 15, 2007 1:46 PM
>> To: Johnson, Mary (NIH/NCI) [C]
>> Cc: bioperl-l at bioperl.org
>> Subject: Re: [Bioperl-l] Need assistance with make error
>>
>> Which version of bioperl are you trying to install?
>>
>> Johnson, Mary (NIH/NCI) [C] wrote:
>>> I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
>>> Enterprise Linux 4, and the other running RHEL3.  I'm getting the
>>> following 'make Error 255' when running make test.  I'm not sure  
>>> what
>>> this error indicates, and whether I should continue with a force
>>> install?  Could you please advise.
>>>
>>>
>>>
>>>
>>>
>>> Failed Test        Stat Wstat Total Fail  Failed  List of Failed
>>> -------------------------------------------------------------------------------
>>> t/BioFetch_DB.t                  27    1   3.70%  8
>>> t/EMBL_DB.t                      15    3  20.00%  6 13-14
>>> t/Ontology.t          9  2304    50  100 200.00%  1-50
>>> t/TreeIO.t                       41    1   2.44%  42
>>> t/Variation_IO.t                 25    3  12.00%  15 20 25
>>> t/simpleGOparser.t    9  2304    98  196 200.00%  1-98
>>> 120 subtests skipped.
>>> Failed 6/179 test scripts, 96.65% okay. 154/8268 subtests failed,
>>> 98.14% okay.
>>>
>>> make: *** [test_dynamic] Error 255
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Mary Johnson
>>>
>>> Sr. Network Engineer
>>>
>>> National Cancer Institute Center for Bioinformatics
>>> Contractor, TerpSys
>>> http://www.terpsys.com/
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>> -- 
>> MAURICIO HERRERA CUADRA
>> arareko at campus.iztacala.unam.mx
>> Laboratorio de Genética
>> Unidad de Morfofisiología y Función
>> Facultad de Estudios Superiores Iztacala, UNAM
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From neetisomaiya at gmail.com  Thu Aug 16 00:22:18 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 16 Aug 2007 09:52:18 +0530
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <46C1D557.7090101@pharm.stonybrook.edu>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
	<46C1D557.7090101@pharm.stonybrook.edu>
Message-ID: <764978cf0708152122oba56e13qef83544cdde7e795@mail.gmail.com>

Hi Siddhartha,

Thanks a lot for your mail.
It would be great if you could send me your parser, I will see how I can
modify it for my purpose.

Thanks and Regards,
Neeti.

On 8/14/07, Siddhartha Basu <basu at pharm.stonybrook.edu> wrote:
>
> neeti somaiya wrote:
> > Hi Andrew,
> >
> > I think the homologene data files on the ftp have changed from what
> > you had used.
> > It is now homologene.data and homologene.xml.
> > I tried using your parser, but because it was written for the file
> > hmlg.trip.ftp, it doesn't work anymore.
> >
> > I came across a parser
> > http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
> > .
> > I am looking at it to see if it works for me. Not sure if it will.
> >
> > ~Neeti.
>
> Hi Neeti,
> I have recently written a parser for 'homologene' xml data specific for
> my purpose. I am not sure whether it will suit your purpose but it could
> be extended for general purpose parsing, so i am putting it forward.
> Here is how it works .......
>
> * It only parses a single homologene entry <HG-Entry>.....</HG-Entry>.
> * It does SAX based parsing (currently uses XML::SAX::ExpatXS)
> * It returns a graph object (using the Perl Graph module) where each node
> is a homologue entry with its corresponding entrez gene id. Each node also
> contains the following attributes ...
>         * Refseq protein id.
>         * Protein id (pid)
>         * ncbi taxon id.
> * The edge attributes contain information about the ortholog (true/false)
> relationship between two nodes.
> * The rest of the tags are currently not being extracted. However, parsing
> them should not be very difficult.
>
> Generally I get the homologene xml stream from an 'efetch' through
> Bio::DB::EUtilities, feed it to the parser, get back a 'Graph' object and
> then work on it.
>
> So, to make it more generic and work on a local file
>
> * We need another class that reads the chunk between
> <HG-Entry>.....</HG-Entry> and sends it to the parser.
> * Add support for most of the tags.
> * Massage the data into a bioperl compatible object.
>
> The first two I could work out, and for the last one I have to figure
> out which bioperl object would be suitable (like Bio::Cluster or
> Bio::Network::Node/Edge).
>
> Let me know if it sounds interesting and i will send you the code.
>
> -siddhartha
>
>
> >
> > On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
> >> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
> >>
> >>> Hi,
> >>>
> >>> Does anyone know of any Homologene parser, if available?
> >>> Please let me know.
> >>>
> >>> Thanks and Regards,
> >>> Neeti.
> >> Hi Neeti,
> >>
> >> Quite a long time ago now I wrote an Homologene parser and posted it
> >> to the mailing list:
> >>
> >> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
> >>
> >> I don't know if this still works but you could use it as a starting
> >> point. There may also be something newer out there too, I don't know.
> >> If you search the mailing list archives you'll get a few messages
> >> around the topic.
> >>
> >> Cheers, Andrew.
> >>
> >>
> >> Andrew Macgregor
> >> Centre for Comparative Genomics, Murdoch University
> >> Email: amacgregor at ccg.murdoch.edu.au
> >> Tel: (08) 9360 2961
> >>
> >>
> >>
> >>
> >
> >
>
>
>


-- 
-Neeti
Even my blood says, B positive


From neetisomaiya at gmail.com  Thu Aug 16 01:56:21 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 16 Aug 2007 11:26:21 +0530
Subject: [Bioperl-l] PDB Parser
Message-ID: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>

Hi,

After a lot of searching I found this link from which PDB files can be
downloaded:
ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/
Is there any other link from which one can download all PDB data?

I tried using Bio::Structure::IO::pdb with some code like :-
use Bio::Structure::IO;

    $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
                                   -format => 'pdb');

    while ( my $struc = $in->next_structure() ) {
       print "Structure ", $struc->id,"\n";
    }

It works well. But I am not able to find documentation of other methods
which will give me various specific details available in a pdb file, right
from title, keywords, references to structure details, atoms, coordinates
etc. There must be different methods to fetch and parse each of this data
from a pdb file, right? Where can I find the details? Any example code of
the same would also be of great use.

Thanks and Regards,
Neeti Somaiya.

-- 
-Neeti
Even my blood says, B positive


From hrh at sanger.ac.uk  Thu Aug 16 04:48:16 2007
From: hrh at sanger.ac.uk (Hans Rudolf Hotz)
Date: Thu, 16 Aug 2007 09:48:16 +0100 (BST)
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
Message-ID: <Pine.LNX.4.64.0708160942310.14241@deskpro50.dynamic.sanger.ac.uk>


On Thu, 16 Aug 2007, neeti somaiya wrote:

> Hi,
>
> After a lot of search I could find this link from where PDB files can be
> downloaded :
> ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/
> Is there any other link where one can download all pdb data from?

try: ftp://pdb.protein.osaka-u.ac.jp/v3/pub/pdb/   or
      ftp://ftp.ebi.ac.uk/pub/databases/rcsb/pdb-remediated/

It is not BioPerl, but James Tisdall's O'Reilly book "Beginning Perl for
Bioinformatics" has a nice introduction to parsing PDB files.


Regards, Hans


>
> I tried using Bio::Structure::IO::pdb with some code like :-
> use Bio::Structure::IO;
>
>    $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
>                                   -format => 'pdb');
>
>    while ( my $struc = $in->next_structure() ) {
>       print "Structure ", $struc->id,"\n";
>    }
>
> It works well. But I am not able to find documentation of other methods
> which will give me various specific details available in a pdb file, right
> from title, keywords, references to structure details, atoms, coordinates
> etc. There must be different methods to fetch and parse each of this data
> from a pdb file, right? Where can I find the details? Any example code of
> the same would also be of great use.
>
> Thanks and Regards,
> Neeti Somaiya.
>
> -- 
> -Neeti
> Even my blood says, B positive
>
>
>


-- 
The Wellcome Trust Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE.


From neetisomaiya at gmail.com  Thu Aug 16 05:30:42 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 16 Aug 2007 15:00:42 +0530
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <C762C291-D3D2-4CBC-B5EC-6B6E4935A004@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
	<C762C291-D3D2-4CBC-B5EC-6B6E4935A004@ccg.murdoch.edu.au>
Message-ID: <764978cf0708160230o4ade944er8c8529199f3a0262@mail.gmail.com>

Hi,

For now I am using the homologene parser available here:
http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
for parsing the homologene.data file. But the README at the ftp site says
homologene.xml has much more data; I have yet to see how to parse that one.
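
One possible starting point for the XML file, as a rough sketch only: read
it one <HG-Entry>...</HG-Entry> chunk at a time, which is the piece
Siddhartha said his single-entry parser still needs. The sub name
next_hg_entry is made up, and the tiny document below (including the
HG-Entry_hg-id element) is an illustrative stand-in, not real HomoloGene
data. Each chunk should still go to a proper XML parser; the regex here
only finds the entry boundaries.

```perl
use strict;
use warnings;

# Return the next <HG-Entry>...</HG-Entry> chunk from a filehandle,
# using the closing tag as the input record separator.
sub next_hg_entry {
    my ($fh) = @_;
    local $/ = '</HG-Entry>';
    while (defined(my $chunk = <$fh>)) {
        # Keep only the part from the opening tag onwards. The trailing
        # record after the last entry has no opening tag and is skipped.
        next unless $chunk =~ /(<HG-Entry\b[^>]*>.*<\/HG-Entry>)/s;
        return $1;
    }
    return;
}

# Illustrative stand-in for the real file (an assumption, not real data).
my $xml = <<'XML';
<HG-EntrySet>
  <HG-Entry><HG-Entry_hg-id>1</HG-Entry_hg-id></HG-Entry>
  <HG-Entry><HG-Entry_hg-id>2</HG-Entry_hg-id></HG-Entry>
</HG-EntrySet>
XML

open my $fh, '<', \$xml or die "cannot open in-memory handle: $!";
while (my $entry = next_hg_entry($fh)) {
    my ($id) = $entry =~ /<HG-Entry_hg-id>(\d+)</;
    print "entry $id, ", length($entry), " characters\n";
}
close $fh;
```

For the real file, open homologene.xml instead of the in-memory string;
only one entry is held in memory at a time.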

~Neeti.


On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
>
> On 14/08/2007, at 11:21 AM, Chris Fields wrote:
>
> > It looks like Heikki responded and thought a good place for it
> > would be Bio::SeqIO, but it didn't go anywhere I suppose.  I see
> > that a few other posts suggest it could be placed in Bio::Cluster
> > as well which I'm not familiar with.  We could add it in if you
> > were still interested, just need to find a good place for it; might
> > be nice to have a Parse::RecDescent-based parser.
> >
> > chris
> >
>
> Hi Chris,
>
> I was also doing some parsing of UniGene at the time but found
> RecDescent was too slow and went back to regexes. That code found
> its way into Bio::Cluster. Occasionally I see a message with someone
> looking for a Homologene parser but not very often, so I'm not sure
> it is worth the effort of moving the code into bioperl.
>
> Cheers, Andrew.
>


-- 
-Neeti
Even my blood says, B positive


From bix at sendu.me.uk  Thu Aug 16 05:59:08 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 16 Aug 2007 10:59:08 +0100
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
Message-ID: <46C41FEC.2000206@sendu.me.uk>

neeti somaiya wrote:
> I tried using Bio::Structure::IO::pdb with some code like :-
> use Bio::Structure::IO;
> 
>     $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
>                                    -format => 'pdb');
> 
>     while ( my $struc = $in->next_structure() ) {
>        print "Structure ", $struc->id,"\n";
>     }
> 
> It works well. But I am not able to find documentation of other methods
> which will give me various specific details available in a pdb file, right
> from title, keywords, references to structure details, atoms, coordinates
> etc. There must be different methods to fetch and parse each of this data
> from a pdb file, right? Where can I find the details?

$struc is a Bio::Structure::Entry, so look at the docs for that:
http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html

You'll probably want to look at the docs for the other Structure modules 
as well:
http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
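
As a rough sketch of how those modules fit together (worth checking against
the docs above before relying on it): the entry hands out models, chains,
residues and atoms, and the header records are kept in an annotation
collection. The sub name dump_structure is made up. Which annotation keys
are present depends on the file, so the sketch lists whatever keys it
finds rather than assuming any.

```perl
use strict;
use warnings;
use Bio::Structure::IO;

# Walk one PDB file from its annotations down to atom coordinates.
sub dump_structure {
    my ($file) = @_;
    my $in = Bio::Structure::IO->new(-file => $file, -format => 'pdb');

    while (my $struc = $in->next_structure) {
        print "Structure ", $struc->id, "\n";

        # Header-type records live in a Bio::Annotation::Collection.
        my $ac = $struc->annotation;
        for my $key ($ac->get_all_annotation_keys) {
            for my $ann ($ac->get_Annotations($key)) {
                print "  $key: ", $ann->as_text, "\n";
            }
        }

        # Entry -> models -> chains -> residues -> atoms.
        for my $model ($struc->get_models) {
            for my $chain ($struc->get_chains($model)) {
                for my $res ($struc->get_residues($chain)) {
                    for my $atom ($struc->get_atoms($res)) {
                        printf "  %s %s %s %8.3f %8.3f %8.3f\n",
                            $chain->id, $res->id, $atom->id,
                            $atom->x, $atom->y, $atom->z;
                    }
                }
            }
        }
    }
    return;
}

# Run as: perl dump_pdb.pl pdb100d.ent   (the script name is arbitrary)
dump_structure($ARGV[0]) if @ARGV;
```

This prints every atom, which is a lot of output for a real entry; in
practice you would filter by chain or residue inside the loops.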


I agree, the documentation in this area could be improved. 
Bio::Structure::StructureI could actually contain something, and 
Bio::Structure should actually exist or not be referenced in the docs.


From ewijaya at gmail.com  Thu Aug 16 00:18:57 2007
From: ewijaya at gmail.com (Edward Wijaya)
Date: Thu, 16 Aug 2007 12:18:57 +0800
Subject: [Bioperl-l] How to create contrasting colors in every singe track -
	Bio::Graphics
Message-ID: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>

Dear experts,

I am trying to draw a figure that shows binding site hits for various
programs (see attached, for example).

Now, I have a problem in creating a contrasting colour for each of
the programs (MEME, AlignACE, etc).  I want to avoid "graded segments",
so that I can have more contrasting colors, e.g. red, blue, yellow, etc.

Can anybody suggest how we can achieve that?

My full source code can be found here: http://dpaste.com/16985/
The portion of the script is this:

__BEGIN__
    my %prog_color = (
        "Actual"   => 800000,
        "ALIGNACE" => 230000,
        "BP"       => 80000,
        "MDSCAN"   => 5000,
        "MITRA"    => 10000,
        "MTSAMP"   => 200000,
        "SPACE"    => 40000,
        "NONE"     => 0,
    );

    foreach my $seqid ( sort {$a <=> $b }keys %nlist ) {
        my $track = $panel->add_track(
            -glyph     => 'graded_segments',
            -key       => "SEQ " . $seqid,
            -connector => "dashed",
            -label     => 1,
            -fontcolor => 'red',
            -bgcolor   => 'blue',
            -bump      => +1,
            -height    => 8,
            -min_score => 0,
            -max_score => 500000
        );
# rest of the script
__END__

Regards,
Edward
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hits.png
Type: image/png
Size: 2509 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070816/31057225/attachment-0003.png>

From pratchusha.kamireddy at aamu.edu  Wed Aug 15 23:45:22 2007
From: pratchusha.kamireddy at aamu.edu (pratchusha kamireddy)
Date: Wed, 15 Aug 2007 22:45:22 -0500 (CDT)
Subject: [Bioperl-l] Request for Activeperl software
Message-ID: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>

Hello
  I am Pratchusha Kamireddy, doing my masters at Alabama A&M University.
  I am working under Dr. Kantety in the Plant and Soil Science Department.
  I am a beginner learning Perl programming. I need the ActivePerl software
  to run Perl programs. Can you help me in this regard: where can I
  download this software, how can I install it, and how can I use it?
  I am eagerly waiting for your reply. Please help me in this regard.
   Thanking you
   Pratchusha Kamireddy


From spiros at lokku.com  Thu Aug 16 09:32:05 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Thu, 16 Aug 2007 14:32:05 +0100
Subject: [Bioperl-l] Request for Activeperl software
In-Reply-To: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
References: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
Message-ID: <bba689ec0708160632w315b00d5na3bf55d97ac03728@mail.gmail.com>

Hi,

You can download ActivePerl from ActiveState's website at

http://www.activestate.com/Products/ActivePerl/

Get a book: http://www.oreilly.com/catalog/lperl3/

Visit:

http://perl-begin.org/
http://learn.perl.org/

Usenet:

http://www.nntp.perl.org/group/perl.beginners/

Spiros

On 8/16/07, pratchusha kamireddy <pratchusha.kamireddy at aamu.edu> wrote:
> Hello
>   I am Pratchusha Kamireddy, doing my masters at Alabama A&M University.
>   I am working under Dr. Kantety in the Plant and Soil Science Department.
>   I am a beginner learning Perl programming. I need the ActivePerl software
>   to run Perl programs. Can you help me in this regard: where can I
>   download this software, how can I install it, and how can I use it?
>   I am eagerly waiting for your reply. Please help me in this regard.
>    Thanking you
>    Pratchusha Kamireddy
>


From razi.khaja at gmail.com  Thu Aug 16 09:37:09 2007
From: razi.khaja at gmail.com (Razi Khaja)
Date: Thu, 16 Aug 2007 09:37:09 -0400
Subject: [Bioperl-l] How to create contrasting colors in every singe
	track - Bio::Graphics
In-Reply-To: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
References: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
Message-ID: <62e9dabc0708160637o36380ecbv69fe479d0a26989d@mail.gmail.com>

You would probably want to consider a "Graph-Coloring" algorithm in
order to optimally pick contrasting colors for the features being
displayed.  This might be overkill for what you're trying to accomplish
and may not be possible (depending on how many features you have in
your dataset, i.e. how big your graph is).

In any case, some resources are:
http://en.wikipedia.org/wiki/Graph_coloring
http://web.cs.ualberta.ca/~joe/Coloring/

If your problem is simpler, see the modifications to your program I've
made below:

Razi Khaja

On 8/16/07, Edward Wijaya <ewijaya at gmail.com> wrote:
> Dear experts,
>
> I am trying to draw a figures that shows binding sites hits for various
> program (see attached) for example.
>
> Now, I have a problem in creating contrasting colour for each of
> the Programs (MEME, AlignACE, etc).  I want to avoid "graded segments",
> so that I can have more contrasting color, e.g: red, blue, yellow, etc.
>
> Can anybody suggest how can we achieve that?
>
> My full source code can be found here: http://dpaste.com/16985/
> The portion of the script is this:
>
> __BEGIN__
>     my %prog_color = (
>         "Actual"   => 800000,
>         "ALIGNACE" => 230000,
>         "BP"       => 80000,
>         "MDSCAN"   => 5000,
>         "MITRA"    => 10000,
>         "MTSAMP"   => 200000,
>         "SPACE"    => 40000,
>         "NONE"     => 0,
>     );
>
       my %color = ( 'MEME' => 'red', 'ALIGNACE' => 'blue' );

>     foreach my $seqid ( sort {$a <=> $b }keys %nlist ) {
           my( @field ) = split( /\s+/, $nlist{$seqid} );
           my $prog_name = $field[3];

>         my $track = $panel->add_track(
>             -glyph     => 'graded_segments',
>             -key       => "SEQ " . $seqid,
>             -connector => "dashed",
>             -label     => 1,
>             -fontcolor => 'red',
               -bgcolor   => $color{ $prog_name },
>             -bump      => +1,
>             -height    => 8,
>             -min_score => 0,
>             -max_score => 500000
>         );
> # rest of the script
> __END__
>
> Regards,
> Edward
>
>
>


From bix at sendu.me.uk  Thu Aug 16 09:49:52 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 16 Aug 2007 14:49:52 +0100
Subject: [Bioperl-l] Request for Activeperl software
In-Reply-To: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
References: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
Message-ID: <46C45600.4040906@sendu.me.uk>

pratchusha kamireddy wrote:
> I am Pratchusha Kamireddy, doing my masters at Alabama A&M University. I
> am working under Dr. Kantety in the Plant and Soil Science Department. I
> am a beginner learning Perl programming. I need the ActivePerl software
> to run Perl programs. Can you help me in this regard: where can I
> download this software, how can I install it, and how can I use it?
> I am eagerly waiting for your reply. Please help me in this regard.

Firstly, Google is your friend:
http://www.google.co.uk/search?q=activeperl

The first hit is the correct one:

http://www.activestate.com/Products/activeperl/


I suppose your next question will be how to install Bioperl (if not, 
you're in the wrong place):

http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows
(which also tells you where to get ActivePerl from)


From cjfields at uiuc.edu  Thu Aug 16 10:11:22 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 09:11:22 -0500
Subject: [Bioperl-l] How to create contrasting colors in every singe
	track - Bio::Graphics
In-Reply-To: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
References: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
Message-ID: <F3E88224-4AA2-451B-97FE-5DED15015FA2@uiuc.edu>


On Aug 15, 2007, at 11:18 PM, Edward Wijaya wrote:

> Dear experts,
>
> I am trying to draw a figures that shows binding sites hits for  
> various
> program (see attached) for example.
>
> Now, I have a problem in creating contrasting colour for each of
> the Programs (MEME, AlignACE, etc).  I want to avoid "graded  
> segments",
> so that I can have more contrasting color, e.g: red, blue, yellow,  
> etc.
>
> Can anybody suggest how can we achieve that?
>
> My full source code can be found here: http://dpaste.com/16985/
> The portion of the script is this:
>
> __BEGIN__
>     my %prog_color = (
>         "Actual"   => 800000,
>         "ALIGNACE" => 230000,
>         "BP"       => 80000,
>         "MDSCAN"   => 5000,
>         "MITRA"    => 10000,
>         "MTSAMP"   => 200000,
>         "SPACE"    => 40000,
>         "NONE"     => 0,
>     );
>
>     foreach my $seqid ( sort {$a <=> $b }keys %nlist ) {
>         my $track = $panel->add_track(
>             -glyph     => 'graded_segments',
>             -key       => "SEQ " . $seqid,
>             -connector => "dashed",
>             -label     => 1,
>             -fontcolor => 'red',
>             -bgcolor   => 'blue',
>             -bump      => +1,
>             -height    => 8,
>             -min_score => 0,
>             -max_score => 500000
>         );
> # rest of the script
> __END__
>
> Regards,
> Edward

I think you have two options:

1) Split the seqfeatures into different tracks based on the source
(AlignACE, MP, etc), then give each its own graded segment color.  I
like this personally as it doesn't glob various results together onto
one track and (at least to me) is easier to maintain.  It also allows
more flexibility in using varying scoring schemes.
2) Use a callback for bgcolor which changes the color explicitly
based on the source/score.

The GenBank/EMBL section of the Bio::Graphics HOWTO reveals how to  
add different tracks, and there are several scattered examples on how  
to use callbacks.
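
A minimal sketch of option 2, assuming each feature carries the program
name as its source tag (that depends on how the features were built, so
adjust as needed). Bio::Graphics passes the feature as the first argument
to an option callback. The colour table, the sub name bgcolor_for_feature
and the My::FakeFeature stand-in class are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical program-to-colour table; the program names follow the
# thread, the colour choices are arbitrary.
my %colour_for = (
    MEME     => 'red',
    ALIGNACE => 'blue',
    MDSCAN   => 'green',
    MITRA    => 'orange',
);

# Option callback: receives the feature first and returns a colour name.
# Anything not in the table falls back to gray.
sub bgcolor_for_feature {
    my ($feature) = @_;
    my $source = uc $feature->source_tag;
    return exists $colour_for{$source} ? $colour_for{$source} : 'gray';
}

# Stand-in feature class so the callback can be tried without a panel.
{
    package My::FakeFeature;
    sub new        { my ($class, %args) = @_; return bless {%args}, $class }
    sub source_tag { return $_[0]{source} }
}

for my $prog (qw(MEME AlignACE Unknown)) {
    my $feature = My::FakeFeature->new(source => $prog);
    printf "%-10s %s\n", $prog, bgcolor_for_feature($feature);
}

# With a real panel, pass the code reference instead of a fixed colour:
#   $panel->add_track($features,
#                     -glyph   => 'generic',
#                     -bgcolor => \&bgcolor_for_feature);
```

A plain glyph such as 'generic' gives flat colours; 'graded_segments'
would still shade each colour by score.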

http://www.bioperl.org/wiki/HOWTO:Graphics

chris


From cjfields at uiuc.edu  Thu Aug 16 10:12:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 09:12:30 -0500
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C41FEC.2000206@sendu.me.uk>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
Message-ID: <5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>


On Aug 16, 2007, at 4:59 AM, Sendu Bala wrote:

> neeti somaiya wrote:
>> I tried using Bio::Structure::IO::pdb with some code like :-
>> use Bio::Structure::IO;
>>
>>     $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
>>                                    -format => 'pdb');
>>
>>     while ( my $struc = $in->next_structure() ) {
>>        print "Structure ", $struc->id,"\n";
>>     }
>>
>> It works well. But I am not able to find documentation of other  
>> methods
>> which will give me various specific details available in a pdb  
>> file, right
>> from title, keywords, references to structure details, atoms,  
>> coordinates
>> etc. There must be different methods to fetch and parse each of  
>> this data
>> from a pdb file, right? Where can I find the details?
>
> $struct is a Bio::Structure::Entry, so look at the docs for that:
> http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html
>
> You'll probably want to look at the docs for the other Structure  
> modules
> as well:
> http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
>
>
> I agree, the documentation in this area could be improved.
> Bio::Structure::StructureI could actually contain something, and
> Bio::Structure should actually exist or not be referenced in the docs.

There was a discussion a while back on refactoring the code within  
Bio::Structure to better deal with HETATM and other stuff.  As far as  
I'm concerned it's open to anyone who wants to tinker with it.

chris


From cjfields at uiuc.edu  Thu Aug 16 10:37:31 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 09:37:31 -0500
Subject: [Bioperl-l] Announcement: infernal/erpin/rnamotif parsers
Message-ID: <7CE60504-FA1A-4AFF-A02E-036B8E37C3F9@uiuc.edu>

To anyone using the aforementioned parsers:

I don't plan on continuing development of the Bio::Tools-related  
Infernal, RNAMotif, and ERPIN parsers at this time unless there is  
substantial interest in doing so.  Instead, I plan on focusing my  
efforts on the Bio::SearchIO-based parsers as I feel they are much  
better at representing the data present in the output.  In my opinion  
having two sets of parsers that accomplish essentially the same task  
is redundant and non-productive.  Again, if there is considerable  
interest in keeping them, please respond to this message;  
otherwise I will consider them deprecated, to be removed completely by  
rel 1.7 (maybe sooner).

Infernal: It's very likely that a new stable version (v. 1.0) of  
Infernal will be released in the near future.  I may upgrade the  
Bio::SearchIO-based parser in the meantime to parse the latest  
Infernal output (v 0.81), but I don't plan on supporting pre-1.0  
releases once the final version is out.  Infernal has been in  
developer release for some time now and the program output has  
changed dramatically over time; however, the format is expected to  
solidify once a stable release is made, which makes supporting the  
parser much easier over time.

Questions?  Gripes?

chris


From awitney at sgul.ac.uk  Thu Aug 16 10:07:02 2007
From: awitney at sgul.ac.uk (Adam Witney)
Date: Thu, 16 Aug 2007 15:07:02 +0100
Subject: [Bioperl-l] Request for Activeperl software
In-Reply-To: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
Message-ID: <C2EA1896.17575%awitney@sgul.ac.uk>


This would be the best place to start

http://www.activestate.com/

Or more specifically for the language:

http://www.activestate.com/store/activeperl/download/

(Which will require you to register with them)

adam


On 16/8/07 04:45, "pratchusha kamireddy" <pratchusha.kamireddy at aamu.edu>
wrote:

> Hello
>   I am Pratchusha Kamireddy, doing a master's at Alabama A&M University. I am
> working under Dr. Kantety in the Plant and Soil Science Department. I am a
> beginner learning Perl programming. I need the ActivePerl software to run Perl
> programs. Can you help me in this regard: where can I download this
> software, how can I install it, and how can I use it? I am eagerly waiting
> for your reply. Please help me in this regard.
>    Thanking you
>    Pratchusha Kamireddy
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From muratem at eng.uah.edu  Thu Aug 16 15:10:34 2007
From: muratem at eng.uah.edu (muratem at eng.uah.edu)
Date: Thu, 16 Aug 2007 14:10:34 -0500 (CDT)
Subject: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
Message-ID: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>

Hello

This might not be the correct list for this particular problem, but
hopefully someone can help. I am trying to install ...staden::read on a
Mac OS X 10.4. I tried installing cpan but it wouldn't work so I went to
the manual methods. Perl is on the system and appears to be installed
correctly for a Mac. Bioperl 1.5.2 was installed via fink and appears to
be OK also. I'm trying to install the Bio::SeqIO::staden::read module. I
downloaded the bioperl-ext-1.5.1 tarball from bioperl.org, did the usual
perl Makefile.PL and make and get:

newyork:/usr/local/bioperl-ext-1.5.1 root# make
Makefile:1148: *** multiple target patterns.  Stop.

A snippet from the Makefile...

   1148 pm_to_blib: $(TO_INST_PM)
   1149         $(NOECHO) $(PERLRUN) -MExtUtils::Install -e
'pm_to_blib({@ARGV}, '\''$(INST_LIB)/auto'\'', '\''$(PM_FILTER)'\'')'\
   1150           Bio/Ext/Align/libs/hscore.h
$(INST_LIB)/Bio/Ext/Align/libs/hscore.h \
   1151           Bio/Ext/Align/libs/probability.c
$(INST_LIB)/Bio/Ext/Align/libs/probability.c \
   1152           Bio/Ext/Align/libs/linesubs.h
$(INST_LIB)/Bio/Ext/Align/libs/linesubs.h \
   1153           Bio/Ext/Align/test.pl $(INST_LIB)/Bio/Ext/Align/test.pl \
   1154           Bio/Ext/Align/libs/wiseoverlay.h
$(INST_LIB)/Bio/Ext/Align/libs/wiseoverlay.h \
   1155           Bio/Ext/Align/libs/proteinsw.h
$(INST_LIB)/Bio/Ext/Align/libs/proteinsw.h \
   1156           Bio/Ext/Align/libs/wisebase.h
$(INST_LIB)/Bio/Ext/Align/libs/wisebase.h \
   1157           Bio/Ext/Align/libs/seqaligndisplay.h
$(INST_LIB)/Bio/Ext/Align/libs/seqaligndisplay.h \
   1158           Bio/Ext/Align/libs/dyna.h
$(INST_LIB)/Bio/Ext/Align/libs/dyna.h \

The README says you don't have to build the whole package, so I descended
to the staden directory and ran make, which reported no problems.
But when I ran make test I got:

newyork:/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden root# make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
"test_harness(0, '../blib/lib', '../blib/arch')" test.pl
test....Had problems bootstrapping Inline module 'Bio::SeqIO::staden::read'

Can't load
'/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/SeqIO/staden/read/read.bundle'
for module Bio::SeqIO::staden::read:
dlopen(/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/SeqIO/staden/read/read.bundle,
2): Symbol not found: _curl_easy_init
  Referenced from:
/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/SeqIO/staden/read/read.bundle
  Expected in: dynamic lookup
 at /Library/Perl/5.8.6/Inline.pm line 500


 at test.pl line 0
INIT failed--call queue aborted, <DATA> line 1.
test....dubious
        Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 1-94
        Failed 94/94 tests, 0.00% okay
Failed Test Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
test.pl      255 65280    94  188 200.00%  1-94
Failed 1/1 test scripts, 0.00% okay. 94/94 subtests failed, 0.00% okay.
make: *** [test_dynamic] Error 2

The missing symbol is apparently from libcurl. I have both libcurl.2.dylib
and libcurl.3.dylib with copies in multiple locations including /usr/lib,
/usr/local/lib and the usual Mac directories. I used the Mac otool to look
at the externals in read.bundle and it references libz.1.dylib and
libSystem.B.dylib. Could this be a case where there should have been a
link to libcurl and wasn't?

I've searched the list and see only the Inline versioning problem (which I
had and fixed). Has anybody seen this problem before or built the module
on a Mac? How did you do it? Is this a question for the Staden list on
sourceforge?

Thanks

Mike


From cjfields at uiuc.edu  Thu Aug 16 15:55:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 14:55:05 -0500
Subject: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
In-Reply-To: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>
References: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>
Message-ID: <9BBC30AD-9AFE-4D52-88E4-656D9EB8924E@uiuc.edu>


On Aug 16, 2007, at 2:10 PM, muratem at eng.uah.edu wrote:

> Hello
>
> This might not be the correct list for this particular problem, but
> hopefully someone can help. I am trying to install ...staden::read  
> on a
> Mac OS X 10.4. I tried installing cpan but it wouldn't work so I  
> went to
> the manual methods. Perl is on the system and appears to be installed
> correctly for a Mac. Bioperl 1.5.2 was installed via fink and  
> appears to
> be OK also. I'm trying to install the Bio::SeqIO::staden::read  
> module. I
> downloaded the bioperl-ext-1.5.1 tarball from bioperl.org, did the  
> usual
> perl Makefile.PL and make and get:
>
> newyork:/usr/local/bioperl-ext-1.5.1 root# make
> Makefile:1148: *** multiple target patterns.  Stop.
>
> A snippet from the Makefile...
>
>    1148 pm_to_blib: $(TO_INST_PM)
>    1149         $(NOECHO) $(PERLRUN) -MExtUtils::Install -e
> 'pm_to_blib({@ARGV}, '\''$(INST_LIB)/auto'\'', '\''$(PM_FILTER)'\'')'\
>    1150           Bio/Ext/Align/libs/hscore.h
> $(INST_LIB)/Bio/Ext/Align/libs/hscore.h \
>    1151           Bio/Ext/Align/libs/probability.c
> $(INST_LIB)/Bio/Ext/Align/libs/probability.c \
>    1152           Bio/Ext/Align/libs/linesubs.h
> $(INST_LIB)/Bio/Ext/Align/libs/linesubs.h \
>    1153           Bio/Ext/Align/test.pl $(INST_LIB)/Bio/Ext/Align/ 
> test.pl \
>    1154           Bio/Ext/Align/libs/wiseoverlay.h
> $(INST_LIB)/Bio/Ext/Align/libs/wiseoverlay.h \
>    1155           Bio/Ext/Align/libs/proteinsw.h
> $(INST_LIB)/Bio/Ext/Align/libs/proteinsw.h \
>    1156           Bio/Ext/Align/libs/wisebase.h
> $(INST_LIB)/Bio/Ext/Align/libs/wisebase.h \
>    1157           Bio/Ext/Align/libs/seqaligndisplay.h
> $(INST_LIB)/Bio/Ext/Align/libs/seqaligndisplay.h \
>    1158           Bio/Ext/Align/libs/dyna.h
> $(INST_LIB)/Bio/Ext/Align/libs/dyna.h \
>
> The README says you don't have to build the whole package, so I  
> descended
> to the staden directory and did a Make and didn't get any problems
> reported. But when I did a make test I get:
>
> newyork:/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden root# make test
> PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
> "test_harness(0, '../blib/lib', '../blib/arch')" test.pl
> test....Had problems bootstrapping Inline module  
> 'Bio::SeqIO::staden::read'
>
> Can't load
> '/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/ 
> Bio/SeqIO/staden/read/read.bundle'
> for module Bio::SeqIO::staden::read:
> dlopen(/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/ 
> auto/Bio/SeqIO/staden/read/read.bundle,
> 2): Symbol not found: _curl_easy_init
>   Referenced from:
> /usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/ 
> SeqIO/staden/read/read.bundle
>   Expected in: dynamic lookup
>  at /Library/Perl/5.8.6/Inline.pm line 500
>
>
>  at test.pl line 0
> INIT failed--call queue aborted, <DATA> line 1.
> test....dubious
>         Test returned status 255 (wstat 65280, 0xff00)
> DIED. FAILED tests 1-94
>         Failed 94/94 tests, 0.00% okay
> Failed Test Stat Wstat Total Fail  Failed  List of Failed
> ---------------------------------------------------------------------- 
> ---------
> test.pl      255 65280    94  188 200.00%  1-94
> Failed 1/1 test scripts, 0.00% okay. 94/94 subtests failed, 0.00%  
> okay.
> make: *** [test_dynamic] Error 2
>
> The missing symbol is apparently from libcurl. I have both libcurl. 
> 2.dylib
> and libcurl.3.dylib with copies in multiple locations including / 
> usr/lib,
> /usr/local/lib and the usual Mac directories. I used the Mac otool  
> to look
> at the externals in read.bundle and it references libz.1.dylib and
> libSystem.B.dylib. Could this be a case where there should have been a
> link to libcurl and wasn't?
>
> I've searched the list and see only the Inline versioning problem  
> (which I
> had and fixed). Has anybody seen this problem before or built the  
> module
> on a Mac? How did you do it? Is this a question for the Staden list on
> sourceforge?
>
> Thanks
>
> Mike

Haven't seen the problem you list.  I have installed it on Mac OS X  
(intel) w/o problems so I know it works; at least all tests passed  
though I remember Inline complaining for some reason.

You should try using bioperl-ext from CVS (it is really 1.5.1 but  
with updated docs and maybe a change or two).  The process is a  
little tricky but is documented in the README in the package.  You'll  
need the old io_lib (1.8.12 or earlier) from Staden if memory serves.

chris


From zhaodj at ioz.ac.cn  Thu Aug 16 22:13:16 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Fri, 17 Aug 2007 10:13:16 +0800 (CST)
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
Message-ID: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>

Dear list members,

I have a question about the methods of bioperl objects: how and
where can we find the complete list of methods of a bioperl object?

Take Bio::Tools::Run::RemoteBlast for example. The synopsis of
this module gives some sample code. The following five statements
are excerpted from it.
(1) my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
(2) while ( my @rids = $factory->each_rid ) {
(3) $factory->remove_rid($rid);
(4) my $rc = $factory->retrieve_blast($rid);
(5) my $r = $factory->submit_blast($input);

The five statements use five methods of the RemoteBlast object, i.e.
(1) new, (2) each_rid, (3) remove_rid, (4) retrieve_blast, and
(5) submit_blast. However, I find that only some of them (4, 5) are listed in
the appendix, while the others (1, 2, 3) are absent. Are there more
methods that are not explicitly documented? I don't know. This leads to
only a partial understanding and use of the module. Therefore I am asking
here how to get the full list of methods of a bioperl object.

Thanks!
-- 
De-Jian Zhao
Institute of Zoology,Chinese Academy of Sciences
+86-10-64807217
zhaodj at ioz.ac.cn


From neetisomaiya at gmail.com  Fri Aug 17 02:23:08 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 17 Aug 2007 11:53:08 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
Message-ID: <764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>

Hi,

My main concern is just the pdb id and title. PDB id I am able to fetch
easily, but is there a method which can give me the title of the PDB
structure?

Like for example from the following :-

HEADER    DNA/RNA                                 05-DEC-94   100D
TITLE     CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
TITLE    2 R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO
TITLE    3 PHOSPHATE ONLY AND MINOR GROOVE TERTIARY BASE-PAIRING
COMPND    MOL_ID: 1;
COMPND   2 MOLECULE: DNA/RNA (5'-R(*CP*)-D(*CP*GP*GP*CP*GP*CP*CP*GP*)-
COMPND   3 R(*G)-3');
COMPND   4 CHAIN: A, B;
.
.
.
.

I just want "CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO PHOSPHATE ONLY AND
MINOR GROOVE TERTIARY BASE-PAIRING".

Thanks,
Neeti.

On 8/16/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
>
> On Aug 16, 2007, at 4:59 AM, Sendu Bala wrote:
>
> > neeti somaiya wrote:
> >> I tried using Bio::Structure::IO::pdb with some code like :-
> >> use Bio::Structure::IO;
> >>
> >>     $in  = Bio::Structure::IO->new(-file => " pdb100d.ent",
> >>                                    -format => 'pdb');
> >>
> >>     while ( my $struc = $in->next_structure() ) {
> >>        print "Structure ", $struc->id,"\n";
> >>     }
> >>
> >> It works well. But I am not able to find documentation of other
> >> methods
> >> which will give me various specific details available in a pdb
> >> file, right
> >> from title, keywords, references to structure details, atoms,
> >> coordinates
> >> etc. There must be different methods to fetch and parse each of
> >> this data
> >> from a pdb file, right? Where can I find the details?
> >
> > $struct is a Bio::Structure::Entry, so look at the docs for that:
> > http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html
> >
> > You'll probably want to look at the docs for the other Structure
> > modules
> > as well:
> > http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
> >
> >
> > I agree, the documentation in this area could be improved.
> > Bio::Structure::StructureI could actually contain something, and
> > Bio::Structure should actually exist or not be referenced in the docs.
>
> There was a discussion a while back on refactoring the code within
> Bio::Structure to better deal with HETATM and other stuff.  As far as
> I'm concerned it's open for anyone wanted to tinker with it.
>
> chris
>


-- 
-Neeti
Even my blood says, B positive


From alexl at users.sourceforge.net  Fri Aug 17 03:22:16 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Fri, 17 Aug 2007 00:22:16 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
Message-ID: <cg3ayi39sn.fsf@allele2.localdomain>

Hi all,

I'd like to clarify the license of bioperl.  Currently the LICENSE
file only includes the text of the Artistic license.  But the wiki
http://www.bioperl.org/wiki/FAQ#What_are_the_license_terms_for_BioPerl.3F
says:

 BioPerl is licensed under the same terms as Perl itself which is the
 Perl Artistic License (see
 http://www.perl.com/pub/a/language/misc/Artistic.html or
 http://www.opensource.org/licenses/artistic-license.html

and most of the modules in the source say:

 "You may distribute this module under the same terms as perl itself"

But the current distribution of Perl is actually dually-licensed under
the GPL or Artistic licenses (so the wiki is technically out of sync
with the "same terms as Perl itself"), see:

 http://dev.perl.org/licenses/

I assume that the intent of the bioperl authors is to license with the
same terms as Perl's *current* license (which would mean bioperl is
really effectively dually-licensed under the GPL or Artistic license).
If so, it would be good if the LICENSE text and the wiki were updated
to reflect this.

Also some of the source modules say "under the same terms as perl
itself", but then only mention the Artistic license.

This has important ramifications for distribution: I maintain the
Fedora package for bioperl and I have currently listed the license of
bioperl as "GPL or Artistic".  But if bioperl were distributed under
the Artistic license only then I would have to pull the package from
the distribution, because the Artistic 1.0 (original)-only license is
deprecated (but "GPL or Artistic" is OK):

http://fedoraproject.org/wiki/Licensing#head-d8cc605dd386091c8b6be97b8a43fb6a5d624ae1

Thanks!

Alex


From alexl at users.sourceforge.net  Fri Aug 17 03:42:07 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Fri, 17 Aug 2007 00:42:07 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <cg3ayi39sn.fsf@allele2.localdomain> (Alex Lancaster's message of
	"Fri\, 17 Aug 2007 00\:22\:16 -0700")
References: <cg3ayi39sn.fsf@allele2.localdomain>
Message-ID: <nrsl6i1ub4.fsf@allele2.localdomain>

>>>>> "AL" == Alex Lancaster  writes:

[...]

AL> I assume that the intent of the bioperl authors is to license with
AL> the same terms as Perl's *current* license (which would mean
AL> bioperl is really effectively dually-licensed under the GPL or
AL> Artistic license).  If so, it would be good if the LICENSE text
AL> and the wiki were updated to reflect this.

Also note that since Perl's license is a dual-license "GPL or
Artistic" then people aren't required to submit their modifications
back to the bioperl distribution because they can choose to follow the
Artistic (rather than the GPL) license which doesn't require
modifications to be submitted back.  This means the point:

 "If you fix bugs, please let us know about them. This is not the GPL
 license so you are not required to submit the code fixes, but in the
 spirit of making a better product we hope you'll contribute back to
 the community any insight or code improvements."

listed here:

 http://www.bioperl.org/wiki/Licensing_BioPerl

would still stand, because you can choose the Artistic license, but
you could modify the clause to say:

 "If you fix bugs, please let us know about them. Because Bioperl is
 dual-licensed under the GPL or Artistic licenses, you can choose the
 Artistic license, which means that you are not required to submit the
 code fixes, but in the spirit of making a better product we hope
 you'll contribute back to the community any insight or code
 improvements."


From n.haigh at sheffield.ac.uk  Fri Aug 17 06:27:43 2007
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 17 Aug 2007 11:27:43 +0100
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
In-Reply-To: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
References: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
Message-ID: <46C5781F.60301@sheffield.ac.uk>

De-Jian,ZHAO wrote:
> Dear list members,
>
> I have a question about the methods of bioperl objects.It is how and
> where we can get the whole methods of a bioperl object.
>
> Take Bio::Tools::Run::RemoteBlast for example. In the synopsis of
> this object, some sample codes are given.The following five clauses
> are excerpted from the synopsis.
> (1)my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> (2)while ( my @rids = $factory->each_rid ) {
> (3)$factory->remove_rid($rid);
> (4)my $rc = $factory->retrieve_blast($rid);
> (5)my $r = $factory->submit_blast($input);
>
> The five clauses use five methods of the RemoteBlast object,i.e.
> (1)new, (2)each_rid, (3)remove_rid,(4)retrieve_blast,and
> (5)submit_blast. However,I only find part of them(45) are listed in
> the appendix while others(123) are absent. Are there some more
> methods not explictly declared? I don't know.This will lead to the
> partial understanding and utilization of the module.Therefore I come
> here for the way to get the full methods of a bioperl object.
>
> Thanks!
>   


You should check out the Deobfuscator at:
http://bioperl.org/cgi-bin/deob_interface.cgi

Search for and select the class you are interested in, e.g. Bio::Tools::Run::RemoteBlast

You will be provided a list of methods available to that object,
including all the methods up the inheritance hierarchy. Unfortunately,
some bioperl modules are documented more thoroughly than others.
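
If you prefer to check from the command line, the same idea (collecting methods up the inheritance hierarchy) can be sketched in plain Perl by walking each package's symbol table and its @ISA. This is only a rough illustration, not how the Deobfuscator itself works: the all_methods helper and the Animal/Dog demo classes are made up for the example, and imported functions will show up alongside real methods.

```perl
use strict;
use warnings;

# Return a hash ref of method name => class that defines it,
# for $class and everything it inherits from (depth-first through @ISA).
sub all_methods {
    my ($class, $seen) = @_;
    $seen ||= {};
    no strict 'refs';
    # Subroutines defined directly in this package's symbol table.
    for my $name (keys %{"${class}::"}) {
        next unless defined &{"${class}::$name"};
        $seen->{$name} ||= $class;    # keep the nearest definition
    }
    # Recurse into the parent classes.
    all_methods($_, $seen) for @{"${class}::ISA"};
    return $seen;
}

# Hypothetical demo classes so the sketch runs on its own.
package Animal;
sub new   { my ($class) = @_; return bless {}, $class }
sub speak { return 'generic noise' }

package Dog;
our @ISA = ('Animal');
sub fetch { return 'stick' }

package main;
my $methods = all_methods('Dog');
print "$_ (from $methods->{$_})\n" for sort keys %$methods;
```

For a real module you would load it first (require Bio::Tools::Run::RemoteBlast;) and pass its name to all_methods. To test for a single method, $factory->can('submit_blast') is simpler.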

Nath


From neetisomaiya at gmail.com  Fri Aug 17 06:42:09 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 17 Aug 2007 16:12:09 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
	<764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
Message-ID: <764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>

Hi,

I have done it currently as follows :

while ( my $struc = $in->next_structure() )
{
    my $title;

    my $pdb_id = $struc->id;
    print "Structure ", $pdb_id,"\n";

    my $ac = $struc->annotation();

    foreach my $key ( $ac->get_all_annotation_keys() )
    {
        if($key eq "title")
        {
            my @values = $ac->get_Annotations($key);
            foreach my $value (@values)
            {
                $title = $value->as_text;
                chomp($title);
                if($title =~ /Value\: (.*)/)
                {
                    $title = $1;
                }
                $title =~ s/\s+/ /g;

                print "Title ",$title,"\n";
                last;
            }
            last;
        }
    }
}

Is this ok?

On 8/17/07, neeti somaiya <neetisomaiya at gmail.com> wrote:
>
> Hi,
>
> My main concern is just the pdb id and title. PDB id I am able to fetch
> easily, but is there a method which can give me the title of the PDB
> structure?
>
> Like for example from the following :-
>
> HEADER    DNA/RNA                                 05-DEC-94   100D
> TITLE     CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
> TITLE    2 R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO
> TITLE    3 PHOSPHATE ONLY AND MINOR GROOVE TERTIARY BASE-PAIRING
> COMPND    MOL_ID: 1;
> COMPND   2 MOLECULE: DNA/RNA (5'-R(*CP*)-D(*CP*GP*GP*CP*GP*CP*CP*GP*)-
> COMPND   3 R(*G)-3');
> COMPND   4 CHAIN: A, B;
> .
> .
> .
> .
>
> I just want "CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
> R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO PHOSPHATE ONLY AND
> MINOR GROOVE TERTIARY BASE-PAIRING".
>
> Thanks,
> Neeti.
>
> On 8/16/07, Chris Fields <cjfields at uiuc.edu> wrote:
> >
> >
> > On Aug 16, 2007, at 4:59 AM, Sendu Bala wrote:
> >
> > > neeti somaiya wrote:
> > >> I tried using Bio::Structure::IO::pdb with some code like :-
> > >> use Bio::Structure::IO;
> > >>
> > >>     $in  = Bio::Structure::IO->new(-file => " pdb100d.ent",
> > >>                                    -format => 'pdb');
> > >>
> > >>     while ( my $struc = $in->next_structure() ) {
> > >>        print "Structure ", $struc->id,"\n";
> > >>     }
> > >>
> > >> It works well. But I am not able to find documentation of other
> > >> methods
> > >> which will give me various specific details available in a pdb
> > >> file, right
> > >> from title, keywords, references to structure details, atoms,
> > >> coordinates
> > >> etc. There must be different methods to fetch and parse each of
> > >> this data
> > >> from a pdb file, right? Where can I find the details?
> > >
> > > $struct is a Bio::Structure::Entry, so look at the docs for that:
> > > http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html
> > >
> > > You'll probably want to look at the docs for the other Structure
> > > modules
> > > as well:
> > > http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
> > >
> > >
> > > I agree, the documentation in this area could be improved.
> > > Bio::Structure::StructureI could actually contain something, and
> > > Bio::Structure should actually exist or not be referenced in the docs.
> >
> >
> > There was a discussion a while back on refactoring the code within
> > Bio::Structure to better deal with HETATM and other stuff.  As far as
> > I'm concerned it's open for anyone wanted to tinker with it.
> >
> > chris
> >
>
>
>
> --
> -Neeti
> Even my blood says, B positive
>


-- 
-Neeti
Even my blood says, B positive


From bix at sendu.me.uk  Fri Aug 17 09:35:01 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 17 Aug 2007 14:35:01 +0100
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	
	<46C41FEC.2000206@sendu.me.uk>	
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>	
	<764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
	<764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>
Message-ID: <46C5A405.2070005@sendu.me.uk>

neeti somaiya wrote:
> Hi,
> 
> I have done it currently as follows :
[snip]
> Is this ok?

If it works, of course. There seems to be some redundant code there, 
however. I'm guessing this would be better (assuming your code worked in 
the first place):

while (my $struc = $in->next_structure()) {
     my $pdb_id = $struc->id;
     print "Structure ", $pdb_id,"\n";

     my $ac = $struc->annotation();
     my ($title) = $ac->get_Annotations('title');
     $title = $title->as_text;
     chomp($title);
     if ($title =~ /Value\: (.*)/) {
         $title = $1;
     }
     $title =~ s/\s+/ /g;

     print "Title ",$title,"\n";
}
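
If only the title is needed, a different route that bypasses Bio::Structure entirely is to read the TITLE records straight from the file. The sketch below is plain Perl, not a BioPerl API: pdb_title is a hypothetical helper, and it assumes the standard fixed-column PDB layout (record name in columns 1-6, continuation number in columns 9-10, title text from column 11). It joins continuation lines with a single space, so a word split across two TITLE lines would not be rejoined.

```perl
use strict;
use warnings;

# Assemble the title from the TITLE records of a PDB entry.
# Takes the lines of the file and returns a single string.
sub pdb_title {
    my @lines = @_;
    my @parts;
    for my $line (@lines) {
        next unless $line =~ /^TITLE /;
        # Skip the record name and continuation number (first 10 columns).
        my $text = length($line) > 10 ? substr($line, 10) : '';
        $text =~ s/^\s+|\s+$//g;    # trim, which also drops the newline
        push @parts, $text if length $text;
    }
    return join ' ', @parts;
}

# Header lines from the 100D example earlier in this thread.
my @records = split /\n/, <<'END';
HEADER    DNA/RNA                                 05-DEC-94   100D
TITLE     CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
TITLE    2 R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO
TITLE    3 PHOSPHATE ONLY AND MINOR GROOVE TERTIARY BASE-PAIRING
COMPND    MOL_ID: 1;
END

print pdb_title(@records), "\n";
```

With a real file you would pass in the lines read from the file handle instead of the inline sample.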


From muratem at eng.uah.edu  Fri Aug 17 10:03:22 2007
From: muratem at eng.uah.edu (Mike Muratet)
Date: Fri, 17 Aug 2007 09:03:22 -0500 (CDT)
Subject: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
In-Reply-To: <9BBC30AD-9AFE-4D52-88E4-656D9EB8924E@uiuc.edu>
References: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>
	<9BBC30AD-9AFE-4D52-88E4-656D9EB8924E@uiuc.edu>
Message-ID: <Pine.GSO.4.60.0708170902570.23859@eng.uah.edu>


On Thu, 16 Aug 2007, Chris Fields wrote:

> Date: Thu, 16 Aug 2007 14:55:05 -0500
> From: Chris Fields <cjfields at uiuc.edu>
> To: muratem at eng.uah.edu
> Cc: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
> 
>
> On Aug 16, 2007, at 2:10 PM, muratem at eng.uah.edu wrote:
>
>> Hello
>> 
>> This might not be the correct list for this particular problem, but
>> hopefully someone can help. I am trying to install ...staden::read on a
>> Mac OS X 10.4. I tried installing cpan but it wouldn't work so I went to
>> the manual methods. Perl is on the system and appears to be installed
>> correctly for a Mac. Bioperl 1.5.2 was installed via fink and appears to
>> be OK also. I'm trying to install the Bio::SeqIO::staden::read module. I
>> downloaded the bioperl-ext-1.5.1 tarball from bioperl.org, did the usual
>> perl Makefile.PL and make and get:
>> 
>> newyork:/usr/local/bioperl-ext-1.5.1 root# make
>> Makefile:1148: *** multiple target patterns.  Stop.
>> 
>> A snippet from the Makefile...
>> 
>>    1148 pm_to_blib: $(TO_INST_PM)
>>    1149         $(NOECHO) $(PERLRUN) -MExtUtils::Install -e
>> 'pm_to_blib({@ARGV}, '\''$(INST_LIB)/auto'\'', '\''$(PM_FILTER)'\'')'\
>>    1150           Bio/Ext/Align/libs/hscore.h
>> $(INST_LIB)/Bio/Ext/Align/libs/hscore.h \
>>    1151           Bio/Ext/Align/libs/probability.c
>> $(INST_LIB)/Bio/Ext/Align/libs/probability.c \
>>    1152           Bio/Ext/Align/libs/linesubs.h
>> $(INST_LIB)/Bio/Ext/Align/libs/linesubs.h \
>>    1153           Bio/Ext/Align/test.pl $(INST_LIB)/Bio/Ext/Align/test.pl 
>> \
>>    1154           Bio/Ext/Align/libs/wiseoverlay.h
>> $(INST_LIB)/Bio/Ext/Align/libs/wiseoverlay.h \
>>    1155           Bio/Ext/Align/libs/proteinsw.h
>> $(INST_LIB)/Bio/Ext/Align/libs/proteinsw.h \
>>    1156           Bio/Ext/Align/libs/wisebase.h
>> $(INST_LIB)/Bio/Ext/Align/libs/wisebase.h \
>>    1157           Bio/Ext/Align/libs/seqaligndisplay.h
>> $(INST_LIB)/Bio/Ext/Align/libs/seqaligndisplay.h \
>>    1158           Bio/Ext/Align/libs/dyna.h
>> $(INST_LIB)/Bio/Ext/Align/libs/dyna.h \
>> 
>> The README says you don't have to build the whole package, so I descended
>> to the staden directory and did a Make and didn't get any problems
>> reported. But when I did a make test I get:
>> 
>> newyork:/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden root# make test
>> PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
>> "test_harness(0, '../blib/lib', '../blib/arch')" test.pl
>> test....Had problems bootstrapping Inline module 
>> 'Bio::SeqIO::staden::read'
>> 
>> Can't load
>> '/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/ 
>> Bio/SeqIO/staden/read/read.bundle'
>> for module Bio::SeqIO::staden::read:
>> dlopen(/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/ 
>> auto/Bio/SeqIO/staden/read/read.bundle,
>> 2): Symbol not found: _curl_easy_init
>>   Referenced from:
>> /usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/ 
>> SeqIO/staden/read/read.bundle
>>   Expected in: dynamic lookup
>>  at /Library/Perl/5.8.6/Inline.pm line 500
>> 
>> 
>>  at test.pl line 0
>> INIT failed--call queue aborted, <DATA> line 1.
>> test....dubious
>>         Test returned status 255 (wstat 65280, 0xff00)
>> DIED. FAILED tests 1-94
>>         Failed 94/94 tests, 0.00% okay
>> Failed Test Stat Wstat Total Fail  Failed  List of Failed
>> ---------------------------------------------------------------------- 
>> ---------
>> test.pl      255 65280    94  188 200.00%  1-94
>> Failed 1/1 test scripts, 0.00% okay. 94/94 subtests failed, 0.00% okay.
>> make: *** [test_dynamic] Error 2
>> 
>> The missing symbol is apparently from libcurl. I have both libcurl.2.dylib
>> and libcurl.3.dylib with copies in multiple locations including /usr/lib,
>> /usr/local/lib and the usual Mac directories. I used the Mac otool to look
>> at the externals in read.bundle and it references libz.1.dylib and
>> libSystem.B.dylib. Could this be a case where there should have been a
>> link to libcurl and wasn't?
>> 
>> I've searched the list and see only the Inline versioning problem (which I
>> had and fixed). Has anybody seen this problem before or built the module
>> on a Mac? How did you do it? Is this a question for the Staden list on
>> sourceforge?
>> 
>> Thanks
>> 
>> Mike
>
> Haven't seen the problem you list.  I have installed it on Mac OS X (intel) 
> w/o problems so I know it works; at least all tests passed though I remember 
> Inline complaining for some reason.
>
> You should try using bioperl-ext from CVS (it is really 1.5.1 but with 
> updated docs and maybe a change or two).  The process is a little tricky but 
> is documented in the README in the package.  You'll need the old io_lib 
> (1.8.12 or earlier) from Staden if memory serves.
>
> chris
>

Thanks, I'll give that a try.

Mike


From alexl at users.sourceforge.net  Fri Aug 17 11:23:33 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Fri, 17 Aug 2007 08:23:33 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	(Kevin Brown's message of "Fri\, 17 Aug 2007 08\:11\:40 -0700")
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
Message-ID: <n9ir7e18y2.fsf@allele2.localdomain>

>>>>> "KB" == Kevin Brown  writes:

[...]

>> Also note that since Perl's license is a dual-license "GPL or
>> Artistic" then people aren't required to submit their modifications
>> back to the bioperl distribution because they can choose to follow
>> the Artistic (rather than the GPL) license which doesn't require
>> modifications to be submitted back.  This means the point:

KB> You aren't required to submit patches even under the GPL.  If I
KB> make changes and don't distribute them then I have no requirement
KB> to reveal my changes to the bioperl source code.  Also the GPL
KB> does not require that the code be made freely available to all,
KB> just that users of GPL'd software can request the source from the
KB> vendor/distributor and should not find lots of little hoops to
KB> jump through to get it.  You can even charge to get access if that
KB> charge is to cover the cost of the expense to get it (such as the
KB> cost of a cd + mail delivery charge).

Sure, I was just pointing out that you can avoid even these things if
you choose the Artistic license.  I have no problem with the GPL, but
some people do.  The other possibility (if the current Perl "GPL or
Artistic" is not a possibility) is simply upgrading to the "Artistic
2.0" license adopted by the Perl Foundation for Perl 6 and later (I
think?):

http://www.perlfoundation.org/artistic_license_2_0

it's a GPL-compatible free software license.

Alex


From Kevin.M.Brown at asu.edu  Fri Aug 17 11:11:40 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 17 Aug 2007 08:11:40 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <nrsl6i1ub4.fsf@allele2.localdomain>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
Message-ID: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>

> AL> I assume that the intent of the bioperl authors is to license with
> AL> the same terms as Perl's *current* license (which would mean bioperl
> AL> is really effectively dually-licensed under the GPL or Artistic
> AL> license).  If so, it would be good if the LICENSE text and the wiki
> AL> were updated to reflect this.
>
> Also note that since Perl's license is a dual-license "GPL or
> Artistic" then people aren't required to submit their modifications
> back to the bioperl distribution because they can choose to follow
> the Artistic (rather than the GPL) license which doesn't require
> modifications to be submitted back.  This means the point:

You aren't required to submit patches even under the GPL.  If I make
changes and don't distribute them then I have no requirement to reveal
my changes to the bioperl source code.  Also the GPL does not require
that the code be made freely available to all, just that users of GPL'd
software can request the source from the vendor/distributor and should
not find lots of little hoops to jump through to get it.  You can even
charge to get access if that charge is to cover the cost of the expense
to get it (such as the cost of a cd + mail delivery charge).


From cjfields at uiuc.edu  Fri Aug 17 12:07:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 17 Aug 2007 11:07:47 -0500
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <n9ir7e18y2.fsf@allele2.localdomain>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
Message-ID: <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>


On Aug 17, 2007, at 10:23 AM, Alex Lancaster wrote:

>>>>>> "KB" == Kevin Brown  writes:
>
> [...]
>
>>> Also note that since Perl's license is a dual-license "GPL or
>>> Artistic" then people aren't required to submit their modifications
>>> back to the bioperl distribution because they can choose to follow
>>> the Artistic (rather than the GPL) license which doesn't require
>>> modifications to be submitted back.  This means the point:
>
> KB> You aren't required to submit patches even under the GPL.  If I
> KB> make changes and don't distribute them then I have no requirement
> KB> to reveal my changes to the bioperl source code.  Also the GPL
> KB> does not require that the code be made freely available to all,
> KB> just that users of GPL'd software can request the source from the
> KB> vendor/distributor and should not find lots of little hoops to
> KB> jump through to get it.  You can even charge to get access if that
> KB> charge is to cover the cost of the expense to get it (such as the
> KB> cost of a cd + mail delivery charge).
>
> Sure, I was just pointing out that you can avoid even these things if
> you choose the Artistic license.  I have no problem with the GPL, but
> some people do.  The other possibility (if the current Perl "GPL or
> Artistic" is not a possibility) is simply upgrading to the "Artistic
> 2.0" license adopted by the Perl Foundation for Perl 6 and later (I
> think?):
>
> http://www.perlfoundation.org/artistic_license_2_0
>
> it's a GPL-compatible free software license.
>
> Alex

Switching to Artistic 2.0 is probably the best way to go.  We'll need  
a more involved discussion but I don't think there'll be too many  
objections.  You mention GPL-compatibility; is that for v2 and v3?

chris


From gonzaled at tcd.ie  Fri Aug 17 13:03:35 2007
From: gonzaled at tcd.ie (David Gonzalez)
Date: Fri, 17 Aug 2007 18:03:35 +0100
Subject: [Bioperl-l] Bio::SeqIO::swiss species parsing bug?
Message-ID: <46C5D4E7.6000605@tcd.ie>

	Hi,

	I had a problem with a swissprot file in which the genus and species
were being left undefined, and I believe it could be a bug in the
swiss.pm module.


	When I tried to parse the file with Bio::SeqIO, I got the following
error messages:

Use of uninitialized value in pattern match (m//) at
/sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 965, <GEN0> line 12.
Use of uninitialized value in string eq at
/sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 967, <GEN0> line 12.

	The fields I wanted from the file (gene_id, etc.) were fine, however,
so the file was being parsed.

	I checked the output with Data::Dumper and I found the following in the
species entry; the species is left undefined, and the common name is absent.

        'species' => bless( {
                              '_ncbi_taxid' => 'Not',
                              '_classification' => [
                                                     undef,
                                                     undef,
                                                     'Aedes',
                                                     'Culicini',
                                                     'Culicinae',
                                                     'Culicidae',
                                                     'Culicoidea',
                                                     'Nematocera',
                                                     'Diptera',
                                                     'Endopterygota',
                                                     'Neoptera',
                                                     'Pterygota',
                                                     'Insecta',
                                                     'Hexapoda',
                                                     'Arthropoda',
                                                     'Metazoa',
                                                     'Eukaryota'
                                                   ]
                            }, 'Bio::Species' ),

	The species line in the file is formatted according to the swissprot
specifications and includes a common name:

OS   Aedes aegypti (yellow fever mosquito)
OC   Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; Neoptera;
OC   Endopterygota; Diptera; Nematocera; Culicoidea; Culicidae; Culicinae;
OC   Culicini; Aedes.
OX   NCBI_TaxID=Not defined;

	I think the problem is on line 905 of the swiss.pm file:

902     if(/^OS\s+(\S.+)/ && (! defined($binomial))) {
903         $osline .= " " if $osline;
904         $osline .= $1;
905         if($osline =~ s/(,|, and|\.)$//) {
906             ($binomial, $descr) = $osline =~ /(\S[^\(]+)(.*)/;
907             ($ns_name) = $binomial;
908             $ns_name =~ s/\s+$//; #####


	The problem seems to be that the OS line has no trailing punctuation,
so the substitution on line 905 returns false and the code inside that
"if" (which sets the binomial) never runs. I don't think the swissprot
format requires the line to end in '.', although it normally does. After
simply removing the requirement that the substitution succeed, the
output of Data::Dumper looked normal:

	....
	'_common_name' => 'yellow fever mosquito',
        '_ncbi_taxid' => 'Not',
        '_classification' => [
                              'aegypti',
                              'Aedes',
                              'Culicini',
	....
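
	The change described above can be sketched as a standalone function.
This is not the actual swiss.pm code: parse_os_line and the hash keys it
returns are made-up names for illustration, and only the two regexes are
taken from the excerpt. The one difference is that the trailing-punctuation
substitution is applied if it matches, instead of being a condition for the
rest of the parsing.

```perl
use strict;
use warnings;

# Hypothetical sketch: split the text of a Swiss-Prot OS line into
# genus, species and common name. Trailing punctuation is optional.
sub parse_os_line {
    my ($osline) = @_;
    $osline =~ s/(,|, and|\.)$//;    # strip trailing punctuation if present
    my ($binomial, $descr) = $osline =~ /(\S[^\(]+)(.*)/;
    return unless defined $binomial;
    $binomial =~ s/\s+$//;
    # the common name, if any, is the first parenthesised group
    my ($common) = defined $descr ? $descr =~ /\(([^)]+)\)/ : ();
    my ($genus, $species) = split ' ', $binomial, 2;
    return { genus => $genus, species => $species, common_name => $common };
}

my $demo = parse_os_line('Aedes aegypti (yellow fever mosquito)');
print join(' | ', $demo->{genus}, $demo->{species}, $demo->{common_name}), "\n";
```

	With this ordering, an OS line gives the same result with or without
a final '.'.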

	I am using the fink installed bioperl:
	bioperl-pm586   1.4-5   Perl module for biology

	I don't know if this has been reported/solved in the newer versions of
bioperl.

	David

-- 
David Gonzalez Knowles
Smurfit Institute of Genetics
Trinity College
Dublin


From cjfields at uiuc.edu  Fri Aug 17 13:20:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 17 Aug 2007 12:20:21 -0500
Subject: [Bioperl-l] Bio::SeqIO::swiss species parsing bug?
In-Reply-To: <46C5D4E7.6000605@tcd.ie>
References: <46C5D4E7.6000605@tcd.ie>
Message-ID: <04912FDE-2AA4-414C-9CE4-A0BA5E9C89C9@uiuc.edu>


On Aug 17, 2007, at 12:03 PM, David Gonzalez wrote:

> 	Hi,
>
> 	I had a problem with a swissprot file in which the genus and species
> were being left undefined, and I believe it could be a bug in the
> swiss.pm module.
>
>
> 	When I tried to parse the file with Bio::SeqIO, I got the following
> error messages:
>
> Use of uninitialized value in pattern match (m//) at
> /sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 965, <GEN0> line 12.
> Use of uninitialized value in string eq at
> /sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 967, <GEN0> line 12.
> ...
> 	I am using the fink installed bioperl:
> 	bioperl-pm586   1.4-5   Perl module for biology
>
> 	I don't know if this has  been reported/solved in the newer  
> versions of
> bioperl.
>
> 	David
>
> -- 
> David Gonzalez Knowles
> Smurfit Institute of Genetics
> Trinity College
> Dublin

That looks like bioperl 1.4, which is several years old.  You should  
update to the latest official release (1.5.2), then see if the  
problem persists.

chris


From alexl at users.sourceforge.net  Sat Aug 18 07:33:34 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Sat, 18 Aug 2007 04:33:34 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu> (Chris Fields's
	message of "Fri\, 17 Aug 2007 11\:07\:47 -0500")
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
	<3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>
Message-ID: <8td4xlyt4h.fsf@allele2.localdomain>

>>>>> "CF" == Chris Fields  writes:

[...]

>> Sure, I was just pointing out that you can avoid even these things
>> if you choose the Artistic license.  I have no problem with the
>> GPL, but some people do.  The other possibility (if the current
>> Perl "GPL or Artistic" is not a possibility) is simply upgrading to
>> the "Artistic 2.0" license adopted by the Perl Foundation for Perl
>> 6 and later (I think?):

>> http://www.perlfoundation.org/artistic_license_2_0

>> it's a GPL-compatible free software license.

CF> Switching to Artistic 2.0 is probably the best way to go.  We'll
CF> need a more involved discussion but I don't think there'll be too
CF> many objections.  You mention GPL-compatibility; is that for v2
CF> and v3?

IANAL, but looking at:

http://www.perlfoundation.org/artistic_2_0_notes

http://www.gnu.org/licenses/license-list.html (scroll down to
"Artistic 2.0")

it looks like you can choose any GPL license (i.e. v1 to v3).

I was really more concerned with clarifying what the bioperl license
was *right now*, because "the same license as Perl" implies the
so-called "disjunctive" "GPL or Artistic license":

http://www.gnu.org/licenses/license-list.html#PerlLicense

which is what I've marked the Fedora package as (since it listed "the
same license as Perl" in most of the source files), which is fine for
Fedora.

Fedora may possibly (still under discussion I believe) require removal
of any package that is licensed under the original (1.0) Artistic
alone and it would be a real shame if that required bioperl being
pulled from the repo.  I imagine the intent of the bioperl
contributors is that it should be under the same terms as Perl,
whatever that happens to be (which just happens to be GPL or Artistic,
which is fine).  A clarification to that effect would be useful.

Cheers,
Alex


From zhaodj at ioz.ac.cn  Sat Aug 18 11:06:41 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Sat, 18 Aug 2007 23:06:41 +0800 (CST)
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
In-Reply-To: <46C5781F.60301@sheffield.ac.uk>
References: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
	<46C5781F.60301@sheffield.ac.uk>
Message-ID: <52869.159.226.67.49.1187449601.squirrel@mail.ioz.ac.cn>

Thank you, Nathan.
The Deobfuscator is very helpful.

On Fri, Aug 17, 2007 18:27, Nathan Haigh wrote:
> De-Jian,ZHAO wrote:
>> Dear list members,
>>
>> I have a question about the methods of bioperl objects.It is how and
>> where we can get the whole methods of a bioperl object.
>>
>> Take Bio::Tools::Run::RemoteBlast for example. In the synopsis of
>> this object, some sample codes are given.The following five clauses
>> are excerpted from the synopsis.
>> (1)my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>> (2)while ( my @rids = $factory->each_rid ) {
>> (3)$factory->remove_rid($rid);
>> (4)my $rc = $factory->retrieve_blast($rid);
>> (5)my $r = $factory->submit_blast($input);
>>
>> The five clauses use five methods of the RemoteBlast object,i.e.
>> (1)new, (2)each_rid, (3)remove_rid,(4)retrieve_blast,and
>> (5)submit_blast. However,I only find part of them(45) are listed in
>> the appendix while others(123) are absent. Are there some more
>> methods not explictly declared? I don't know.This will lead to the
>> partial understanding and utilization of the module.Therefore I come
>> here for the way to get the full methods of a bioperl object.
>>
>> Thanks!
>>
>
>
> You should check out the Deobfuscator at:
> http://bioperl.org/cgi-bin/deob_interface.cgi
>
> Search and choose the object of choice. e.g.
> Bio::Tools::Run::RemoteBlast
>
> You will be provided a list of methods available to that object,
> including all the methods up the inheritance hierarchy. Unfortunately,
> some bioperl modules are documented more thoroughly than others.
>
> Nath
>


From hlapp at gmx.net  Sat Aug 18 12:13:28 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 18 Aug 2007 12:13:28 -0400
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <8td4xlyt4h.fsf@allele2.localdomain>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
	<3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>
	<8td4xlyt4h.fsf@allele2.localdomain>
Message-ID: <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net>


On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote:

> I imagine the intent of the bioperl
> contributors is that it should be under the same terms as Perl,
> whatever that happens to be (which just happens to be GPL or Artistic,
> which is fine).

I fully agree.

>   A clarification to that effect would be useful.

Agreed, too. Would you mind changing that language on the wiki, since  
you seem to have a fairly good grasp on the issue?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Aug 18 12:42:04 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 18 Aug 2007 11:42:04 -0500
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
	<3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>
	<8td4xlyt4h.fsf@allele2.localdomain>
	<8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net>
Message-ID: <D3B67BC2-CB56-420F-B4E3-E0A57FEA7E80@uiuc.edu>


On Aug 18, 2007, at 11:13 AM, Hilmar Lapp wrote:

>
> On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote:
>
>> I imagine the intent of the bioperl
>> contributors is that it should be under the same terms as Perl,
>> whatever that happens to be (which just happens to be GPL or  
>> Artistic,
>> which is fine).
>
> I fully agree.
>
>>   A clarification to that effect would be useful.
>
> Agreed, too. Would you mind changing that language on the wiki, since
> you seem to have a fairly good grasp on the issue?
>
> 	-hilmar

Looks like the modules mostly state 'You may distribute this module  
under the same terms as perl itself', but there are likely a few  
which need to be changed.  Might be worth running a quick code audit  
to see what's present.

chris


From avilella at gmail.com  Sat Aug 18 16:38:10 2007
From: avilella at gmail.com (Albert Vilella)
Date: Sat, 18 Aug 2007 21:38:10 +0100
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
In-Reply-To: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
References: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
Message-ID: <358f4d650708181338s5a5caadbscfa85786327f4304@mail.gmail.com>

I particularly like to code and debug at the same time. When you are using
the perl debugger, you can run:

  DB<1> m $object

and it will list the methods that can be called on that object.
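
Outside the debugger, a similar listing can be produced by walking the
class's @ISA chain and reading each package's symbol table. The following
is only a rough sketch: linear_isa, list_methods and the Demo:: packages
are made-up names for illustration. It reports every sub it finds in those
packages (including any imported functions) and does not look at UNIVERSAL
or AUTOLOAD.

```perl
use strict;
use warnings;

# Depth-first, left-to-right walk of the inheritance tree.
sub linear_isa {
    my ($class, $seen) = @_;
    $seen ||= {};
    return () if $seen->{$class}++;
    no strict 'refs';
    return ($class, map { linear_isa($_, $seen) } @{"${class}::ISA"});
}

# Return "Package::method" for every sub reachable from an object or class,
# reporting the first package in the chain that defines each name.
sub list_methods {
    my ($thing) = @_;
    my $class = ref($thing) || $thing;
    my %found;
    for my $pkg (linear_isa($class)) {
        no strict 'refs';
        for my $name (keys %{"${pkg}::"}) {
            next if exists $found{$name};
            $found{$name} = $pkg if defined &{"${pkg}::${name}"};
        }
    }
    return map { $found{$_} . '::' . $_ } sort keys %found;
}

# Small demo hierarchy (hypothetical classes).
{
    package Demo::Animal;
    sub new   { return bless {}, shift }
    sub name  { return 'animal' }
    sub speak { return '...' }
}
{
    package Demo::Dog;
    our @ISA = ('Demo::Animal');
    sub speak { return 'Woof' }
    sub fetch { return 'stick' }
}

print "$_\n" for list_methods(Demo::Dog->new);
```

For a BioPerl object the same call would be list_methods($factory), after
the relevant modules have been loaded.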

Cheers,

    Albert.

On 8/17/07, De-Jian,ZHAO <zhaodj at ioz.ac.cn> wrote:
>
> Dear list members,
>
> I have a question about the methods of bioperl objects.It is how and
> where we can get the whole methods of a bioperl object.
>
> Take Bio::Tools::Run::RemoteBlast for example. In the synopsis of
> this object, some sample codes are given.The following five clauses
> are excerpted from the synopsis.
> (1)my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> (2)while ( my @rids = $factory->each_rid ) {
> (3)$factory->remove_rid($rid);
> (4)my $rc = $factory->retrieve_blast($rid);
> (5)my $r = $factory->submit_blast($input);
>
> The five clauses use five methods of the RemoteBlast object,i.e.
> (1)new, (2)each_rid, (3)remove_rid,(4)retrieve_blast,and
> (5)submit_blast. However,I only find part of them(45) are listed in
> the appendix while others(123) are absent. Are there some more
> methods not explictly declared? I don't know.This will lead to the
> partial understanding and utilization of the module.Therefore I come
> here for the way to get the full methods of a bioperl object.
>
> Thanks!
> --
> De-Jian Zhao
> Institute of Zoology,Chinese Academy of Sciences
> +86-10-64807217
> zhaodj at ioz.ac.cn
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From neetisomaiya at gmail.com  Mon Aug 20 00:33:17 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 20 Aug 2007 10:03:17 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C5A405.2070005@sendu.me.uk>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
	<764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
	<764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
Message-ID: <764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>

Hi,

Thanks for the responses.
Another question: I am interested only in the PDB ID and title. For this I
am currently downloading and unzipping each full PDB structure file and
parsing it to get just the ID and title. Is there any other data source
that can give me just the ID and title of PDB structures, without my having
to download the full file for each structure?
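
For files that are already on disk, one lighter-weight option is to read
only the HEADER and TITLE records and then stop, rather than parsing the
whole structure. A minimal sketch, assuming the fixed-column PDB layout (ID
code in columns 63-66 of HEADER, title text in columns 11-80 of TITLE); the
function name and the sample record contents below are made up for
illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return (id, title) from an open PDB-format filehandle,
# reading no further than the end of the TITLE records.
sub pdb_id_and_title {
    my ($fh) = @_;
    my ($id, @title_parts);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^HEADER/) {
            # ID code sits in columns 63-66 (offset 62, length 4)
            $id = substr($line, 62, 4) if length($line) >= 66;
        }
        elsif ($line =~ /^TITLE /) {
            # title text starts at column 11; columns 9-10 hold a
            # continuation number on the second and later records
            my $part = length($line) > 10 ? substr($line, 10, 70) : '';
            $part =~ s/^\s+|\s+$//g;
            push @title_parts, $part;
        }
        elsif (defined $id && @title_parts) {
            last;    # past the TITLE records, nothing more is needed
        }
    }
    my $title = join ' ', @title_parts;
    $title =~ s/\s+/ /g;
    return ($id, $title);
}

# Made-up sample records, built with sprintf so the columns line up.
my $sample = join "\n",
    sprintf('%-6s    %-40s%-9s   %-4s', 'HEADER', 'EXAMPLE CLASS', '01-JAN-00', '1ABC'),
    sprintf('%-6s  %2s%s', 'TITLE', '',  'EXAMPLE STRUCTURE OF A'),
    sprintf('%-6s  %2s%s', 'TITLE', '2', ' HYPOTHETICAL PROTEIN'),
    'COMPND    MOL_ID: 1;',
    '';

open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my ($id, $title) = pdb_id_and_title($fh);
close $fh;
print "$id\t$title\n";
```

This does not avoid the download itself, only the cost of building a full
structure object for each file.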

Thanks,
Neeti.

On 8/17/07, Sendu Bala <bix at sendu.me.uk> wrote:
>
> neeti somaiya wrote:
> > Hi,
> >
> > I have done it currently as follows :
> [snip]
> > Is this ok?
>
> If it works, of course. There seems to be some redundant code there,
> however. I'm guessing this would be better (assuming your code worked in
> the first place):
>
> while (my $struc = $in->next_structure()) {
>      my $pdb_id = $struc->id;
>      print "Structure ", $pdb_id,"\n";
>
>      my $ac = $struc->annotation();
>      my ($title) = $ac->get_Annotations('title');
>      $title = $title->as_text;
>      chomp($title);
>      if ($title =~ /Value\: (.*)/) {
>          $title = $1;
>      }
>      $title =~ s/\s+/ /g;
>
>      print "Title ",$title,"\n";
> }
>


-- 
-Neeti
Even my blood says, B positive


From jaudall at gmail.com  Mon Aug 20 00:39:18 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Sun, 19 Aug 2007 21:39:18 -0700
Subject: [Bioperl-l] concatenating aln splices
Message-ID: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>

Based on several criteria, I've extracted several splices from a
single alignment and I'm trying to concatenate my selected sequences
together.  Unfortunately, one of the sequences in the original
alignment only has gap characters for one or more of the splices.  I'd
like to keep the gap splices because other downstream aligned bases
depend on them.  I get these two warning messages splicing my
alignments together:

-------------------- WARNING ---------------------
MSG: Got a sequence with no letters in it cannot guess alphabet []
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Slice [232-233] of sequence [X2A/1-202] contains no residues.
Sequence excluded from the new alignment.
---------------------------------------------------

and now because of missing gaps, I get this error when trying to
concatenate them:

-------------------- WARNING ---------------------
MSG: expecting 236 not 203 from X2A
---------------------------------------------------

------------- EXCEPTION  -------------
MSG: All sequences in the alignment must be the same length
STACK Bio::AlignIO::phylip::write_aln
/sw/lib/perl5/5.8.6/Bio/AlignIO/phylip.pm:292

I don't mind the warnings, in fact I like them, but is there a way to
stop the splice function from removing the 'gap' sequence from the
alignment?  Perhaps catching the warning and inserting the gaps
afterwards might work, but I'm wondering if there is a simpler
modification of SimpleAlign.pm that might work.  Any thoughts?

Josh


From bix at sendu.me.uk  Mon Aug 20 03:43:45 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 20 Aug 2007 08:43:45 +0100
Subject: [Bioperl-l] concatenating aln splices
In-Reply-To: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>
References: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>
Message-ID: <46C94631.2060704@sendu.me.uk>

Joshua Udall wrote:
> Based on several criteria, I've extracted several splices from a
> single alignment and I'm trying to concatenate my selected sequences
> together.  Unfortunately, one of the sequences in the original
> alignment only has gap characters for one or more of the splices.  I'd
> like to keep the gap splices because other downstream aligned bases
> depend on them.
[snip]
> I don't mind the warnings, in fact I like them, but is there a way to
> stop the splice function from removing the 'gap' sequence from the
> alignment?  Perhaps catching the warning and inserting the gaps
> afterwards might work, but I'm wondering if there's is a simpler
> modification of SimpleAlign.pm that might work.  Any thoughts?

Let us see some code, so we can get a better idea of what you're doing 
and what you've tried.

You can avoid losing sequences during a slice by not doing a slice. 
Instead, use remove_columns(). This way you don't have to splice alignments 
together; you go from the original alignment to the 'spliced' version in
one step.
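
As a rough sketch of that approach: given the regions you want to keep,
work out the complementary column ranges and hand those to
remove_columns(). The helper name columns_to_remove and the example ranges
are made up for illustration. The commented-out call at the end assumes
remove_columns() takes [start, end] pairs with the first column numbered 0,
as its documentation describes; check that against your installed version.

```perl
use strict;
use warnings;

# Hypothetical helper: given the alignment length and the 1-based inclusive
# column ranges to KEEP, return the 0-based inclusive ranges to REMOVE.
sub columns_to_remove {
    my ($aln_length, @keep) = @_;
    my @remove;
    my $next = 1;    # first 1-based column not yet accounted for
    for my $range (sort { $a->[0] <=> $b->[0] } @keep) {
        my ($start, $end) = @$range;
        push @remove, [ $next - 1, $start - 2 ] if $start > $next;
        $next = $end + 1 if $end + 1 > $next;
    }
    push @remove, [ $next - 1, $aln_length - 1 ] if $next <= $aln_length;
    return @remove;
}

# Keep columns 3-5 and 10-12 of a 20-column alignment.
my @remove = columns_to_remove(20, [3, 5], [10, 12]);
print join(', ', map { sprintf('[%d,%d]', @$_) } @remove), "\n";

# With a Bio::SimpleAlign object in $aln (not constructed here):
#   my $kept = $aln->remove_columns(@remove);
```

Because nothing is sliced out and re-joined, every sequence keeps its row in
the result, including one that is all gaps in some region.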


From Oliver.Wafzig at sygnis.de  Mon Aug 20 04:42:55 2007
From: Oliver.Wafzig at sygnis.de (Oliver Wafzig)
Date: Mon, 20 Aug 2007 10:42:55 +0200
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
Message-ID: <200708201042.55292.Oliver.Wafzig@sygnis.de>

On Monday 20 August 2007 06:33, neeti somaiya wrote:
> Another question I had was, I am interested only in pdb id and title, and
> for this I am downloading and unzipping each of the full pdb structure
> files, parsing to get just id and title. Is there any other data source

Hi Neeti,
this is a non bioperl way to download the data.
Use the SRS server on the EBI page to download only id and title lines from 
pdb.

1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
2) Search for 'PDB' on the 'library page' and select it.
3) Use the standard query form. Select 'id' in the dropdown list and 
insert '*' (wildcard).
4) Create a view by selecting 'ID' and 'Title', then click the search button.
5) Click the save results button.
6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of entries 
to download' field. Press 'save'.

If the download is slow, read the 'download tips' on the download page and 
split the results in chunks. 

-- 
Oliver


From neetisomaiya at gmail.com  Mon Aug 20 09:05:01 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 20 Aug 2007 18:35:01 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <200708201042.55292.Oliver.Wafzig@sygnis.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
Message-ID: <764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>

Thanks for your response.
Actually I am looking for something standalone and not on the web, as in
something which I can download onto my machine and parse later to get id and
title.
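
For the local parsing step, a minimal sketch in plain Perl, assuming the
files follow the fixed-column PDB format (idCode in HEADER columns 63-66,
title text in TITLE columns 11-80). pdb_id_and_title() is a hypothetical
helper and the sample entry below is invented, not a real structure.

```perl
use strict;
use warnings;

# Hypothetical helper: read the idCode (HEADER, columns 63-66) and the title
# (TITLE records, columns 11-80) from PDB-format text on a filehandle.
sub pdb_id_and_title {
    my ($in) = @_;
    my ($code, @parts);
    while (my $line = <$in>) {
        chomp $line;
        if ($line =~ /^HEADER/) {
            $code = substr($line, 62, 4) if length($line) >= 66;
        }
        elsif ($line =~ /^TITLE /) {
            my $part = length($line) > 10 ? substr($line, 10) : '';
            $part =~ s/\s+$//;
            push @parts, $part;    # continuation lines start with a space
        }
        elsif (@parts) {
            last;                  # past the TITLE block, stop reading
        }
    }
    my $text = join '', @parts;
    $text =~ s/^\s+//;
    $text =~ s/\s+/ /g;
    return ($code, $text);
}

# Invented sample entry, laid out in the PDB column positions.
my $sample = join "\n",
    sprintf('%-10s%-40s%-9s   %-4s', 'HEADER', 'HYDROLASE', '01-JAN-00', '1ABC'),
    sprintf('%-10s%s', 'TITLE', 'EXAMPLE TITLE THAT RUNS OVER'),
    sprintf('TITLE   %2d%s', 2, ' TWO LINES'),
    'COMPND    MOL_ID: 1;',
    '';

open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
my ($id, $title) = pdb_id_and_title($fh);
close $fh;
print "$id\t$title\n";
```

The loop stops at the first record after the TITLE block, so only the top
few lines of each file are read.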



-- 
-Neeti
Even my blood says, B positive


From bernd at kirx.de  Mon Aug 20 12:57:28 2007
From: bernd at kirx.de (Bernd Mueller)
Date: Mon, 20 Aug 2007 18:57:28 +0200
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
Message-ID: <46C9C7F8.3020608@kirx.de>

Hello,

Maybe you wanna try the Bio::DB::EUtilities module from BioPerl. It is
described at http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

I tried it for a similar search on PubMed but without any reasonable
results because my target was too focused.

The EUtilities documentation at
http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.section.EntrezHelp.The_Databases
says:

"Protein Database

The Protein database contains sequence data from the translated coding 
regions from DNA sequences in GenBank, EMBL, and DDBJ as well as protein 
sequences submitted to Protein Information Resource (PIR), SWISS-PROT, 
Protein Research Foundation (PRF), and Protein Data Bank (PDB) 
(sequences from solved structures). "

So PDB is included in eutilities from NCBI.

Regards,
Bernd


-- 
Dipl.-Inform.(FH)
Bernd Mueller
phone: +49 179 2336692
email: bernd at kirx.de


From neetisomaiya at gmail.com  Mon Aug 20 13:39:01 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 20 Aug 2007 23:09:01 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C9C7F8.3020608@kirx.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
Message-ID: <764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>

Hi,

Thanks for all the responses.
I got the solution from the RCSB people:

Dear Dr. Somaiya,

Thank you for your email message.

Please try the following:
1) Go to http://www.pdb.org/pdb/statistics/holdings.do and select the
number in the bottom right corner of the table (currently 45213).
2) From the menu on the left select 'Tabulate'>>'Custom Report' and
under 'Primary Citation' select 'Title'
3) At the bottom, select 'Create Report' and then one of the 'Download'
options.

Please let us know if we can be of additional assistance.

Sincerely,
Rachel Green



-- 
-Neeti
Even my blood says, B positive


From jaudall at gmail.com  Mon Aug 20 14:30:26 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Mon, 20 Aug 2007 12:30:26 -0600
Subject: [Bioperl-l] concatenating aln splices
In-Reply-To: <46C94631.2060704@sendu.me.uk>
References: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>
	<46C94631.2060704@sendu.me.uk>
Message-ID: <52cea20c0708201130u29af2e10w78a852d7f88c23d1@mail.gmail.com>

Thanks, Sendu!  That suggestion was exactly what I needed.  I have it worked
out now with the remove_columns function.  Much easier that way :)

Josh



From cjfields at uiuc.edu  Mon Aug 20 14:51:14 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 20 Aug 2007 13:51:14 -0500
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C9C7F8.3020608@kirx.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
Message-ID: <4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>

Just curious, but what kind of query were you trying?  It might be  
worth trying to work through it to add as an example to the cookbook  
page.

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bernd at kirx.de  Mon Aug 20 15:03:29 2007
From: bernd at kirx.de (Bernd Mueller)
Date: Mon, 20 Aug 2007 21:03:29 +0200
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>
Message-ID: <46C9E581.1010907@kirx.de>

I attached my script.

Actually, I tried to download all articles for a certain search term with
that script. The problem was that the retrieved documents were not free
as mentioned in the documentation of EUtilities on the NCBI page. So
many of the downloaded documents in XML format were just dummies
containing only the abstract but not the full-text article.

Bernd


-- 
Dipl.-Inform.(FH)
Bernd Mueller
phone: +49 179 2336692
email: bernd at kirx.de


-------------- next part --------------
A non-text attachment was scrubbed...
Name: myBioPerl.pl
Type: application/x-perl
Size: 1983 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070820/af579f0a/attachment-0003.bin>

From jayoung at fhcrc.org  Mon Aug 20 18:09:04 2007
From: jayoung at fhcrc.org (Janet Young)
Date: Mon, 20 Aug 2007 15:09:04 -0700
Subject: [Bioperl-l] Assembly::IO write_assembly and remove_seq
Message-ID: <EE800ED8-52E7-4D80-A18F-EDBABB90056C@fhcrc.org>

Hi all,

I realized last week that write_assembly isn't implemented in
Bio::Assembly::IO
(see http://bioperl.org/pipermail/bioperl-l/2006-May/021619.html ).
I know this has been asked before, but I wondered if anything has
changed: does anyone have any plans to write a write_assembly
method? Alternatively, any suggestions for another way to do
what I'm trying to do?

I'm trying to write a script to make improvements to the assembly  
that phredPhrap comes out with - it seems to quite frequently throw  
an unrelated sequence into a contig with either no matching sequence  
at all, or very little matching sequence. Mysterious. Anyway, my  
script can recognize the bad sequences easily enough, and thought I'd  
be able to remove them and then write the modified assembly. No joy.  
One very inelegant solution I've played with is that I can add some  
"markedHighQuality" tags to the discrepant sequences in the ace file,  
meaning that next time phredPhrap is run, it sometimes manages not to  
assemble the sequences that shouldn't be there. I'm not sure this  
will work in all cases, and it seems like quite an unsatisfactory way  
to do it.

For the same reason, I'm hoping someone can tell me what remove_seq  
does to a contig object? I'm using it and I don't get any error  
messages (returns 1), but when I check the contig object afterwards  
with get_seq_ids, the sequence I wanted to remove didn't seem to go  
away. Also, when I check out the primary_tags for that contig in the  
objects returned by get_features_collection, nothing seems to have  
changed. So I'm not sure whether the sequence really was removed from  
anything at all, and if it was, which object did it get removed  
from?  (a snippet of my code is below)
           my @seqids  = $contig->get_seq_ids();
           print OUT "seqids @seqids\n";
           my $seqobj = $contig->get_seq_by_name($seq);
           $contig->remove_seq($seqobj) || die "failed to remove seq\n";
           @seqids  = $contig->get_seq_ids();
           print OUT "seqids @seqids\n";

thanks for any advice,

Janet Young


-------------------------------------------------------------------

Dr. Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung at fhcrc.org

http://www.fhcrc.org/labs/trask/

-------------------------------------------------------------------


From cjfields at uiuc.edu  Tue Aug 21 00:06:26 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 20 Aug 2007 23:06:26 -0500
Subject: [Bioperl-l] EUtilities, was Re:  PDB Parser
In-Reply-To: <46C9E581.1010907@kirx.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>
	<46C9E581.1010907@kirx.de>
Message-ID: <7BE17595-9BC0-498B-AFA9-03ED0C853BFC@uiuc.edu>

Bernd,

Just in case you weren't aware, I have changed several aspects of  
EUtilities since the 1.5.2 release, so any code in the HOWTO cookbook  
applies ONLY to the version found in CVS (there is a big note at the  
top stating such).  This should be the finalized API which I intend  
on supporting from this point on.  The reason I indicate that is  
there are several giveaways which indicate you are using the older  
API from 1.5.2 (using next_cookie, for instance).

The following modification of your script (using the API in
bioperl-live) works for me.  You should be able to do something similar
with the older API as well, but I haven't tried.  Note that PMC full-text
retrieval only works if the article is declared 'open-access'; not
all journals allow that.  Also, any full text is only available as
XML, which (I'm guessing here) is transformed to HTML for PMC.

....
# $db, $query and $fetch come from the part of the script not shown.
# Run the search and keep the results on the NCBI history server.
my $agent = Bio::DB::EUtilities->new(
    -eutil      => 'esearch',
    -db         => $db,
    -term       => $query,
    -usehistory => 'y',
);

my $ct = $agent->get_count;

print "Count = $ct\n";

my $history = $agent->next_History;

if ($fetch eq 'yes') {
    # Fetch one record per request and write each to its own file.
    my ($retmax, $retstart) = (1, 0);
    while ($retstart < $ct) {
        $agent->set_parameters(
            -eutil    => 'efetch',
            -history  => $history,
            -rettype  => 'xml',
            -retmax   => $retmax,
            -retstart => $retstart,
        );
        $agent->get_Response(-file => ">./papers/paper_$retstart.xml");
        $retstart += $retmax;
    }
}

------------------------------

It may also be possible to grab the LinkOut for these and try to nab  
the PDF or use the DOI, but I haven't tried anything like that.

chris

On Aug 20, 2007, at 2:03 PM, Bernd Mueller wrote:

> I attached my script.
>
> Actually I tried to download all articles to a certain search term  
> with
> that script. The problem was that the retrieved documents were not  
> free
> as mentioned in the documentation of EUtilities on the NCBI page. So
> many of the downloaded documents in xml-format were just dummies
> containing only the abstract but not the fulltext article.
>
> Bernd
>
> Chris Fields wrote:
>> Just curious, but what kind of query were you trying?  It might be  
>> worth trying to work through it to add as an example to the  
>> cookbook page.
>> chris


From n.haigh at sheffield.ac.uk  Tue Aug 21 04:19:59 2007
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 21 Aug 2007 09:19:59 +0100
Subject: [Bioperl-l] subversion progress
Message-ID: <46CAA02F.60803@sheffield.ac.uk>

Hi,

I was just wondering if there was any further progress towards the svn
migration recently? What is still needing to be done?

Cheers
Nath


From neetisomaiya at gmail.com  Tue Aug 21 05:41:22 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Tue, 21 Aug 2007 15:11:22 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
Message-ID: <764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>

Hi,

I want to automate my PDB script, starting with the data download. Following
RCSB's suggestion of a custom report with only PDB IDs and titles, I tried
something like the code below, but it doesn't seem to work:

use strict;
use warnings;
use LWP::Simple;

# Keep the URL free of embedded line breaks; join the pieces with '.'.
my $url = 'http://www.pdb.org/pdb/results/tabularReport.do'
        . '?reportTitle=CustomReport'
        . '&customReportColumns=VStructureSummary.structureId~VCitation.title'
        . '&format=csv';
my $content = get($url);
die "Couldn't get $url" unless defined $content;

Can anyone tell me how I can do this, whether there is another way to do it,
where I am going wrong, or whether it isn't possible at all in this case?

Please help.
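
Two things may be worth ruling out first: whitespace inside the URL string
(easy to introduce when a long URL is wrapped), and the fact that get() from
LWP::Simple returns undef without saying why. The sketch below is only an
assumption about where to look, not a confirmed fix. It swaps in the core
HTTP::Tiny module so the HTTP status is visible; normalize_url(),
fetch_or_explain() and the script name fetch_report.pl are hypothetical, and
whether the report URL works outside a browser session is not known here.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: remove line breaks and spaces that crept into a URL
# when it was wrapped in an editor or mail client.
sub normalize_url {
    my ($raw) = @_;
    $raw =~ s/\s+//g;
    return $raw;
}

# Hypothetical helper: fetch a URL and die with the HTTP status on failure.
sub fetch_or_explain {
    my ($target) = @_;
    my $res = HTTP::Tiny->new(timeout => 30)->get($target);
    die "GET $target failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    return $res->{content};
}

# The report URL from this thread, as it might look after wrapping.
my $url = normalize_url('
http://www.pdb.org/pdb/results/tabularReport.do?reportTitle=CustomReport&customReportColumns=
VStructureSummary.structureId~VCitation.title&format=csv');
print "$url\n";

# Only go to the network when asked to: perl fetch_report.pl --fetch
print fetch_or_explain($url) if @ARGV && $ARGV[0] eq '--fetch';
```

If the request fails, the status and reason in the error message should say
whether the server rejected it or the connection never got that far.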



-- 
-Neeti
Even my blood says, B positive


From cjfields at uiuc.edu  Tue Aug 21 10:40:03 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 21 Aug 2007 09:40:03 -0500
Subject: [Bioperl-l] subversion progress
In-Reply-To: <46CAA02F.60803@sheffield.ac.uk>
References: <46CAA02F.60803@sheffield.ac.uk>
Message-ID: <5C65BAED-61CF-4028-977E-0CD451FA2EC3@uiuc.edu>

Not sure myself, to tell the truth.  Pretty much everything was ready  
to go (i.e. svn commits work, commits post to bioperl-guts, etc.);  
the only possible exception was svn->cvs syncing.  I believe the  
decision for svn access is to stick with ssh only for now for  
simplicity's sake.  I may have to go back into the archives to  
refresh my memory on all the details...

I think a time for the switchover just has to be set so that  
everybody is adequately forewarned, and the docs for getting started  
on SVN need to be updated accordingly.

chris

On Aug 21, 2007, at 3:19 AM, Nathan Haigh wrote:

> Hi,
>
> I was just wondering if there was any further progress towards the svn
> migration recently? What is still needing to be done?
>
> Cheers
> Nath

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jwalker at watson.wustl.edu  Tue Aug 21 11:20:46 2007
From: jwalker at watson.wustl.edu (Jason Walker)
Date: Tue, 21 Aug 2007 10:20:46 -0500
Subject: [Bioperl-l] RemoteBlast not handling NCBI Error message
Message-ID: <46CB02CE.1080803@watson.wustl.edu>

I've noticed RemoteBlast does not handle a specific error message from 
NCBI correctly.  retrieve_blast() should return 0 if waiting, -1 on 
error, or the results when completed.  It looks like the method relies 
on a specific tag in the NCBI return,  'QBlastInfoBegin'.  The error 
message I'm getting does not have this tag or a value of 
'Status=ERROR'.  After contacting NCBI 'Blast-help', they stated that 
QBlastInfoBegin should not be expected from all GET requests.  The error 
can be reproduced by using RID CM2YJJW501R, until it expires tomorrow.

my $rid = 'CM2YJJW501R';
my $factory = Bio::Tools::Run::RemoteBlast->new( -verbose => 1,);
my $rc = $factory->retrieve_blast($rid);
print $rc ."\n";

The content returned from NCBI looks like:
<hr><font color="red">ERROR: An error has occurred on the server, Too 
many HSPs to save all
 Contact Blast-help at ncbi.nlm.nih.gov and include your RID: 
CM2YJJW501R</font><hr>

I added a conditional statement as seen below to correct my local copy.  
I'm not sure this is the best fix, but it works.
sub retrieve_blast {
    ...
    if( /QBlastInfoBegin/i ) {
        $s = 1;
    } elsif( $s ) {
        if( /Status=(WAITING|ERROR|READY)/i ) {
            ...
         }
    } elsif( /^(?:#\s)?[\w-]*?BLAST\w+/ ) {
        $waiting = 0;
        last;
    } elsif ( /ERROR/i ) {
        close($TMP);
        open(my $ERR, '<', $tempfile)
            or $self->throw("cannot open file $tempfile");
        $self->warn(join("", <$ERR>));
        close $ERR;
        return -1;
    }
    ...
}
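
To make the order of those checks easier to try out in isolation, here is a
minimal standalone sketch. The function name classify_qblast_line and its
return labels are hypothetical (they are not part of
Bio::Tools::Run::RemoteBlast), and the state handling is simplified to a
single flag that says whether a QBlastInfoBegin line has already been seen.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of RemoteBlast: classify one line of the
# content NCBI returns for an RID. $in_info is true once a QBlastInfoBegin
# line has been seen. The regexes are the ones from the patch above.
sub classify_qblast_line {
    my ($line, $in_info) = @_;
    return 'info_begin' if $line =~ /QBlastInfoBegin/i;
    if ($in_info) {
        return lc $1 if $line =~ /Status=(WAITING|ERROR|READY)/i;
        return 'other';
    }
    return 'report' if $line =~ /^(?:#\s)?[\w-]*?BLAST\w+/;
    return 'error'  if $line =~ /ERROR/i;   # the newly added branch
    return 'other';
}

# The bare HTML error from NCBI has no QBlastInfoBegin tag, so it is only
# caught by the last check.
print classify_qblast_line('<hr><font color="red">ERROR: An error has occurred', 0), "\n";
```

The point of the ordering is that the generic /ERROR/i test runs last, so it
cannot shadow the Status= handling or the start of a real report.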

Thanks,
Jason Walker


From cjfields at uiuc.edu  Tue Aug 21 12:15:36 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 21 Aug 2007 11:15:36 -0500
Subject: [Bioperl-l] RemoteBlast not handling NCBI Error message
In-Reply-To: <46CB02CE.1080803@watson.wustl.edu>
References: <46CB02CE.1080803@watson.wustl.edu>
Message-ID: <348D8645-5DC2-4606-9650-EB08D8053F3D@uiuc.edu>


On Aug 21, 2007, at 10:20 AM, Jason Walker wrote:

> I've noticed RemoteBlast does not handle a specific error message from
> NCBI correctly.  retrieve_blast() should return 0 if waiting, -1 on
> error, or the results when completed.  It looks like the method relies
> on a specific tag in the NCBI return,  'QBlastInfoBegin'.  The error
> message I'm getting does not have this tag or a value of
> 'Status=ERROR'.  After contacting NCBI 'Blast-help', they stated that
> QBlastInfoBegin should not be expected from all GET requests.  The  
> error
> can be reproduced by using RID CM2YJJW501R, until it expires tomorrow.
> ...
> I added a conditional statement as seen below to correct my local  
> copy.
> I'm not sure this is the best fix, but it works.
> sub retrieve_blast {
>     ...
>     if( /QBlastInfoBegin/i ) {
>         $s = 1;
>     } elsif( $s ) {
>         if( /Status=(WAITING|ERROR|READY)/i ) {
>             ...
>          }
>     } elsif( /^(?:#\s)?[\w-]*?BLAST\w+/ ) {
>         $waiting = 0;
>         last;
>     } elsif ( /ERROR/i ) {
>         close($TMP);
>         open(my $ERR, "<$tempfile") or $self->throw("cannot open file
> $tempfile");
>         $self->warn(join("", <$ERR>));
>         close $ERR;
>         return -1;
>     }
>     ...
> }
>
> Thanks,
> Jason Walker

I have added this to RemoteBlast in bioperl cvs.  Thanks for the notice!

chris


From bernd.web at gmail.com  Tue Aug 21 12:32:09 2007
From: bernd.web at gmail.com (Bernd Web)
Date: Tue, 21 Aug 2007 18:32:09 +0200
Subject: [Bioperl-l] SearchIO-BLAST
Message-ID: <716af09c0708210932m34bfb2a7o2094124a8832d705@mail.gmail.com>

Dear all,

Recently, I stumbled on something when parsing BLAST reports.  A ">aaa"
line got prepended to a plain-text BLAST report from NCBI. This
(FASTA-like header) changes the $result->hits array.
The number of hits is now 2*num_hits + 1. Clearly, this is related to
faulty input, but the effect of this one line is still large. Does anyone
see what is causing this, and should the BLAST parser maybe be
slightly more relaxed wrt pre/appended text? I have not yet seen why
this extra FASTA header line has such a "large" effect.

A short example BLASTN output is attached.
Example code is:

use Bio::SearchIO;
my $in = Bio::SearchIO->new(-format => 'blast',
                            -file   => 'apoe_plain.bls');
while( my $result = $in->next_result ) {
  print "Num of hits: ", $result->num_hits, "\n";
  my @hits = $result->hits;
  foreach my $el (@hits) {
    print $el->name, "\n";
  }
}


Kind regards,
Bernd
-------------- next part --------------
A non-text attachment was scrubbed...
Name: apoe_plain.bls
Type: application/octet-stream
Size: 7890 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070821/a367809e/attachment-0003.obj>

From cjfields at uiuc.edu  Tue Aug 21 17:53:44 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 21 Aug 2007 16:53:44 -0500
Subject: [Bioperl-l] SearchIO-BLAST
In-Reply-To: <716af09c0708210932m34bfb2a7o2094124a8832d705@mail.gmail.com>
References: <716af09c0708210932m34bfb2a7o2094124a8832d705@mail.gmail.com>
Message-ID: <59FF775C-8CAC-4947-A5BA-835ADD45CD32@uiuc.edu>

I can confirm this (I'm using bioperl-live).  The output I get is:

Num of hits: 9
ref|NM_000039.1|
ref|NT_113960.1|Hs22_111679
ref|NT_033899.7|Hs11_34054
ref|NW_925173.1|HsCraAADB02_444
ref|NM_000039.1|
ref|NT_113960.1|Hs22_111679
ref|NT_033899.7|Hs11_34054
ref|NW_925173.1|HsCraAADB02_444
ref|NW_925173.1|HsCraAADB02_444

The extra '>' is definitely throwing the event calls for a loop; the  
2x increase is b/c an extra iteration is started when '>' is  
encountered (changing the event handler reduces the number to 5).   
The extra hit is from the '>' at the beginning.

I hate to say it, but this is an instance where we can't be more  
flexible, primarily b/c '>' is a legit token the parser looks for (it  
is the beginning of the hit block in reports).  Finding it as the  
initial token in the report is also legitimate for some older BLAST  
output, so we also can't simply bypass it.  You'll unfortunately have  
to preparse the reports to get rid of those lines prior to feeding  
them to the BLAST text report parser.
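
A minimal sketch of such a preparse step, under the assumption that your
reports start with a program banner line such as "BLASTN 2.2.16 [...]" and
that any '>' line before that banner is junk. The helper name
strip_leading_headers is hypothetical, and the banner regex is a guess that
covers the common program names only. Do not use this on older reports where
a leading '>' is legitimate.

```perl
use strict;
use warnings;

# Hypothetical helper: drop lines that start with '>' and appear before the
# first line that looks like a BLAST program banner. Later '>' lines are
# kept, because they start the hit blocks the parser needs.
sub strip_leading_headers {
    my @lines = @_;
    my $seen_banner = 0;
    my @kept;
    for my $line (@lines) {
        $seen_banner = 1 if $line =~ /^T?BLAST[NPX]\b/;
        next if !$seen_banner && $line =~ /^>/;
        push @kept, $line;
    }
    return @kept;
}

my @report = (">aaa\n",
              "BLASTN 2.2.16 [Mar-25-2007]\n",
              "\n",
              ">ref|NM_000039.1| Homo sapiens\n");
my @clean = strip_leading_headers(@report);
print scalar(@clean), " lines kept\n";
```

The cleaned lines can then be written to a temporary file (or handed over
through a filehandle) before being passed to Bio::SearchIO.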

chris

On Aug 21, 2007, at 11:32 AM, Bernd Web wrote:

> Dear all,
>
> Recently, I stumbled on something with parsing BLAST reports.  To a
> plain text blast report from NCBI a ">aaa" got prepended. This
> (fasta-like header) changes the $result->hits array.
> The amount of hits is now 2*num_hits + 1. Clearly, this is related to
> faulty input, but still the effect of this line is great. Does someone
> see what is causing this, and should the BLAST parser maybe be
> slightly more relaxed wrt pre/appended text? I have not seen yet why
> this extra fastaheader line has such a "large" effect.
>
> A short example BLASTN output is attached.
> Example code is:
>
> use Bio::SearchIO;
> my $in = new Bio::SearchIO(-format => 'blast',
>                            -file   => 'apoe_plain.bls');
> while( my $result = $in->next_result ) {
>   print "Num of hits: ", $result->num_hits, "\n";
>   my @hits = $result->hits;
>   foreach my $el (@hits) {
>   	print $el->name, "\n";
>   }
>
>
> Kind regards,
> Bernd
> <apoe_plain.bls>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Tue Aug 21 23:03:55 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 21 Aug 2007 23:03:55 -0400
Subject: [Bioperl-l] subversion progress
In-Reply-To: <5C65BAED-61CF-4028-977E-0CD451FA2EC3@uiuc.edu>
References: <46CAA02F.60803@sheffield.ac.uk>
	<5C65BAED-61CF-4028-977E-0CD451FA2EC3@uiuc.edu>
Message-ID: <51A5996D-A976-47FD-8807-20F6EBAF9E42@gmx.net>


On Aug 21, 2007, at 10:40 AM, Chris Fields wrote:

> I think a time for the switchover just has to be set so that
> everybody is adequately forewarned, and the docs for getting started
> on SVN need to be updated accordingly.

That was my recollection too. -hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Wed Aug 22 03:51:42 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 22 Aug 2007 08:51:42 +0100
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>	<46C9C7F8.3020608@kirx.de>	<764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
	<764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>
Message-ID: <46CBEB0E.8030200@sendu.me.uk>

neeti somaiya wrote:
> Hi,
> 
> I wanted to automate my pdb script, right from downloading of data. As per
> the solution given by RCSB about custom report for pdb ids and titles only,
> I was trying something like the code below, but it doesnt seem to work :-
> 
> my $url = '
> http://www.pdb.org/pdb/results/tabularReport.do?reportTitle=CustomReport&customReportColumns=
> VStructureSummary.structureId~VCitation.title&format=csv';
> use LWP::Simple;
> my $content = get $url;
> die "Couldn't get $url" unless defined $content;
> 
> Can anyone tell how I can do it, if there is any other way to do it, or if I
> am going wrong somewhere, or if it is't possible for this case at all.

Use LWP::UserAgent so you can see what's going on.

my $ua = LWP::UserAgent->new;
$ua->timeout(10);
my $response = $ua->get($url);
if ($response->is_success) {
   print $response->content;
}
else {
   die $response->status_line;
}


Gives:
500 Internal Server Error

Most likely the server is expecting some kind of cookie and falls over 
when you try to visit that url without it. So start where they told you 
to and grab pages successively, keeping any cookies.


From neetisomaiya at gmail.com  Wed Aug 22 06:06:38 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Wed, 22 Aug 2007 15:36:38 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46CBEB0E.8030200@sendu.me.uk>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
	<764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>
	<46CBEB0E.8030200@sendu.me.uk>
Message-ID: <764978cf0708220306u77cedf22xdd132b324e306f33@mail.gmail.com>

Thanks a lot. It worked for me.

use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

my $ua = LWP::UserAgent->new;
$ua->cookie_jar(HTTP::Cookies->new(file => "lwpcookies.txt",
                                   autosave => 1));

# Visit the pages in the same order a browser would, so that the cookies
# are set before the report itself is requested.
my @urls = (
    'http://www.pdb.org/pdb/search/smartSubquery.do?smartSearchSubtype=HoldingsQuery&moleculeType=ignore&experimentalMethod=ignore',
    'http://www.pdb.org/pdb/results/tabularForm.do',
    'http://www.pdb.org/pdb/results/tabularReport.do?reportTitle=CustomReport&customReportColumns=VStructureSummary.structureId~VCitation.title&format=csv',
);

my $response;
foreach my $url (@urls) {
    $response = $ua->get($url);
    die $response->status_line unless $response->is_success;
    print "\nSuccessfully connected to url $url\n";
}

# The last response holds the CSV report.
open(my $fh, '>', 'tabularResults.csv')
    or die "Cannot write tabularResults.csv: $!";
print $fh $response->content;
close($fh);


On 8/22/07, Sendu Bala <bix at sendu.me.uk> wrote:
>
> neeti somaiya wrote:
> > Hi,
> >
> > I wanted to automate my pdb script, right from downloading of data. As
> per
> > the solution given by RCSB about custom report for pdb ids and titles
> only,
> > I was trying something like the code below, but it doesnt seem to work
> :-
> >
> > my $url = '
> >
> http://www.pdb.org/pdb/results/tabularReport.do?reportTitle=CustomReport&customReportColumns=
> > VStructureSummary.structureId~VCitation.title&format=csv';
> > use LWP::Simple;
> > my $content = get $url;
> > die "Couldn't get $url" unless defined $content;
> >
> > Can anyone tell how I can do it, if there is any other way to do it, or
> if I
> > am going wrong somewhere, or if it is't possible for this case at all.
>
> Use LWP::UserAgent so you can see what's going on.
>
> my $ua = LWP::UserAgent->new;
> $ua->timeout(10);
> my $response = $ua->get($url);
> if ($response->is_success) {
>    print $response->content;
> }
> else {
>    die $response->status_line;
> }
>
>
> Gives:
> 500 Internal Server Error
>
> Most likely the server is expecting some kind of cookie and falls over
> when you try to visit that url without it. So start where they told you
> to and grab pages successively, keeping any cookies.
>


-- 
-Neeti
Even my blood says, B positive


From jay at jays.net  Wed Aug 22 08:54:29 2007
From: jay at jays.net (Jay Hannah)
Date: Wed, 22 Aug 2007 07:54:29 -0500
Subject: [Bioperl-l] wiki: Current Events
Message-ID: <24715480-EC15-493F-85C9-C367348E28F1@jays.net>

http://www.bioperl.org/wiki/Main_Page

Please change:

< BOSC 2007 will be held July 19-20, 2007
 > BOSC 2007 was held July 19-20, 2007

I'd change it but the page is locked. Even when I'm logged in.   :)

Thanks,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From cjfields at uiuc.edu  Wed Aug 22 09:58:32 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 22 Aug 2007 08:58:32 -0500
Subject: [Bioperl-l] wiki: Current Events
In-Reply-To: <24715480-EC15-493F-85C9-C367348E28F1@jays.net>
References: <24715480-EC15-493F-85C9-C367348E28F1@jays.net>
Message-ID: <A7C5314E-662C-4160-85B1-0225B95C0BD2@uiuc.edu>

Done.

chris

On Aug 22, 2007, at 7:54 AM, Jay Hannah wrote:

> http://www.bioperl.org/wiki/Main_Page
>
> Please change:
>
> < BOSC 2007 will be held July 19-20, 2007
>> BOSC 2007 was held July 19-20, 2007
>
> I'd change it but the page is locked. Even when I'm logged in.   :)
>
> Thanks,
>
> Jay Hannah
> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From shameer at ncbs.res.in  Wed Aug 22 15:45:42 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Thu, 23 Aug 2007 01:15:42 +0530 (IST)
Subject: [Bioperl-l] How to 'force' Bio::Graphics to draw image according to
 input file ?
In-Reply-To: <A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
	<A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
Message-ID: <44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>

Dear All,

Is there any option in Bio::Graphics to draw the image based on the hits as
listed in the hits file?

For example I am using an input file:
# hit   score   start   end
Query   0       1       101
Sequence_Segment_1      0       1       101
PD:LRR_1|CS:AAC34139        0.16        1        23
PD:LRR_1|CS:AAC34139        3.6        1        22
PD:LRR_1|CS:AAC34139        1.8        1        22
PD:LRR_1|CS:AAC34139        1.3        1        22
PD:LRR_1|CS:XP_640228        2.5        2        23
..... Cropped
PD:LRR_1|CS:NP_611007        55        3        23
PD:LRR_1|CS:NP_611007        3.7        3        24
PD:LRR_1|CS:NP_611007        4.5        3        24
PD:LRR_1|CS:NP_611007        0.71        3        24
If you look at the image, you can see that it's all jumbled up and
doesn't make any sense at first glance. I am looking for an option to
draw each glyph on its own line (say, separated by \n), rather than
letting Bio::Graphics arrange them internally.

PS. Image is attached with this mail.
I am using  Dr. L. Stein's example :

use strict;
use Bio::Graphics;
use Bio::SeqFeature::Generic;
my $panel = Bio::Graphics::Panel->new(-length => 700,
                                      -width  => 800,
                                      -pad_left => 10,
                                      -pad_right => 10,
                                     );

my $full_length = Bio::SeqFeature::Generic->new(-start=>1,-end=>700);
$panel->add_track($full_length,
                  -glyph   => 'arrow',
                  -tick    => 2,
                  -fgcolor => 'black',
                  -double  => 1,
                 );

my $track = $panel->add_track(
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.png
Type: image/png
Size: 27974 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070823/be285f43/attachment-0003.png>

From cjfields at uiuc.edu  Thu Aug 23 00:53:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 22 Aug 2007 23:53:55 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
Message-ID: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>

As many of the devs know, there are a number of Feature/Annotation  
issues that need to be resolved prior to a 1.6 release:

http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.2FAnnotation_changes:_Keep_or_roll_back.3F

There has been little work done over the last 2 1/2 years to undo or  
rectify problems associated with those additions; I feel like those  
of us still routinely contributing have been left holding the bag.   
There has also been very little attempt to document any of this  
adequately enough; as an example see POD for  
Bio::SeqFeature::Annotated (what little there is).

I would like to suggest the radical idea of rolling back AnnotatableI/ 
SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags  
are simple scalars) and possibly work in implementing Ewan's  
SeqFeature::TypedSeqFeatureI for those who want strong data types  
(i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various  
AnnotatableI changes, odd inheritance, and operator overloading have  
really obfuscated the code to the point where no one wants to touch  
it in case it breaks something important.  However, I believe it is  
the one serious impediment to a new stable release.

My thought is we simplify all the relevant interfaces, essentially  
reverting back to rel 1.4.  For instance, we move the various  
Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.   
Bio::SeqFeature::Annotated would implement Bio::AnnotatableI  
directly, and (if needed) also implement  
Bio::SeqFeature::TypedSeqFeatureI, so the onus is on  
Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI  
methods correctly, just as any other class would when implementing an  
abstract interface.  I have played around with this a bit and managed  
to get most tests working again for Bio::SeqFeature::Generic and  
FeatureIO but a number of others break.

If needed I can try this out on a branch (a bit ironic, since the  
changes instigating this mess should have been tested on a branch!).   
Maybe this will get the ball rolling towards a 1.6 release.  Any  
thoughts?

chris


From shameer at ncbs.res.in  Thu Aug 23 03:06:34 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Thu, 23 Aug 2007 12:36:34 +0530 (IST)
Subject: [Bioperl-l] How to 'force' Bio::Graphics to draw image
 according to input file ?
In-Reply-To: <44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
	<A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
	<44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>
Message-ID: <34980.192.168.1.1.1187852794.squirrel@mail.ncbs.res.in>

Dear All,

I will make my question simple :
Is there any way to force the 'Bio::graphics' module to print only one
glyph in a track ?

PS. A more detailed explanation is in my earlier mail (I don't want to spam
the community with the same mail again).

Eagerly waiting for a reply.
Thanks,
-- 
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From cain.cshl at gmail.com  Thu Aug 23 04:54:40 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 23 Aug 2007 04:54:40 -0400
Subject: [Bioperl-l] How to 'force' Bio::Graphics to draw
	image	according to input file ?
In-Reply-To: <34980.192.168.1.1.1187852794.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
	<A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
	<44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>
	<34980.192.168.1.1.1187852794.squirrel@mail.ncbs.res.in>
Message-ID: <1187859296.2546.6.camel@103.48.216.10.in-addr.arpa>

Shameer,

I don't think that's really what you want.  It seems to me that sorting
them in some useful way (say, by score) would make more sense.  There is
an example using the -sort_order option in Lincoln's howto.
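
As a rough illustration of the idea (independent of Bio::Graphics itself),
the hit lines can be parsed and ordered by score before any features are
built from them. The function name sorted_hits is hypothetical, and sorting
in ascending order with ties broken by start position is only an assumption
about what would be useful here.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "hit score start end" lines (the layout of the
# posted input file), skipping comments and incomplete lines, then order
# the hits by score.
sub sorted_hits {
    my @hits;
    for my $line (@_) {
        next if $line =~ /^\s*(?:#|$)/;
        my ($name, $score, $start, $end) = split ' ', $line;
        next unless defined $end;    # skips lines such as "..... Cropped"
        push @hits, { name => $name, score => $score,
                      start => $start, end => $end };
    }
    # Ascending score, ties broken by start; both choices are assumptions.
    my @sorted = sort { $a->{score} <=> $b->{score}
                        or $a->{start} <=> $b->{start} } @hits;
    return @sorted;
}

my @hits = sorted_hits(
    "# hit   score   start   end\n",
    "PD:LRR_1|CS:AAC34139        3.6        1        22\n",
    "PD:LRR_1|CS:AAC34139        0.16       1        23\n",
    "PD:LRR_1|CS:XP_640228       2.5        2        23\n",
);
print join("\t", @{$_}{qw(name score start end)}), "\n" for @hits;
```

Within a track, the -sort_order option does the ordering for you, so a
pre-sort like this is mainly useful for checking what order you expect to
see.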

Scott


On Thu, 2007-08-23 at 12:36 +0530, Shameer Khadar wrote:
> Dear All,
> 
> I will make my question simple :
> Is there any way to force the 'Bio::graphics' module to print only one
> glyph in a track ?
> 
> PS. More Detailed explanation is in my earlier mail (Dont want to spam the
> community with my same mail)
> 
> Eagerly waiting for a reply.
> Thanks,
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From cjfields at uiuc.edu  Thu Aug 23 10:14:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 23 Aug 2007 09:14:51 -0500
Subject: [Bioperl-l] extra rel. 1.6 suggestion
Message-ID: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>

Some interesting points by Sendu:

http://www.bioperl.org/wiki/Release_Schedule#Need_tests

which I agree with completely.

Maybe the best way out of this is a variation on something that was  
suggested before, which was 'splitting' the code into groups.  What  
if we set up a way to automatically gauge test coverage,  
documentation, etc.?  If I remember correctly Nathan had something  
running at one point which did this.

If so, we could determine which code is potentially 'non-compliant'  
and needs to be fixed (tests added, docs brought up to spec, so on),  
and thus prioritize at the minimum what needs to be done for a 1.6  
release.  If it's deemed not worth worrying about (no active  
development, author is out of contact, we have more important  
priorities) we split that code off into a separate 'dev' package.   
That would save some of the headache of trying to split maintenance  
of ~1000 modules up on only a few devs.

Thoughts?

chris


From bix at sendu.me.uk  Thu Aug 23 10:57:21 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 23 Aug 2007 15:57:21 +0100
Subject: [Bioperl-l] extra rel. 1.6 suggestion
In-Reply-To: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>
References: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>
Message-ID: <46CDA051.40408@sendu.me.uk>

Chris Fields wrote:
> Maybe the best way out if this is a variation on something that was  
> suggested before, which was 'splitting' the code into groups.  What  
> if we set up a way to automatically gauge test coverage,  
> documentation, etc.?  If I remember correctly Nathan had something  
> running at one point which did this.

You can generate this yourself by doing
./Build testcover

Mauricio was going to sort out having this run daily with the results 
displayed on the website... Mauricio?

The major 'annoyance' is that the coverage results don't get generated 
if any test fails. But they shouldn't be failing anyway ;)


From cain.cshl at gmail.com  Thu Aug 23 15:53:37 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 23 Aug 2007 15:53:37 -0400
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
Message-ID: <1187898817.2562.19.camel@localhost.localdomain>

Hi Chris,

GBrowse would be unaffected by this as it doesn't use
Bio::SeqFeature::Annotated.  The GMOD GFF3 Chado loader on the other
hand will almost certainly break horribly, as it depends on the strong
typing of Bio::FeatureIO/Bio::SeqFeature::Annotated.  If you could try
your ideas out in a branch that I could checkout and test on, that would
be good.

Thanks,
Scott


On Wed, 2007-08-22 at 23:53 -0500, Chris Fields wrote:
> As many of the devs know, there are a number of Feature/Annotation  
> issues that need to be resolved prior to a 1.6 release:
> 
> http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.2FAnnotation_changes:_Keep_or_roll_back.3F
> 
> There has been little work done over the last 2 1/2 years to undo or  
> rectify problems associated with those additions; I feel like those  
> of us still routinely contributing have been left holding the bag.   
> There has also been very little attempt to document any of this  
> adequately enough; as an example see POD for  
> Bio::SeqFeature::Annotated (what little there is).
> 
> I would like to suggest the radical idea of rolling back AnnotatableI/ 
> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags  
> are simple scalars) and possibly work in implementing Ewan's  
> SeqFeature::TypedSeqFeatureI for those who want strong data types  
> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various  
> AnnotatableI changes, odd inheritance, and operator overloading have  
> really obfuscated the code to the point where no one wants to touch  
> it in case it breaks something important.  However, I believe it is  
> the one serious impediment to a new stable release.
> 
> My thought is we simplify all the relevant interfaces, essentially  
> reverting back to rel 1.4.  For instance, we move the various  
> Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.   
> Bio::SeqFeature::Annotated would implement Bio::AnnotatableI  
> directly, and (if needed) also implement  
> Bio::SeqFeature::TypedSeqFeatureI, so the impetus is on  
> Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI  
> methods correctly, just as any other class would when implementing an  
> abstract interface.  I have played around with this a bit and managed  
> to get most tests working again for Bio::SeqFeature::Generic and  
> FeatureIO but a number of others break.
> 
> If needed I can try this out on a branch (a bit ironic, since the  
> changes instigating this mess should have been tested on a branch!).   
> Maybe this will get the ball rolling towards a 1.6 release.  Any  
> thoughts?
> 
> chris
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From N.Haigh at sheffield.ac.uk  Thu Aug 23 16:32:12 2007
From: N.Haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 23 Aug 2007 21:32:12 +0100
Subject: [Bioperl-l] extra rel. 1.6 suggestion
In-Reply-To: <46CDA051.40408@sendu.me.uk>
References: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>
	<46CDA051.40408@sendu.me.uk>
Message-ID: <1187901132.46cdeeccce68d@webmail.shef.ac.uk>

Quoting Sendu Bala <bix at sendu.me.uk>:

> Chris Fields wrote:
> > Maybe the best way out if this is a variation on something that was  
> > suggested before, which was 'splitting' the code into groups.  What  
> > if we set up a way to automatically gauge test coverage,  
> > documentation, etc.?  If I remember correctly Nathan had something  
> > running at one point which did this.
> 
> You can generate this yourself by doing
> ./Build testcover

What I did was to patch Devel::Cover to include JavaScript that allows sorting
of the results by clicking a header in the table. This way, it was easier to
find those modules with poor POD coverage, or poor coverage on any other
metric. The developer(s) of Devel::Cover are introducing this into their next
release, but who knows when that release will be. I could provide a diff, but
we may be able to check out Devel::Cover from cvs/svn until 0.62 is released.

> 
> Mauricio was going to sort out having this run daily with the results 
> displayed on the website... Mauricio?
> 
> The major 'annoyance' is that the coverage results don't get generated 
> if any test fails. But they shouldn't be failing anyway ;)


From cjfields at uiuc.edu  Thu Aug 23 17:33:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 23 Aug 2007 16:33:25 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <1187898817.2562.19.camel@localhost.localdomain>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
	<1187898817.2562.19.camel@localhost.localdomain>
Message-ID: <38B989E4-34CA-42CD-A608-9D2A095E7ADF@uiuc.edu>

Scott,

So far most of FeatureIO.t passes, with only a few exceptions dealing  
with the from_feature method (I know what the problem is there).  A  
large number of other tests crash horribly (not so surprising), so  
I'll have to go through those.  Ergo any changes and testing will  
definitely be conducted on a branch then merged back to main trunk  
once everything is okay.  I'll probably start a branch in the next  
few days or so.

Here's what I have been working on so far, which I think is reasonable:

1) Move all *_tag_* related methods out of Bio::AnnotatableI and into  
Bio::SeqFeature::Annotated.

2) Reinstate the same tag methods in Bio::SeqFeatureI and remove  
Bio::AnnotatableI from the inheritance tree.

3) Make Bio::SeqFeature::Annotated a Bio::AnnotatableI (which it  
already was, strangely enough).  Now it simply implements the proper  
methods from the interface classes SeqFeatureI and AnnotatableI.

4) Revert Bio::SeqFeature::Generic tags back to simple untyped  
strings (reimplement the 1.4 rel methods).

I'm interested in seeing whether this results in a significant  
performance increase in SeqIO since the Annotation instantiation is  
removed.
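
To make "simple untyped strings" concrete, here is a toy sketch. It is
not BioPerl code: ToyFeature is a hypothetical stand-in class, and only
the method names (add_tag_value, has_tag, get_tag_values) are borrowed
from the SeqFeatureI tag interface. The point is that each tag value is
kept as a plain scalar, so no annotation object is built per tag.

```perl
package ToyFeature;   # hypothetical stand-in, not a BioPerl class
use strict;
use warnings;

sub new {
    my ($class) = @_;
    return bless { tags => {} }, $class;
}

# Store the value as-is: a plain string, no annotation object is created.
sub add_tag_value {
    my ($self, $tag, $value) = @_;
    push @{ $self->{tags}{$tag} }, $value;
    return;
}

sub has_tag {
    my ($self, $tag) = @_;
    return exists $self->{tags}{$tag} ? 1 : 0;
}

# Returns an empty list for an unknown tag (a simplification of this toy).
sub get_tag_values {
    my ($self, $tag) = @_;
    return @{ $self->{tags}{$tag} || [] };
}

package main;

my $feat = ToyFeature->new;
$feat->add_tag_value(note => 'hypothetical protein');
$feat->add_tag_value(note => 'low confidence');
print join('; ', $feat->get_tag_values('note')), "\n";
```

Whether dropping the per-tag objects really speeds up parsing is exactly
what would need to be measured on the branch.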

ToDo: I plan on removing the operator overloading in Bio::Annotation,  
which was a serious sticking point with most of the devs.  This won't  
be done until after tests pass for everything else.

What we will need at some point which I can't provide:  
Bio::SeqFeature::Annotated has no docs (no synopsis, no  
description).  The reason I bring this up is Sendu and I are  
seriously considering running automated code audits in order to  
gauge which modules lack docs, test coverage, etc.  We're likely  
splitting those without adequate test/doc coverage off into a  
separate 'dev' release.
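
As a rough idea of what the documentation half of such an audit could
look like, here is a minimal sketch. It is only an illustration, not an
agreed tool: the helper name missing_pod_sections and the default list
of required sections (NAME, SYNOPSIS, DESCRIPTION) are assumptions made
for this example. It scans a module's source for top-level POD headings
and reports the required ones that are absent.

```perl
use strict;
use warnings;

# Return the required =head1 sections that are absent from a module's source.
sub missing_pod_sections {
    my ($source, @required) = @_;
    @required = ('NAME', 'SYNOPSIS', 'DESCRIPTION') unless @required;
    my %seen;
    # Collect every top-level POD heading, e.g. "=head1 SYNOPSIS".
    while ($source =~ /^=head1\s+(.+?)\s*$/mg) {
        $seen{ uc $1 } = 1;
    }
    return grep { !$seen{$_} } @required;
}

# Audit every file named on the command line.
for my $file (@ARGV) {
    open my $fh, '<', $file or do { warn "Cannot read $file: $!\n"; next };
    my $source = do { local $/; <$fh> };
    close $fh;
    my @missing = missing_pod_sections($source);
    print "$file: missing @missing\n" if @missing;
}
```

It would be run over the module files of interest, for example with the
paths of all .pm files under Bio/ given as arguments. Test coverage would
still have to come from Devel::Cover.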

chris

On Aug 23, 2007, at 2:53 PM, Scott Cain wrote:

> Hi Chris,
>
> GBrowse would be unaffected by this as it doesn't use
> Bio::SeqFeature::Annotated.  The GMOD GFF3 Chado loader on the other
> hand will almost certainly break horribly, as it depends on the strong
> typing of Bio::FeatureIO/Bio::SeqFeature::Annotated.  If you could try
> your ideas out in a branch that I could checkout and test on, that  
> would
> be good.
>
> Thanks,
> Scott
>
>
> On Wed, 2007-08-22 at 23:53 -0500, Chris Fields wrote:
>> As many of the devs know, there are a number of Feature/Annotation
>> issues that need to be resolved prior to a 1.6 release:
>>
>> http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.
>> 2FAnnotation_changes:_Keep_or_roll_back.3F
>>
>> There has been little work done over the last 2 1/2 years to undo or
>> rectify problems associated with those additions; I feel like those
>> of us still routinely contributing have been left holding the bag.
>> There has also been very little attempt to document any of this
>> adequately enough; as an example see POD for
>> Bio::SeqFeature::Annotated (what little there is).
>>
>> I would like to suggest the radical idea of rolling back  
>> AnnotatableI/
>> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
>> are simple scalars) and possibly work in implementing Ewan's
>> SeqFeature::TypedSeqFeatureI for those who want strong data types
>> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various
>> AnnotatableI changes, odd inheritance, and operator overloading have
>> really obfuscated the code to the point where no one wants to touch
>> it in case it breaks something important.  However, I believe it is
>> the one serious impediment to a new stable release.
>>
>> My thought is we simplify all the relevant interfaces, essentially
>> reverting back to rel 1.4.  For instance, we move the various
>> Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.
>> Bio::SeqFeature::Annotated would implement Bio::AnnotatableI
>> directly, and (if needed) also implement
>> Bio::SeqFeature::TypedSeqFeatureI, so the impetus is on
>> Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI
>> methods correctly, just as any other class would when implementing an
>> abstract interface.  I have played around with this a bit and managed
>> to get most tests working again for Bio::SeqFeature::Generic and
>> FeatureIO but a number of others break.
>>
>> If needed I can try this out on a branch (a bit ironic, since the
>> changes instigating this mess should have been tested on a branch!).
>> Maybe this will get the ball rolling towards a 1.6 release.  Any
>> thoughts?
>>
>> chris
>>
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                         cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From smarkel at accelrys.com  Thu Aug 23 17:59:37 2007
From: smarkel at accelrys.com (Scott Markel)
Date: Thu, 23 Aug 2007 14:59:37 -0700
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <38B989E4-34CA-42CD-A608-9D2A095E7ADF@uiuc.edu>
Message-ID: <OF1E1ED913.3FB67C57-ON88257340.00785855-88257340.0078D192@accelrys.com>

Chris,

Pipeline Pilot's Sequence Analysis Collection wraps BioPerl.
Once you think the branch changes have converged a bit we'd
be happy to try running our regression suite and report what
we find.

Scott

Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel at accelrys.com
Accelrys, Inc.                      mobile: +1 858 205 3653
10188 Telesis Court, Suite 100      voice:  +1 858 799 5603
San Diego, CA 92121                 fax:    +1 858 799 5222
USA                                 web:    http://www.accelrys.com


bioperl-l-bounces at lists.open-bio.org wrote on 23.08.2007 14:33:25:

> Scott,
> 
> So far most of FeatureIO.t passes, with only a few exceptions dealing 
> with the from_feature method (I know what the problem is there).  A 
> large number of other tests crash horribly (not so surprising), so 
> I'll have to go through those.  Ergo any changes and testing will 
> definitely be conducted on a branch then merged back to main trunk 
> once everything is okay.  I'll probably start a branch in the next 
> few days or so.
> 
> Here's what I have been working on so far, which I think is reasonable:
> 
> 1) Move all *_tag_* related methods out of Bio::AnnotatableI and into 
> Bio::SeqFeature::Annotatable.
> 
> 2) Reinstate the same tag methods in Bio::SeqFeatureI and remove 
> Bio::AnnotatableI from the inheritance tree.
> 
> 3) Make Bio::SeqFeature::Annotatable Bio::AnnotatableI (which it 
> already was, strangely enough).  Now it simple implements the proper 
> methods from the interface classes SeqFeatureI and AnnotatableI.
> 
> 4) Revert Bio::SeqFeature::Generic tags back to simple untyped 
> strings (reimplement the 1.4 rel methods).
> 
> I'm interested in seeing whether this results in a significant 
> performance increase in SeqIO since the Annotation instantiation is 
> removed.
> 
> ToDo: I plan on removing the operator overloading in Bio::Annotation, 
> which was a serious sticking point with most of the devs.  This won't 
> be done until after tests pass for everything else.
> 
> What we will need at some point which I can't provide: 
> Bio::SeqFeature::Annotated has no docs (no synopsis, no 
> description).  The reason I bring this up is Sendu and I are 
> seriously considering running an automated code audits in order to 
> gauge which modules lack docs, test coverage, etc..  We're likely 
> splitting those without adequate test/doc coverage off into a 
> separate 'dev' release.
> 
> chris
> 
> On Aug 23, 2007, at 2:53 PM, Scott Cain wrote:
> 
> > Hi Chris,
> >
> > GBrowse would be unaffected by this as it doesn't use
> > Bio::SeqFeature::Annotated.  The GMOD GFF3 Chado loader on the other
> > hand will almost certainly break horribly, as it depends on the strong
> > typing of Bio::FeatureIO/Bio::SeqFeature::Annotated.  If you could try
> > your ideas out in a branch that I could checkout and test on, that 
> > would
> > be good.
> >
> > Thanks,
> > Scott
> >
> >
> > On Wed, 2007-08-22 at 23:53 -0500, Chris Fields wrote:
> >> As many of the devs know, there are a number of Feature/Annotation
> >> issues that need to be resolved prior to a 1.6 release:
> >>
> >> http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.
> >> 2FAnnotation_changes:_Keep_or_roll_back.3F
> >>
> >> There has been little work done over the last 2 1/2 years to undo or
> >> rectify problems associated with those additions; I feel like those
> >> of us still routinely contributing have been left holding the bag.
> >> There has also been very little attempt to document any of this
> >> adequately enough; as an example see POD for
> >> Bio::SeqFeature::Annotated (what little there is).
> >>
> >> I would like to suggest the radical idea of rolling back 
> >> AnnotatableI/
> >> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
> >> are simple scalars) and possibly work in implementing Ewan's
> >> SeqFeature::TypedSeqFeatureI for those who want strong data types
> >> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various
> >> AnnotatableI changes, odd inheritance, and operator overloading have
> >> really obfuscated the code to the point where no one wants to touch
> >> it in case it breaks something important.  However, I believe it is
> >> the one serious impediment to a new stable release.
> >>
> >> My thought is we simplify all the relevant interfaces, essentially
> >> reverting back to rel 1.4.  For instance, we move the various
> >> Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.
> >> Bio::SeqFeature::Annotated would implement Bio::AnnotatableI
> >> directly, and (if needed) also implement
> >> Bio::SeqFeature::TypedSeqFeatureI, so the impetus is on
> >> Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI
> >> methods correctly, just as any other class would when implementing an
> >> abstract interface.  I have played around with this a bit and managed
> >> to get most tests working again for Bio::SeqFeature::Generic and
> >> FeatureIO but a number of others break.
> >>
> >> If needed I can try this out on a branch (a bit ironic, since the
> >> changes instigating this mess should have been tested on a branch!).
> >> Maybe this will get the ball rolling towards a 1.6 release.  Any
> >> thoughts?
> >>
> >> chris
> >>
> > -- 
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                         cain at cshl.edu
> > GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> > Cold Spring Harbor Laboratory
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 


From cjfields at uiuc.edu  Thu Aug 23 20:39:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 23 Aug 2007 19:39:30 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <OF1E1ED913.3FB67C57-ON88257340.00785855-88257340.0078D192@accelrys.com>
References: <OF1E1ED913.3FB67C57-ON88257340.00785855-88257340.0078D192@accelrys.com>
Message-ID: <241563BB-F96A-4631-B504-F73699FDE84B@uiuc.edu>

Having an independent test would be great!  The reason I suggest  
there may be a speedup: one complaint popping up after 1.5 was the  
slowdown in sequence parsing, which could be related to the 'heavier'  
objectified tags.

chris

On Aug 23, 2007, at 4:59 PM, Scott Markel wrote:

> Chris,
>
> Pipeline Pilot's Sequence Analysis Collection wraps BioPerl.
> Once you think the branch changes have converged a bit we'd
> be happy to try running our regression suite and report what
> we find.
>
> Scott
>
> Scott Markel, Ph.D.
> Principal Bioinformatics Architect  email:  smarkel at accelrys.com
> Accelrys, Inc.                      mobile: +1 858 205 3653
> 10188 Telesis Court, Suite 100      voice:  +1 858 799 5603
> San Diego, CA 92121                 fax:    +1 858 799 5222
> USA                                 web:    http://www.accelrys.com
>
>
> bioperl-l-bounces at lists.open-bio.org wrote on 23.08.2007 14:33:25:
>
>> Scott,
>>
>> So far most of FeatureIO.t passes, with only a few exceptions dealing
>> with the from_feature method (I know what the problem is there).  A
>> large number of other tests crash horribly (not so surprising), so
>> I'll have to go through those.  Ergo any changes and testing will
>> definitely be conducted on a branch then merged back to main trunk
>> once everything is okay.  I'll probably start a branch in the next
>> few days or so.
>>
>> Here's what I have been working on so far, which I think is  
>> reasonable:
>>
>> 1) Move all *_tag_* related methods out of Bio::AnnotatableI and into
>> Bio::SeqFeature::Annotatable.
>>
>> 2) Reinstate the same tag methods in Bio::SeqFeatureI and remove
>> Bio::AnnotatableI from the inheritance tree.
>>
>> 3) Make Bio::SeqFeature::Annotatable Bio::AnnotatableI (which it
>> already was, strangely enough).  Now it simple implements the proper
>> methods from the interface classes SeqFeatureI and AnnotatableI.
>>
>> 4) Revert Bio::SeqFeature::Generic tags back to simple untyped
>> strings (reimplement the 1.4 rel methods).
>>
>> I'm interested in seeing whether this results in a significant
>> performance increase in SeqIO since the Annotation instantiation is
>> removed.
>>
>> ToDo: I plan on removing the operator overloading in Bio::Annotation,
>> which was a serious sticking point with most of the devs.  This won't
>> be done until after tests pass for everything else.
>>
>> What we will need at some point which I can't provide:
>> Bio::SeqFeature::Annotated has no docs (no synopsis, no
>> description).  The reason I bring this up is Sendu and I are
>> seriously considering running an automated code audits in order to
>> gauge which modules lack docs, test coverage, etc..  We're likely
>> splitting those without adequate test/doc coverage off into a
>> separate 'dev' release.
>>
>> chris
>>
>> On Aug 23, 2007, at 2:53 PM, Scott Cain wrote:
>>
>>> Hi Chris,
>>>
>>> GBrowse would be unaffected by this as it doesn't use
>>> Bio::SeqFeature::Annotated.  The GMOD GFF3 Chado loader on the other
>>> hand will almost certainly break horribly, as it depends on the  
>>> strong
>>> typing of Bio::FeatureIO/Bio::SeqFeature::Annotated.  If you  
>>> could try
>>> your ideas out in a branch that I could checkout and test on, that
>>> would
>>> be good.
>>>
>>> Thanks,
>>> Scott
>>>
>>>
>>> On Wed, 2007-08-22 at 23:53 -0500, Chris Fields wrote:
>>>> As many of the devs know, there are a number of Feature/Annotation
>>>> issues that need to be resolved prior to a 1.6 release:
>>>>
>>>> http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.
>>>> 2FAnnotation_changes:_Keep_or_roll_back.3F
>>>>
>>>> There has been little work done over the last 2 1/2 years to  
>>>> undo or
>>>> rectify problems associated with those additions; I feel like those
>>>> of us still routinely contributing have been left holding the bag.
>>>> There has also been very little attempt to document any of this
>>>> adequately enough; as an example see POD for
>>>> Bio::SeqFeature::Annotated (what little there is).
>>>>
>>>> I would like to suggest the radical idea of rolling back
>>>> AnnotatableI/
>>>> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
>>>> are simple scalars) and possibly work in implementing Ewan's
>>>> SeqFeature::TypedSeqFeatureI for those who want strong data types
>>>> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various
>>>> AnnotatableI changes, odd inheritance, and operator overloading  
>>>> have
>>>> really obfuscated the code to the point where no one wants to touch
>>>> it in case it breaks something important.  However, I believe it is
>>>> the one serious impediment to a new stable release.
>>>>
>>>> My thought is we simplify all the relevant interfaces, essentially
>>>> reverting back to rel 1.4.  For instance, we move the various
>>>> Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.
>>>> Bio::SeqFeature::Annotated would implement Bio::AnnotatableI
>>>> directly, and (if needed) also implement
>>>> Bio::SeqFeature::TypedSeqFeatureI, so the impetus is on
>>>> Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI
>>>> methods correctly, just as any other class would when  
>>>> implementing an
>>>> abstract interface.  I have played around with this a bit and  
>>>> managed
>>>> to get most tests working again for Bio::SeqFeature::Generic and
>>>> FeatureIO but a number of others break.
>>>>
>>>> If needed I can try this out on a branch (a bit ironic, since the
>>>> changes instigating this mess should have been tested on a  
>>>> branch!).
>>>> Maybe this will get the ball rolling towards a 1.6 release.  Any
>>>> thoughts?
>>>>
>>>> chris
>>>>
>>> -- 
>>> ------------------------------------------------------------------------
>>> Scott Cain, Ph. D.                                         cain at cshl.edu
>>> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
>>> Cold Spring Harbor Laboratory
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Thu Aug 23 23:34:12 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 23 Aug 2007 23:34:12 -0400
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
Message-ID: <CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>


On Aug 23, 2007, at 12:53 AM, Chris Fields wrote:

> There has been little work done over the last 2 1/2 years to undo or
> rectify problems associated with those additions; I feel like those
> of us still routinely contributing have been left holding the bag.

Not by intention, but unfortunately that's probably a fair  
assessment. (And I'm one of those guilty of inaction.)

> [...]
> I would like to suggest the radical idea of rolling back AnnotatableI/
> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
> are simple scalars) and possibly work in implementing Ewan's
> SeqFeature::TypedSeqFeatureI for those who want strong data types
> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).

I fully support this; to me that sounds exactly like the way to go.

> The various AnnotatableI changes, odd inheritance, and operator  
> overloading have
> really obfuscated the code to the point where no one wants to touch
> it in case it breaks something important.  However, I believe it is
> the one serious impediment to a new stable release.

Yes, I think you're hitting the nail on the head.

Chris, if you take the lead on this and carry it through we will all  
owe you hugely. I'm not sure how many beers that would compare to,  
but I'll throw in some. (Who else do I owe beer? I'm losing track.  
Strangely nobody tried to redeem beer from me in Vienna. Maybe in  
Toronto?)

Seriously, rectifying this problem would lift a huge weight.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From florent.angly at gmail.com  Fri Aug 24 00:43:23 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Thu, 23 Aug 2007 21:43:23 -0700
Subject: [Bioperl-l] Is it possible to do contig alignments?
Message-ID: <46CE61EB.5000300@gmail.com>

Dear list members,

I would like to "produce" an alignment of a contig, or more exactly 
visualize it in such a fashion based on the aligned sequences provided 
to me by a sequence assembler:

Consensus: ACGTACGTTG
Sequence1: ACG-AC
Sequence2:  CGTACGT
Sequence3:     AC-TTG

It sounds like a very trivial task but after searching for a long time, 
it seems impossible using the methods BioPerl provides.

Using the Bio::Align classes, it seems like the only way is if the 
sequences have the same aligned length, i.e. like this:

Consensus: ACGTACGTTG
Sequence1: ACG-AC----
Sequence2: -CGTACGT--
Sequence3: ----AC-TTG

It's not very satisfactory if I have to pad the sequences with gaps 
manually. In the context of a phylogenetic alignment, it might make 
sense, but not for contigs.

For assemblies whole sequences are mapped on contigs. Bio::LocatableSeq 
does not help here because it defines locations _within_ the sequence 
(the name LocatableSeq was pretty misleading to me).

I think it's all very strange that contigs have the coordinates of the 
aligned sequences composing them but there is no straightforward way to 
exploit this information.

So what's the bottom line? Am I missing something obvious, an 
out-of-the-box solution? Is it a "missing feature" of BioPerl that is 
planned to be implemented in the future or that should be requested? 
Should I pad my sequences with dashes or spaces after assembly? Or is it 
expected that my aligned reads coming from my assembly be padded with 
lots of gaps at their beginning and end? What's the BioPerl philosophy here?

Thanks for giving me pointers,

Florent


From bix at sendu.me.uk  Fri Aug 24 04:35:23 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 24 Aug 2007 09:35:23 +0100
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CE61EB.5000300@gmail.com>
References: <46CE61EB.5000300@gmail.com>
Message-ID: <46CE984B.3060701@sendu.me.uk>

Florent Angly wrote:
> Dear list members,
> 
> I would like to "produce" an alignment of a contig, or more exactly 
> visualize it in such a fashion based on the aligned sequences provided 
> to me by a sequence assembler:
> 
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC
> Sequence2:  CGTACGT
> Sequence3:     AC-TTG
> 
> It sounds like a very trivial task but after searching for a long time, 
> it seems impossible using the methods BioPerl provides.

Isn't Bio::Assembly::Contig what you need?

http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Assembly/Contig.html


From zhaodj at ioz.ac.cn  Fri Aug 24 05:34:07 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Fri, 24 Aug 2007 17:34:07 +0800 (CST)
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CE61EB.5000300@gmail.com>
References: <46CE61EB.5000300@gmail.com>
Message-ID: <51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>

On Fri, Aug 24, 2007 12:43, Florent Angly wrote:
> Dear list members,
>
> I would like to "produce" an alignment of a contig, or more exactly
> visualize it in such a fashion based on the aligned sequences provided
> to me by a sequence assembler:
>
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC
> Sequence2:  CGTACGT
> Sequence3:     AC-TTG
>
> It sounds like a very trivial task but after searching for a long time,
> it seems impossible using the methods BioPerl provides.
>
> Using the Bio::Align classes, it seems like the only way is if the
> sequences have the same aligned length, i.e. like this:
>
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC----
> Sequence2: -CGTACGT--
> Sequence3: ----AC-TTG
>
> It's not very satisfactory if I have to pad the sequences with gaps
> manually. In the context of a phylogenetic alignment, it might make
> sense, but not for contigs.

How do you pad the sequences with gaps manually? Just replace the
hyphens with blanks? If yes, you can program in perl to automate
this process.

> For assemblies whole sequences are mapped on contigs. Bio::LocatableSeq
> does not help here because it defines locations _within_ the sequence
> (the name LocatableSeq was pretty misleading to me).
>
> I think it's all very strange that contigs have the coordinates of the
> aligned sequences composing them but there is no straightforward way to
> exploit this information.
>
> So what's the bottom line? Am I missing something obvious, an
> out-of-the-box solution? Is it a "missing feature" of BioPerl that is
> planned to be implemented in the future or that should be requested?
> Should I pad my sequences with dashes or spaces after assembly? Or is it
> expected that my aligned reads coming from my assembly be padded with
> lots of gaps at their beginning and end? What's the BioPerl philosophy here?
>
> Thanks for giving me pointers,
>
> Florent


-- 
De-Jian Zhao
Institute of Zoology,Chinese Academy of Sciences
+86-10-64807217
zhaodj at ioz.ac.cn


From marian.thieme at arcor.de  Fri Aug 24 06:05:55 2007
From: marian.thieme at arcor.de (Marian Thieme)
Date: Fri, 24 Aug 2007 12:05:55 +0200
Subject: [Bioperl-l] ReseqChip, module/package name
Message-ID: <46CEAD83.2050904@arcor.de>

Hi,

2 questions about the naming of the module I submitted
(see http://bugzilla.open-bio.org/show_bug.cgi?id=2332)

1.) The package:
because an expression package already exists, I suggest creating a
new package called resequencing

2.) I would suggest that the module is called RedundantFragments or
AdditionalFragments

so we would have something like:

Bio::Resequencing::AdditionalFragments

Any other ideas ?

Marian

By the way, can anybody please change my email address to marian.thieme at arcor.de
in bugzilla as well as in the bioperl list? I couldn't manage
that on my own...


From mcons004 at fiu.edu  Thu Aug 23 23:30:44 2007
From: mcons004 at fiu.edu (mcons004 at fiu.edu)
Date: Thu, 23 Aug 2007 23:30:44 -0400 (EDT)
Subject: [Bioperl-l] please some help
Message-ID: <20070823233044.BJQ45014@mailstore2.fiu.edu>

  Hello,
     I am new to this software and I am having some trouble starting. The version of Bioperl I am working on is v5.8.6. My OS is Unix (Mac OS X). I am trying to use Bioperl with a file called blastParser to process a file which is the output of a "blastall" operation.
  
 The command that gives me an error is:
> perl blastParser.pl junk.out 1 1 1.0
 and the error message says:
Can't locate Bio/SearchIO.pm in @INC (@INC contains: /System/Library/Perl/5.8.6/darwin-thread-multi-2level

 Your online info says this probably means that the module Bio::SearchIO.pm is not installed, and that I can either install Bundle::Bioperl or install that specific module by hand. Could you give me some tips on this? I am new to working with Unix and Bioperl, so I am a little confused. Any information will be helpful. Thanks


From bix at sendu.me.uk  Fri Aug 24 10:38:39 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 24 Aug 2007 15:38:39 +0100
Subject: [Bioperl-l] please some help
In-Reply-To: <20070823233044.BJQ45014@mailstore2.fiu.edu>
References: <20070823233044.BJQ45014@mailstore2.fiu.edu>
Message-ID: <46CEED6F.1080101@sendu.me.uk>

mcons004 at fiu.edu wrote:
> Hello, I am new to this software and I am having some trouble
> starting. The version of Bioperl I am working on is v5.8.6. My OS is
> Unix (Mac OS X). I am trying to use Bioperl with a file called
> blastParser to process a file which is the output of a "blastall"
> operation.
> 
> The code that gives me error is:
>> perl blastParser.pl junk.out 1 1 1.0
> and the error message says: Can't locate Bio/SearchIO.pm in @INC
> (@INC contains: /System/Library/Perl/5.8.6/darwin-thread-multi-2level
> 
> 
> You online info says I probably means that the module
> Bio::SearchIO.pm is not instaled and I can either install
> Bundle::Bioperl or install that specific module by hand. Could you
> give me some tips in this? I am new working with Unix, and Bioperl so
> I am a little confused.

You need to install Bioperl first. You can find instructions here:
http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix

If this is your own Mac (you have the root/admin password), when it 
tells you to run cpan (">perl -MCPAN -e shell" or ">cpan"), start the 
command with 'sudo'. So:

 >sudo cpan
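
Once the installation has finished, a quick way to confirm that Perl can
now find the module is a small check like the one below. This is only a
sketch: module_available is a helper name made up for this example, and
it relies on nothing beyond Perl's built-in require. If the module is
still missing, it lists the directories Perl searched, which is the same
@INC list shown in the error message.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded from the current @INC, else 0.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Bio::SearchIO -> Bio/SearchIO.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ('Bio::SearchIO') {
    if (module_available($module)) {
        print "$module is installed\n";
    }
    else {
        print "$module not found; Perl searched these directories:\n";
        print "  $_\n" for @INC;
    }
}
```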


From florent.angly at gmail.com  Fri Aug 24 12:07:04 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Fri, 24 Aug 2007 09:07:04 -0700
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
References: <46CE61EB.5000300@gmail.com>
	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
Message-ID: <46CF0228.2000404@gmail.com>

Thanks for all the replies.

Sendu Bala wrote:

> Isn't Bio::Assembly::Contig what you need?
>
> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Assembly/Contig.html
>
I'm using this module already to manipulate the contigs, but there's no
option that I know of to _display_ the contigs in the way I described.
(Sorry, the title of my email was misleading.)


De-Jian,ZHAO wrote:
> How do you pad the sequences with gaps manually? Just replace the
> hyphens with blanks? If yes, you can program in perl to automate
> this process.
>   
How do I pad the sequences manually?? I calculate how many gaps have to
go left and right of the aligned sequence based on its length, its
position in the aligned consensus and the consensus length.
my $newseq = ('-' x $leftnum) . $seq . ('-' x $rightnum);
By the way, the sequences cannot be stored with blanks in them...
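
For reference, the calculation described above can be sketched as below.
This is only an illustration under assumptions: the function name
pad_to_consensus and the 1-based start coordinate are mine, not part of
Bio::Assembly.

```perl
use strict;
use warnings;

# Pad an aligned read with '-' so that it spans the whole consensus.
# $start is assumed to be the 1-based consensus column where the read
# begins; adjust the arithmetic if your coordinates are 0-based.
sub pad_to_consensus {
    my ($seq, $start, $consensus_length) = @_;
    my $leftnum  = $start - 1;
    my $rightnum = $consensus_length - $leftnum - length($seq);
    die "read does not fit inside the consensus\n"
        if $leftnum < 0 || $rightnum < 0;
    return ('-' x $leftnum) . $seq . ('-' x $rightnum);
}

print pad_to_consensus('ACGT', 3, 10), "\n";    # prints --ACGT----
```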

I think the best way to provide an out-of-the-box solution for
displaying contigs the described way would be to _not_ use Bio::Align at
all, but rather to create a new assembly IO module like
Bio::Assembly::IO::simpleout for example. That would be useful.

The reason I wanted to visualize these contigs is because I made a
Bio::Assembly::IO module for TIGR Assembler files that I intend on
submitting to BioPerl. I wanted to make sure first that I did not have
any obvious bug in my contig coordinates. I've read the documentation on
the Wiki, so if a BioPerl developer would like to step up and
contact me directly to check my code, that would be nice =)

Florent


From cjfields at uiuc.edu  Fri Aug 24 12:07:36 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 11:07:36 -0500
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip, module/package name
In-Reply-To: <46CEAD83.2050904@arcor.de>
References: <46CEAD83.2050904@arcor.de>
Message-ID: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>

Marian,

First, apologies about not getting on this sooner.  It's shaping up  
to be a busy year!

The new package: How about Bio::Expression::Tools::MitoChip?  My  
reasoning: I don't think it's necessary to define a new  
Bio::Resequencing namespace for just one module; my inclination is  
towards using Bio::Expression namespace as Bio::Tools have been  
traditionally reserved for output parsers.  I am unsure what the  
Bio::Expression status is (very little is documented, no tests are  
written, nothing on the mail list archives); maybe Allen can answer  
that?  I don't see anything that precludes you from using that  
namespace as long as your tools are fairly well-defined (they are)  
and have tests (they do).

Also, your module deals with doing one specific thing (extraction and  
incorporation of information about redundant fragments) for the Affy  
MitoChip.  It might be worth genericizing the class a bit so that you  
can add new parser or analysis methods w/o having to define new  
classes to deal with the same Mitochip data.

Mail list: The mail list subscription page
(http://bioperl.org/mailman/listinfo/bioperl-l) allows you to subscribe
or change subscription options (at the bottom of the page).

Bugzilla: if you are logged into Bugzilla under your old email, there  
is an option at the bottom of the page (Edit : Prefs) where you can  
change your email address and other preferences.

chris

On Aug 24, 2007, at 5:05 AM, Marian Thieme wrote:

> Hi,
>
> 2 questions about the naming of the module I did submit
> (see http://bugzilla.open-bio.org/show_bug.cgi?id=2332)
>
> 1.) The package:
> because there exists already an expression package I suggest to  
> create a
> new package called resequencing
>
> 2.) I would suggest that the module is called RedundantFragments or
> AdditionalFragments
>
> so we would have something like:
>
> Bio::Resequencing::AdditionalFragments
>
> Any other ideas ?
>
> Marian
>
> By the way can anybody change my email adress to  
> marian.thieme at arcor.de
> in bugzilla as well as in the bioperl list, please ?!! didnt achieve
> that by my own...
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Aug 24 12:23:12 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 11:23:12 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
	<CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>
Message-ID: <4F5FD173-FC80-4F70-B294-83DA58FDCE64@uiuc.edu>

On Aug 23, 2007, at 10:34 PM, Hilmar Lapp wrote:

> On Aug 23, 2007, at 12:53 AM, Chris Fields wrote:
>
>> There has been little work done over the last 2 1/2 years to undo or
>> rectify problems associated with those additions; I feel like those
>> of us still routinely contributing have been left holding the bag.
>
> Not by intention, but unfortunately that's probably a fair  
> assessment. (And I'm one of those guilty of inaction.)

Not completely.  You, Jason, Chris M., and several others expressed  
yourselves quite clearly (move the code to a branch and test).  I  
think that everyone was trying to be diplomatic about it and so never  
attempted to do anything except get it working correctly.

>> [...]
>> I would like to suggest the radical idea of rolling back  
>> AnnotatableI/
>> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
>> are simple scalars) and possibly work in implementing Ewan's
>> SeqFeature::TypedSeqFeatureI for those who want strong data types
>> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).
>
> I fully support this; to me that sounds exactly like the way to go.

Okay, I'll probably go ahead and get a branch started today.  I'll  
have to look at Ewan's interface in more detail; it's possible a new  
SeqFeature implementation will need to be written up to incorporate it.

>> The various AnnotatableI changes, odd inheritance, and operator  
>> overloading have
>> really obfuscated the code to the point where no one wants to touch
>> it in case it breaks something important.  However, I believe it is
>> the one serious impediment to a new stable release.
>
> Yes, I think you're hitting the nail on the head.
>
> Chris, if you take the lead on this and carry it through we will  
> all owe you hugely. I'm not sure how many beers that would compare  
> to, but I'll throw in some. (Who else do I owe beer? I'm losing  
> track. Strangely nobody tried to redeem beer from me in Vienna.  
> Maybe in Toronto?)
>
> Seriously, rectifying this problem would lift a huge weight.
>
> 	-hilmar

It would be nice to get regular releases started again.  I think  
this'll help.

chris


From marian.thieme at arcor.de  Fri Aug 24 13:01:07 2007
From: marian.thieme at arcor.de (Marian Thieme)
Date: Fri, 24 Aug 2007 19:01:07 +0200
Subject: [Bioperl-l] Bio::Expression & Re: ReseqChip, module/package name
Message-ID: <46CF0ED3.8000708@arcor.de>

> The new package: How about Bio::Expression::Tools::MitoChip?  My  
> reasoning: I don't think it's necessary to define a new  
> Bio::Resequencing namespace for just one module; my inclination is  
> towards using Bio::Expression namespace as Bio::Tools have been  
> traditionally reserved for output parsers.  I am unsure what the  
> Bio::Expression status is (very little is documented, no tests are  
> written, nothing on the mail list archives); maybe Allen can answer  
> that?  I don't see anything that precludes you from using that  
> namespace as long as your tools are fairly well-defined (they are)  
> and have tests (they do).

The problem I see with Bio::Expression is that resequencing chips are
not expression chips. (Expression chips are designed to hybridize RNA
strands and hence measure RNA expression levels; a resequencing chip,
on the other hand, is based on DNA, and its design and probe length
are quite different.) So, from my point of view, it makes sense to
distinguish between DNA and RNA chips, at least.

>
> Also, your module deals with doing one specific thing (extraction and  
> incorporation of information about redundant fragments) for the Affy  
> MitoChip.  It might be worth genericizing the class a bit so that you  
> can add new parser or analysis methods w/o having to define new  
> classes to deal with the same Mitochip data.

OK, need to think about that.

>
> Mail list: The mail list subscription page (http://bioperl.org/
> mailman/listinfo/bioperl-l) allows you to subscribe or change  
> subscription options (at the bottom of the page).
>
cleared

> Bugzilla: if you are logged into Bugzilla under your old email, there  
> is an option at the bottom of the page (Edit : Prefs) where you can  
> change your email address and other preferences.
>
Unfortunately I don't receive an email to confirm the change. I tried
that twice.


Marian


From bix at sendu.me.uk  Fri Aug 24 12:43:22 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 24 Aug 2007 17:43:22 +0100
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CF0228.2000404@gmail.com>
References: <46CE61EB.5000300@gmail.com>	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
	<46CF0228.2000404@gmail.com>
Message-ID: <46CF0AAA.4090301@sendu.me.uk>

Florent Angly wrote:
> Thanks for all the replies.
> 
> Sendu Bala wrote:
> 
>> Isn't Bio::Assembly::Contig what you need?
>
> I'm using this module already to manipulate the contigs, but there's 
> no option that I know of to _display_ the contigs in the way I 
> described.
[snip]
> I think the best way to provide an out-of-the-box solution for 
> displaying contigs the described way would be to _not_ use Bio::Align
> at all, but rather to create a new assembly IO module like 
> Bio::Assembly::IO::simpleout for example. That would be useful.

Yes...


> The reason I wanted to visualize these contigs is because I made a 
> Bio::Assembly::IO module for TIGR Assembler files that I intend on 
> submitting to BioPerl.

That's wonderful... might I cheekily suggest that the solution to your
problem is to extend your IO module so that it does the 'O' as well? Ie.
unlike the other IO modules, write_assembly() is actually implemented.
Then you can round-trip to ensure your next_assembly() method has no bugs.
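
A round-trip check of that kind could look roughly like the sketch below.
The two functions are toy stand-ins for next_assembly() and
write_assembly() working on a made-up "name start sequence" line format;
nothing here is the real Bio::Assembly::IO API.

```perl
use strict;
use warnings;

# Toy parser: one read per line, "name start sequence" (hypothetical format).
sub parse_contig {
    my ($text) = @_;
    my @reads;
    for my $line (split /\n/, $text) {
        next unless $line =~ /\S/;
        push @reads, [ split ' ', $line ];
    }
    return \@reads;
}

# Toy writer: the inverse of parse_contig().
sub write_contig {
    my ($reads) = @_;
    my $out = '';
    $out .= join(' ', @$_) . "\n" for @$reads;
    return $out;
}

# Parse, write, parse again, write again: the two outputs should agree.
my $input = "read1 1 ACGT\nread2 3 GTTA\n";
my $once  = write_contig(parse_contig($input));
my $twice = write_contig(parse_contig($once));
print $once eq $twice ? "round trip stable\n" : "round trip differs\n";
```

Comparing the first and second written outputs, rather than the output
against the original file, avoids false alarms from harmless whitespace
differences in the input.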


> I've read the documentation on the Wiki so if a BioPerl developer
> would please like lo step up and contact me directly for checking my
> code, that would be nice =)

If no one does, post it as an enhancement request to bugzilla. A test
script is a must.

http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests


From cjfields at uiuc.edu  Fri Aug 24 13:16:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 12:16:10 -0500
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CF0228.2000404@gmail.com>
References: <46CE61EB.5000300@gmail.com>
	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
	<46CF0228.2000404@gmail.com>
Message-ID: <32D5D3FF-D0A5-4EEB-BA5E-B0087CC64B19@uiuc.edu>


On Aug 24, 2007, at 11:07 AM, Florent Angly wrote:
...

> De-Jian,ZHAO wrote:
>> How do you pad the sequences with gaps manually? Just replace the
>> hyphens with blanks? If yes, you can program in perl to automate
>> this process.
>>
> How do I pad the sequences manually?? I calculate how many gaps  
> have to
> go left and right of the aligned sequence based on its length, its
> position in the aligned consensus and the consensus length.
> my $newseq = '-' x $leftnum . $seq . '-'x$rightnum
> By the way, the sequences cannot be stored with blanks in them...
>
> I think the best way to provide an out-of-the-box solution for
> displaying contigs the described way would be to _not_ use  
> Bio::Align at
> all, but rather to create a new assembly IO module like
> Bio::Assembly::IO::simpleout for example. That would be useful.
>
> The reason I wanted to visualize these contigs is because I made a
> Bio::Assembly::IO module for TIGR Assembler files that I intend on
> submitting to BioPerl. I wanted to make sure first that I did not have
> any obvious bug in my contig coordinates. I've read the  
> documentation on
> the Wiki so if a BioPerl developer would please like lo step up and
> contact me directly for checking my code, that would be nice =)
>
> Florent

A similar question has been previously asked on the same subject:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2827/focus=2869

Jason's suggestion was to have a Bio::Assembly::Contig method
get_aln() which produces a Bio::SimpleAlign object containing
appropriately padded seqs compatible for AlignIO output.  However,
the method was never implemented.

Personally, the way I would try going about this would be to
implement the Contig::get_aln() method, padding with
bioperl-compliant alignment gap symbols (currently -.*?=~), so if
anyone wanted they could write to any AlignIO-implemented format (MSF,
Clustal, etc).  In your Bio::Assembly::IO::simpleout module implement
write_assembly() and use the Contig::get_aln() method where needed to
grab the SimpleAlign, then simply substitute gap symbols with spaces
when writing contig output.
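
The last step (swapping gap symbols for spaces) needs no BioPerl at all.
A minimal sketch, assuming the padded sequences are already available as
plain strings; gaps_to_spaces is a made-up name and get_aln() itself is
not shown because it does not exist yet:

```perl
use strict;
use warnings;

# Replace alignment gap symbols with spaces for display.
# The symbol set is the one quoted above: - . * ? = ~
sub gaps_to_spaces {
    my ($aligned) = @_;
    # tr/// reuses the last replacement character for every listed symbol
    (my $display = $aligned) =~ tr/\-.*?=~/ /;
    return $display;
}

my @rows = ('ACGTACGTAC', '--GTAC----', '.....CGTAC');
print gaps_to_spaces($_), "\n" for @rows;
```

The original strings are left untouched, so the same data can still be
written to a gapped alignment format afterwards.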

In general, any new code is attached to a bugzilla report as an  
enhancement request:

http://bugzilla.open-bio.org/

One of the devs will work on getting the code incorporated into
bioperl.  Make sure the code is documented
(http://www.bioperl.org/wiki/Advanced_BioPerl), and attach appropriate
tests (http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests) and
test data.

chris


From cjfields at uiuc.edu  Fri Aug 24 13:20:16 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 12:20:16 -0500
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
	module/package name
In-Reply-To: <9824900.1187973171940.JavaMail.ngmail@webmail17>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
	<46CEAD83.2050904@arcor.de>
	<9824900.1187973171940.JavaMail.ngmail@webmail17>
Message-ID: <A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>


On Aug 24, 2007, at 11:32 AM, marian.thieme at arcor.de wrote:

>> ...
> The problem I see, with Bio::Expression, is that Resequencing chips  
> are not belongs to Expression chips.
> (Expression chips are designed to hybridisize RNA strands and hence  
> measure RNA expression levels, on the other hand a resequencing  
> chip is based on DNA, also the design and the probe length is quite  
> different). So, from my point of view it make sence to differ  
> between dna and rna chips, at least.

Then maybe the more generic Bio::Microarray namespace is the way to
go, with the module name Bio::Microarray::Tools::MitoChip.  Other
tools can be added as needed.

>> Also, your module deals with doing one specific thing (extraction and
>> incorporation of information about redundant fragments) for the Affy
>> MitoChip.  It might be worth genericizing the class a bit so that you
>> can add new parser or analysis methods w/o having to define new
>> classes to deal with the same Mitochip data.
>
> OK, need to think about that.

It all depends on how much you intend to contribute; if you plan on  
adding to it over time we can talk about starting up a developer  
account.

>> Mail list: The mail list subscription page (http://bioperl.org/
>> mailman/listinfo/bioperl-l) allows you to subscribe or change
>> subscription options (at the bottom of the page).
>>
> cleared
>
>> Bugzilla: if you are logged into Bugzilla under your old email, there
>> is an option at the bottom of the page (Edit : Prefs) where you can
>> change your email address and other preferences.
>>
> unfortunatly I dont recieve a mail to confirm the change. did try  
> that twice..
>
>
> Marian

I tested it out and received the email at both addresses (as it
states).  If you respond to either email it should implement the
change in three days' time.  If it doesn't, you can email support at
open-bio.org to see if there is a problem.

chris


From florent.angly at gmail.com  Fri Aug 24 13:58:13 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Fri, 24 Aug 2007 10:58:13 -0700
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <32D5D3FF-D0A5-4EEB-BA5E-B0087CC64B19@uiuc.edu>
References: <46CE61EB.5000300@gmail.com>
	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
	<46CF0228.2000404@gmail.com>
	<32D5D3FF-D0A5-4EEB-BA5E-B0087CC64B19@uiuc.edu>
Message-ID: <46CF1C35.3050100@gmail.com>

Chris Fields wrote:
>
> A similar question has been previously asked on the same subject:
>
> http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2827/focus=2869
>
> Jason's suggestion was to have a Bio::Assembly::Contig method 
> get_aln() which produces a Bio::SimpleAlign object containing 
> appropriately padded seqs compatible for AlignIO output.  However, the 
> method was never implemented.
>
> Personally, the way I would try going about this would be to implement 
> the Contig::get_aln() method, padding with bioperl-compliant alignment 
> gap symbols (currently -.*?=~), so if anyone wanted they could write 
> to any AlignIO-implemented format (MSF, Clustal, etc).  In your 
> Bio::Assembly::IO::simpleout module implement write_assembly() and use 
> the Contig::get_aln() method where needed to grab the SimpleAlign, 
> then simply substitute gap symbols with spaces when writing contig 
> output.
>
> In general, any new code is attached to a bugzilla report as an 
> enhancement request:
>
> http://bugzilla.open-bio.org/
>
> One of the devs will work on getting the code incorporated into 
> bioperl.  Make sure the code is documented 
> (http://www.bioperl.org/wiki/Advanced_BioPerl), and attach appropriate 
> tests (http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests) and 
> test data.
>
> chris
>
>
Thanks Chris for the pointers, I will be looking into these things.
Florent


From hlapp at gmx.net  Fri Aug 24 14:25:57 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 24 Aug 2007 14:25:57 -0400
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
	module/package name
In-Reply-To: <A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
	<46CEAD83.2050904@arcor.de>
	<9824900.1187973171940.JavaMail.ngmail@webmail17>
	<A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
Message-ID: <BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>


On Aug 24, 2007, at 1:20 PM, Chris Fields wrote:

>>> ...
>> The problem I see, with Bio::Expression, is that Resequencing chips
>> are not belongs to Expression chips.
>> (Expression chips are designed to hybridisize RNA strands and hence
>> measure RNA expression levels, on the other hand a resequencing
>> chip is based on DNA, also the design and the probe length is quite
>> different). So, from my point of view it make sence to differ
>> between dna and rna chips, at least.
>
> Then maybe the more generic Bio::Microarray namespace is the way to
> go, with the module name Bio::Microarray::Tools:: MitoChip.  If
> needed other tools can be added as needed.
>

Makes sense to me too. Presumably, regardless of DNA or RNA being  
hybridized or length of probes, the data that comes out of them is  
quite similar in a general nature (namely hybridization signals)?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From marian.thieme at arcor.de  Fri Aug 24 12:32:51 2007
From: marian.thieme at arcor.de (marian.thieme at arcor.de)
Date: Fri, 24 Aug 2007 18:32:51 +0200 (CEST)
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
 module/package name
In-Reply-To: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
	<46CEAD83.2050904@arcor.de>
Message-ID: <9824900.1187973171940.JavaMail.ngmail@webmail17>

> The new package: How about Bio::Expression::Tools::MitoChip?  My  
> reasoning: I don't think it's necessary to define a new  
> Bio::Resequencing namespace for just one module; my inclination is  
> towards using Bio::Expression namespace as Bio::Tools have been  
> traditionally reserved for output parsers.  I am unsure what the  
> Bio::Expression status is (very little is documented, no tests are  
> written, nothing on the mail list archives); maybe Allen can answer  
> that?  I don't see anything that precludes you from using that  
> namespace as long as your tools are fairly well-defined (they are)  
> and have tests (they do).

The problem I see with Bio::Expression is that resequencing chips are
not expression chips. (Expression chips are designed to hybridize RNA
strands and hence measure RNA expression levels; a resequencing chip,
on the other hand, is based on DNA, and its design and probe length
are quite different.) So, from my point of view, it makes sense to
distinguish between DNA and RNA chips, at least.

> 
> Also, your module deals with doing one specific thing (extraction and  
> incorporation of information about redundant fragments) for the Affy  
> MitoChip.  It might be worth genericizing the class a bit so that you  
> can add new parser or analysis methods w/o having to define new  
> classes to deal with the same Mitochip data.

OK, need to think about that.

> 
> Mail list: The mail list subscription page (http://bioperl.org/ 
> mailman/listinfo/bioperl-l) allows you to subscribe or change  
> subscription options (at the bottom of the page).
> 
cleared

> Bugzilla: if you are logged into Bugzilla under your old email, there  
> is an option at the bottom of the page (Edit : Prefs) where you can  
> change your email address and other preferences.
> 
Unfortunately I don't receive an email to confirm the change. I tried that twice.


Marian

> On Aug 24, 2007, at 5:05 AM, Marian Thieme wrote:
> 
> > Hi,
> >
> > 2 questions about the naming of the module I did submit
> > (see http://bugzilla.open-bio.org/show_bug.cgi?id=2332)
> >
> > 1.) The package:
> > because there exists already an expression package I suggest to  
> > create a
> > new package called resequencing
> >
> > 2.) I would suggest that the module is called RedundantFragments or
> > AdditionalFragments
> >
> > so we would have something like:
> >
> > Bio::Resequencing::AdditionalFragments
> >
> > Any other ideas ?
> >
> > Marian
> >
> > By the way can anybody change my email adress to  
> > marian.thieme at arcor.de
> > in bugzilla as well as in the bioperl list, please ?!! didnt achieve
> > that by my own...
> >
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> 



From cjfields at uiuc.edu  Fri Aug 24 17:12:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 16:12:25 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <4F5FD173-FC80-4F70-B294-83DA58FDCE64@uiuc.edu>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
	<CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>
	<4F5FD173-FC80-4F70-B294-83DA58FDCE64@uiuc.edu>
Message-ID: <ABED5057-CFB5-4AAA-9D23-B6A069575BF6@uiuc.edu>

Okay, I have started a new branch in cvs (tagged featann_rollback).   
I'll start looking through everything within the next few days to get  
a general idea of what needs to be done.  All I know is the changes  
were extensive and included modifications to tests.

If anyone has comments I have added a wiki page here:

http://www.bioperl.org/wiki/Feature_Annotation_rollback

chris

On Aug 24, 2007, at 11:23 AM, Chris Fields wrote:

> On Aug 23, 2007, at 10:34 PM, Hilmar Lapp wrote:
>
>> On Aug 23, 2007, at 12:53 AM, Chris Fields wrote:
>>
>>> There has been little work done over the last 2 1/2 years to undo or
>>> rectify problems associated with those additions; I feel like those
>>> of us still routinely contributing have been left holding the bag.
>>
>> Not by intention, but unfortunately that's probably a fair
>> assessment. (And I'm one of those guilty of inaction.)
>
> Not completely.  You, Jason, Chris M., and several others expressed
> yourselves quite clearly (move the code to a branch and test).  I
> think that everyone was trying to be diplomatic about it and so never
> attempted to do anything except get it working correctly.
>
>>> [...]
>>> I would like to suggest the radical idea of rolling back
>>> AnnotatableI/
>>> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
>>> are simple scalars) and possibly work in implementing Ewan's
>>> SeqFeature::TypedSeqFeatureI for those who want strong data types
>>> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).
>>
>> I fully support this; to me that sounds exactly like the way to go.
>
> Okay, I'll probably go ahead and get a branch started today.  I'll
> have to look at Ewan's interface in more detail; it's possible a new
> SeqFeature implementation will need to be written up to incorporate  
> it.
>
>>> The various AnnotatableI changes, odd inheritance, and operator
>>> overloading have
>>> really obfuscated the code to the point where no one wants to touch
>>> it in case it breaks something important.  However, I believe it is
>>> the one serious impediment to a new stable release.
>>
>> Yes, I think you're hitting the nail on the head.
>>
>> Chris, if you take the lead on this and carry it through we will
>> all owe you hugely. I'm not sure how many beers that would compare
>> to, but I'll throw in some. (Who else do I owe beer? I'm losing
>> track. Strangely nobody tried to redeem beer from me in Vienna.
>> Maybe in Toronto?)
>>
>> Seriously, rectifying this problem would lift a huge weight.
>>
>> 	-hilmar
>
> It would be nice to get regular releases started again.  I think
> this'll help.
>
> chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From marian at arcor.de  Fri Aug 24 14:48:20 2007
From: marian at arcor.de (marian)
Date: Fri, 24 Aug 2007 20:48:20 +0200
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
 module/package name
In-Reply-To: <BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>	<46CEAD83.2050904@arcor.de>	<9824900.1187973171940.JavaMail.ngmail@webmail17>	<A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
	<BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>
Message-ID: <46CF27F4.8030608@arcor.de>

Hilmar Lapp wrote:
> On Aug 24, 2007, at 1:20 PM, Chris Fields wrote:
>
>   
>>>> ...
>>>>         
>>> The problem I see, with Bio::Expression, is that Resequencing chips
>>> are not belongs to Expression chips.
>>> (Expression chips are designed to hybridisize RNA strands and hence
>>> measure RNA expression levels, on the other hand a resequencing
>>> chip is based on DNA, also the design and the probe length is quite
>>> different). So, from my point of view it make sence to differ
>>> between dna and rna chips, at least.
>>>       
>> Then maybe the more generic Bio::Microarray namespace is the way to
>> go, with the module name Bio::Microarray::Tools:: MitoChip.  If
>> needed other tools can be added as needed.
>>
>>     
>
> Makes sense to me too. Presumably, regardless of DNA or RNA being  
> hybridized or length of probes, the data that comes out of them is  
> quite similar in a general nature (namely hybridization signals)?
>
> 	-hilmar
>   

Bio::Microarray::Tools::MitoChip would be OK with me. I merely meant that
it isn't an expression chip, and that you can't analyze expression data
with the tool I am talking about.

Marian


From cjfields at uiuc.edu  Fri Aug 24 18:36:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 17:36:46 -0500
Subject: [Bioperl-l] undef SeqFeature tag values
Message-ID: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>

One thing I am noticing with the rollback to tags as strings is that
tags with an undefined value are not set; I'm assuming that when tags
were Bio::AnnotationI they were instantiated regardless, with an undef
value.  When attempting to retrieve such a tag with get_tag_values() I
get:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: asking for tag value that does not exist signalPeptideLength
STACK: Error::throw
STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/bioperl-live/blib/lib/Bio/Root/Root.pm:357
STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
STACK: t/targetp.t:189
-----------------------------------------------------------

I personally think of this as a feature (why set a tag at all if it  
is undef?).  However, are there any circumstances where we might want  
this behavior?  Do we want to simply return w/o a value if a tag name  
isn't found (i.e. remove the exception)?

chris


From hlapp at gmx.net  Fri Aug 24 19:02:43 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 24 Aug 2007 19:02:43 -0400
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
Message-ID: <7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>

You're supposed to call has_tag() first before you can assume that  
you can call get_tag_values() w/o an exception. That was the original  
API.
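
In calling code, that pattern might be wrapped roughly as follows. This
is a sketch only: ToyFeature is a hypothetical stand-in so the example
runs without BioPerl, and tag_values_or_empty is an illustrative helper
name. With BioPerl installed, the same helper should work on any object
that provides has_tag() and get_tag_values().

```perl
use strict;
use warnings;

# Minimal stand-in for a feature object (hypothetical, illustration only).
# It mimics the behaviour discussed here: get_tag_values() dies on a
# tag that was never set.
package ToyFeature;
sub new { my ($class, %tags) = @_; return bless { tags => {%tags} }, $class; }
sub has_tag { my ($self, $tag) = @_; return exists $self->{tags}{$tag} ? 1 : 0; }
sub get_tag_values {
    my ($self, $tag) = @_;
    die "asking for tag value that does not exist $tag\n"
        unless $self->has_tag($tag);
    return @{ $self->{tags}{$tag} };
}

package main;

# Return the tag's values, or an empty list when the tag is not set.
sub tag_values_or_empty {
    my ($feat, $tag) = @_;
    return $feat->has_tag($tag) ? $feat->get_tag_values($tag) : ();
}

my $feat = ToyFeature->new(location => ['mitochondrion']);
my @loc  = tag_values_or_empty($feat, 'location');
my @len  = tag_values_or_empty($feat, 'signalPeptideLength');
printf "location: %s; signalPeptideLength values: %d\n", "@loc", scalar @len;
```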

	-hilmar

On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:

> One thing I am noticing with the rollback to tag as strings is that
> tags with an undefined value are not set; I'm assuming when tags were
> Bio::AnnotationI they were instantiated regardless with an undef
> value.  When attempting to call an undef tag with get_tag_values() I
> get:
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: asking for tag value that does not exist signalPeptideLength
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
> bioperl-live/blib/lib/Bio/Root/Root.pm:357
> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
> STACK: t/targetp.t:189
> -----------------------------------------------------------
>
> I personally think of this as a feature (why set a tag at all if it
> is undef?).  However, are there any circumstances where we might want
> this behavior?  Do we want to simply return w/o a value if a tag name
> isn't found (i.e. remove the exception)?
>
> chris
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Aug 25 00:05:58 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 23:05:58 -0500
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
	<7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
Message-ID: <6392DF1D-D91B-4B6E-812B-38FC0EA0D234@uiuc.edu>

Makes sense.  Okay, I'll leave the exception in.  Thanks!

chris

On Aug 24, 2007, at 6:02 PM, Hilmar Lapp wrote:

> You're supposed to call has_tag() first before you can assume that
> you can call get_tag_values() w/o an exception. That was the original
> API.
>
> 	-hilmar
>
> On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:
>
>> One thing I am noticing with the rollback to tag as strings is that
>> tags with an undefined value are not set; I'm assuming when tags were
>> Bio::AnnotationI they were instantiated regardless with an undef
>> value.  When attempting to call an undef tag with get_tag_values() I
>> get:
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: asking for tag value that does not exist signalPeptideLength
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
>> bioperl-live/blib/lib/Bio/Root/Root.pm:357
>> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
>> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
>> STACK: t/targetp.t:189
>> -----------------------------------------------------------
>>
>> I personally think of this as a feature (why set a tag at all if it
>> is undef?).  However, are there any circumstances where we might want
>> this behavior?  Do we want to simply return w/o a value if a tag name
>> isn't found (i.e. remove the exception)?
>>
>> chris
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Sat Aug 25 03:50:29 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 25 Aug 2007 08:50:29 +0100
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
	<7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
Message-ID: <46CFDF45.8030200@sheffield.ac.uk>

This sort of highlights a comment I made previously about how do you
test for a stable API?

It seems to me that unless you have intricate knowledge about the
changes that took place, you will find it difficult to know when an API
change has occurred. Is it possible to run the 1.4 test suite against
existing code to ensure tests pass? What if the 1.4 tests contained
bugs? This approach would need good code coverage by the tests to ensure
things work the same i.e. test code in HEAD against the test suite from
the previous stable release's branch - would/should this work
conceptually?

Nath

Hilmar Lapp wrote:
> You're supposed to call has_tag() first before you can assume that  
> you can call get_tag_values() w/o an exception. That was the original  
> API.
>
> 	-hilmar
>
> On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:
>
>   
>> One thing I am noticing with the rollback to tag as strings is that
>> tags with an undefined value are not set; I'm assuming when tags were
>> Bio::AnnotationI they were instantiated regardless with an undef
>> value.  When attempting to call an undef tag with get_tag_values() I
>> get:
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: asking for tag value that does not exist signalPeptideLength
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
>> bioperl-live/blib/lib/Bio/Root/Root.pm:357
>> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
>> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
>> STACK: t/targetp.t:189
>> -----------------------------------------------------------
>>
>> I personally think of this as a feature (why set a tag at all if it
>> is undef?).  However, are there any circumstances where we might want
>> this behavior?  Do we want to simply return w/o a value if a tag name
>> isn't found (i.e. remove the exception)?
>>
>> chris
>>
>>     
>
>   


From cjfields at uiuc.edu  Sat Aug 25 10:36:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 25 Aug 2007 09:36:08 -0500
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <46CFDF45.8030200@sheffield.ac.uk>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
	<7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
	<46CFDF45.8030200@sheffield.ac.uk>
Message-ID: <3F3C311E-3CD5-436B-987F-FD7695904647@uiuc.edu>

The rollback branch is off of HEAD, not 1.4, so any bugs fixed since  
then and any modules/tests added will be present.  So far everything  
has worked relatively well; you can check the history of this page to  
track what has happened so far:

http://www.bioperl.org/wiki/Feature_Annotation_rollback

The only problem code remaining for the first round of changes is a  
single method in Bio::SeqFeature::Annotated (if the tests are to be  
trusted) and one test in Bio::SeqFeature::AnnotationAdaptor using  
Hilmar's original test suite.  Most of those were tests breaking the
Feature/Annotation API outlined in the HOWTO (calling get_Annotations
directly from a Bio::SeqI or Bio::SeqFeatureI for instance), or
examples where has_tag() was not used.  I agree good test coverage
would probably help catch some of those still silently lingering in
code, but I don't think it can find everything; that's the reason I
indicated there will need to be extensive testing.  That applies within the
suite but also by users in the wild.

The SeqFeatureI and AnnotatableI API is defined very specifically in  
the Feature/Annotation HOWTO, so if anything the introduced changes  
violated much of that and started a domino effect of users  
unknowingly violating the API (me among them).  Also, just b/c a test  
passes doesn't mean it is the ->correct<- result; it is very easy to  
just throw something from Data::Dumper into an is() test and have it  
pass.  As an example, it appears there was a bit of cheating going on  
with AnnotationAdaptor.t in particular, where expected numbers were  
changed to conform to results w/o explanation.  Which is the correct  
answer?  I trust Hilmar's original test suite over the (rushed) changes.

chris

On Aug 25, 2007, at 2:50 AM, Nathan S. Haigh wrote:

> This sort of highlights a comment I made previously about how do you
> test for a stable API?
>
> It seems to me that unless you have intricate knowledge about the
> changes that took place, you will find it difficult to know when an  
> API
> change has occurred. Is it possible to run the 1.4 test suite against
> existing code to ensure tests pass? What if the 1.4 tests contained
> bugs? This approach would need good code coverage by the tests to  
> ensure
> things work the same i.e. test code in HEAD against the test suite  
> from
> the previous stable release's branch - would/should this work
> conceptually?
>
> Nath
>
> Hilmar Lapp wrote:
>> You're supposed to call has_tag() first before you can assume that
>> you can call get_tag_values() w/o an exception. That was the original
>> API.
>>
>> 	-hilmar
>>
>> On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:
>>
>>
>>> One thing I am noticing with the rollback to tag as strings is that
>>> tags with an undefined value are not set; I'm assuming when tags  
>>> were
>>> Bio::AnnotationI they were instantiated regardless with an undef
>>> value.  When attempting to call an undef tag with get_tag_values() I
>>> get:
>>>
>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>> MSG: asking for tag value that does not exist signalPeptideLength
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
>>> bioperl-live/blib/lib/Bio/Root/Root.pm:357
>>> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
>>> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
>>> STACK: t/targetp.t:189
>>> -----------------------------------------------------------
>>>
>>> I personally think of this as a feature (why set a tag at all if it
>>> is undef?).  However, are there any circumstances where we might  
>>> want
>>> this behavior?  Do we want to simply return w/o a value if a tag  
>>> name
>>> isn't found (i.e. remove the exception)?
>>>
>>> chris
>>>
>>>
>>
>>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sat Aug 25 18:12:49 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 25 Aug 2007 17:12:49 -0500
Subject: [Bioperl-l] Feature/Annotation rollback(update)
Message-ID: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>

I have finished rolling back most of the specific changes made prior  
to the 1.5 release and have relevant tests passing:

http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round

Operator overloading of Bio::Annotation objects will be trickier to  
debug as tons of tests fail when the overloading is removed:

http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round

I'll start looking into fixes.  I don't like overloads from a  
personal standpoint (problems w/ long-term code maintenance), but was  
there a more specific reason for removing them?

chris


From hlapp at gmx.net  Sun Aug 26 00:58:46 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 26 Aug 2007 00:58:46 -0400
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
Message-ID: <3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>

The reason was to provide for backward compatibility with the  
original API in which tag values were scalars, not objects. The idea  
was that if someone relied on that and treats the object as a scalar  
(comparison, printing, etc), the operator overloading would take care  
of that.

So by going back to the original API the overloading should become  
obsolete, at least theoretically.

The overloading can cause some very subtle issues that I pointed out  
in an earlier email. It's one of those really "clever" tricks that  
just make it very hard for newcomers to understand what's going on in  
their code.
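
A small sketch of the general effect, for readers new to overloading.
Demo::TagValue is a made-up class, and the surprises shown are generic
Perl behaviour, not necessarily the specific issues raised in the
earlier email:

```perl
use strict;
use warnings;

# Hypothetical class (assumption) that stringifies like the old scalar values.
package Demo::TagValue;
use overload '""' => sub { $_[0]->{value} }, fallback => 1;

sub new { my ($class, $value) = @_; return bless { value => $value }, $class }

package main;

my $x = Demo::TagValue->new('kinase');
my $y = Demo::TagValue->new('kinase');

# Interpolation calls the overload, so the object prints like a string.
print "looks like a string: $x\n";

# 'eq' falls back to stringification: two distinct objects compare equal.
print "eq says equal\n" if $x eq $y;

# It is nevertheless still an object underneath.
print "still an object: ", ref($x), "\n";

# Used as a hash key, the object is flattened to its string form.
my %seen = ($x => 1);
my ($key) = keys %seen;
print "hash key is a plain string\n" unless ref $key;
```

Nothing at the call site hints that a method is being invoked, which is
what makes this kind of code hard to follow.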

	-hilmar

On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:

> I have finished rolling back most of the specific changes made prior
> to the 1.5 release and have relevant tests passing:
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>
> Operator overloading of Bio::Annotation objects will be trickier to
> debug as tons of tests fail when the overloading is removed:
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>
> I'll start looking into fixes.  I don't like overloads from a
> personal standpoint (problems w/ long-term code maintenance), but was
> there a more specific reason for removing them?
>
> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From n.haigh at sheffield.ac.uk  Sun Aug 26 03:35:36 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 26 Aug 2007 08:35:36 +0100
Subject: [Bioperl-l] please some help
In-Reply-To: <20070823233044.BJQ45014@mailstore2.fiu.edu>
References: <20070823233044.BJQ45014@mailstore2.fiu.edu>
Message-ID: <46D12D48.8080301@sheffield.ac.uk>

mcons004 at fiu.edu wrote:
>   Hello,
>      I am new to this software and I am having some trouble starting. The version of Bioperl I am working on is v5.8.6. My OS is Unix (Mac OS X). I am trying to use Bioperl with a file called blastParser to process a file which is the output of a "blastall" operation.
>   
>  The code that gives me error is:
>> perl blastParser.pl junk.out 1 1 1.0
>  and the error message says:
> Can't locate Bio/SearchIO.pm in @INC (@INC contains: /System/Library/Perl/5.8.6/darwin-thread-multi-2level
> 
>  Your online info says this probably means that the module Bio::SearchIO is not installed, and that I can either install Bundle::Bioperl or install that specific module by hand. Could you give me some tips on this? I am new to working with Unix and Bioperl, so I am a little confused. Any information will be helpful to me. Thanks

 From what you have said, it appears you need some basic info to 
understand what you are trying to achieve.

The Perl programming language requires the Perl interpreter in order to 
execute a Perl script. The Perl interpreter is usually installed as 
standard with Unix/Linux based Operating Systems. The version you 
mention (5.8.6) will not be the version of Bioperl but the version of 
the Perl interpreter you have installed - you can check this by typing 
"perl -v" at a command prompt.

Given your apparent lack of understanding of the Unix OS, it is likely 
that you don't have Bioperl installed. You should have a look at:
http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink
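
To confirm whether Perl can see a given module, something along these
lines may help.  It is a rough sketch using only core Perl;
module_available() is a helper name invented here, not part of BioPerl:

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded from @INC, 0 otherwise.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Bio::SearchIO -> Bio/SearchIO.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(strict Bio::SearchIO)) {
    printf "%-15s %s\n", $module,
        module_available($module) ? 'found' : 'NOT found';
}

# These are the directories Perl searches; a module installed elsewhere
# produces the "Can't locate ... in \@INC" error.
print "Perl searches these directories:\n";
print "  $_\n" for @INC;
```

If Bio::SearchIO is reported as not found, BioPerl is either not
installed or installed somewhere outside the listed directories.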

Nath


From cjfields at uiuc.edu  Sun Aug 26 15:22:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 26 Aug 2007 14:22:24 -0500
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
Message-ID: <B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>

I managed to find your comments (as well as ones from Ewan, Jason,  
and a few others) on the mail list archives, so I'll link to them.   
The problem will be fixing the several places where overloading is  
assumed but no longer exists (i.e. in write_* methods), but we can  
probably pinpoint those by throwing or warning when overloading is  
assumed.

My thought is to either modify as_text() or add a new display_text()  
method to all AnnotationI that explicitly does what the overloading  
implied (print the annotation in a specified or assumed way).  We  
could then delegate to that in the stringification overload (with  
appropriate deprecation warnings) until 1.6, where we remove it  
completely.  Something like:

my $link1 = Bio::Annotation::DBLink->new(-database => 'TSC',
                                         -primary_id => 'TSC0000030',
                                         -tagname => "tag2");

# either
print $link1->display_text(),"\n";
# or ...
print $link1->as_text("display"),"\n";
# prints "TSC:TSC0000030"

# default human readable
print $link1->as_text(),"\n";
# prints "Direct database link to TSC0000030 in database TSC"

print "$link1\n";
# gets a deprecation warning for now, removed completely for 1.6
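
As a self-contained sketch of the delegation idea (display_text() is
still only a proposal, and Demo::DBLink below is a hypothetical stand-in,
not Bio::Annotation::DBLink):

```perl
use strict;
use warnings;

# Hypothetical stand-in class (assumption) with the proposed method.
package Demo::DBLink;
use overload '""' => sub { $_[0]->_stringify }, fallback => 1;

sub new {
    my ($class, %args) = @_;
    my $self = {
        database   => $args{-database},
        primary_id => $args{-primary_id},
    };
    return bless $self, $class;
}

# Proposed explicit replacement for what the overload used to do.
sub display_text {
    my ($self) = @_;
    return join ':', $self->{database}, $self->{primary_id};
}

# The overload only warns and delegates, so it can be dropped later.
sub _stringify {
    my ($self) = @_;
    warn "stringification is deprecated; call display_text() instead\n";
    return $self->display_text;
}

package main;

my $link = Demo::DBLink->new(-database   => 'TSC',
                             -primary_id => 'TSC0000030');

print $link->display_text, "\n";    # explicit call, no warning

{
    # Show the warning on STDOUT so the demo output is easy to follow.
    local $SIG{__WARN__} = sub { print "warning: $_[0]" };
    print "$link\n";                # old style, still works but warns
}
```

Callers that still interpolate the object keep working during the
deprecation period, and the warning points them at the explicit method.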

chris

On Aug 25, 2007, at 11:58 PM, Hilmar Lapp wrote:

> The reason was to provide for backward compatibility with the  
> original API in which tag values were scalars, not objects. The  
> idea was that if someone relied on that and treats the object as a  
> scalar (comparison, printing, etc), the operator overloading would  
> take care of that.
>
> So by going back to the original API the overloading should become  
> obsolete, at least theoretically.
>
> The overloading can cause some very subtle issues that I pointed  
> out in an earlier email. It's one of those really "clever" tricks  
> that just make it very hard for newcomers to understand what's  
> going on in their code.
>
> 	-hilmar
>
> On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:
>
>> I have finished rolling back most of the specific changes made prior
>> to the 1.5 release and have relevant tests passing:
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>>
>> Operator overloading of Bio::Annotation objects will be trickier to
>> debug as tons of tests fail when the overloading is removed:
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>>
>> I'll start looking into fixes.  I don't like overloads from a
>> personal standpoint (problems w/ long-term code maintenance), but was
>> there a more specific reason for removing them?
>>
>> chris
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Sun Aug 26 16:57:37 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 26 Aug 2007 16:57:37 -0400
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
	<B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
Message-ID: <503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>

The thing that I actually never quite understood (and predates the  
API changes) is why $ann->as_text() needs to include explanatory text  
such as 'Direct database link to blah in database foo.' I would have  
said that "TSC:TSC0000030" is human readable enough, unless you  
present it without any context so that one would have no clue that it  
is a database cross-reference.

The as_text() method shouldn't be meant for the sole purpose of
debugging annotation collections. However, I'm not sure what else
you could use it for, given that there are no guidelines for what to
expect.

In fact, I do use as_text() a lot for a real purpose, namely as a  
surrogate unique key. For example, making a collection of dblinks  
unique is quite simple using the as_text() method:

	my %dbhash = map { ($_->as_text(), $_) } $anncoll->remove_Annotations('dblink');
	$anncoll->add_Annotation('dblink',$_) foreach (values %dbhash);

This is a common task when harvesting annotation from various places  
and then integrating it. However, there is nothing in the API  
documentation that suggests that this might be a reliable or even  
expected property such that you could omit the 'dblink' tag above.
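
A runnable sketch of the same idea outside BioPerl follows.  Demo::Link
and unique_by_text() are made-up names, the second database entry is
invented for illustration, and the sketch assumes as_text() returns the
same string for equivalent links, which is exactly the property the API
does not promise:

```perl
use strict;
use warnings;

# Hypothetical stand-in for an annotation object (assumption).
package Demo::Link;

sub new { my ($class, $db, $id) = @_; return bless { db => $db, id => $id }, $class }
sub as_text { my ($self) = @_; return join ':', $self->{db}, $self->{id} }

package main;

# Keep the first object seen for each as_text() key, preserving input order.
sub unique_by_text {
    my %seen;
    return grep { !$seen{ $_->as_text }++ } @_;
}

my @links = (
    Demo::Link->new('TSC',    'TSC0000030'),
    Demo::Link->new('DemoDB', 'ID0001'),        # invented entry
    Demo::Link->new('TSC',    'TSC0000030'),    # duplicate of the first
);

my @unique = unique_by_text(@links);
print scalar(@unique), " unique of ", scalar(@links), "\n";    # 2 unique of 3
print $_->as_text, "\n" for @unique;
```

Unlike the hash-slice version above, the grep form keeps the first
occurrence and the original order, which can matter when the
annotations are written back out.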

I agree that having a conceptual equivalent to $feature->display_name  
and $seq->display_id would be good, but these methods have no claim  
to returning something that's unique in any way.

I guess I've now raised more questions than I answered (in fact I  
didn't answer any). Sorry 'bout that.

	-hilmar

On Aug 26, 2007, at 3:22 PM, Chris Fields wrote:

> I managed to find your comments (as well as ones from Ewan, Jason,  
> and a few others) on the mail list archives, so I'll link to them.   
> The problem will be fixing the several places where overloading is  
> assumed but no longer exists (i.e. in write_* methods), but we can  
> probably pinpoint those by throwing or warning when overloading is  
> assumed.
>
> My thought is to either modify as_text() or add a new
> display_text() method to all AnnotationI that explicitly does what the
> overloading implied (print the annotation in a specified or assumed  
> way).  We could then delegate to that in the stringification  
> overload (with appropriate deprecation warnings) until 1.6, where  
> we remove it completely.  Something like:
>
> my $link1 = Bio::Annotation::DBLink->new(-database => 'TSC',
>                                         -primary_id => 'TSC0000030',
>                                         -tagname => "tag2");
>
> # either
> print $link1->display_text(),"\n";
> # or ...
> print $link1->as_text("display"),"\n";
> # prints "TSC:TSC0000030"
>
> # default human readable
> print $link1->as_text(),"\n";
> # prints "Direct database link to TSC0000030 in database TSC"
>
> print "$link1\n";
> # gets a deprecation warning for now, removed completely for 1.6
>
> chris
>
> On Aug 25, 2007, at 11:58 PM, Hilmar Lapp wrote:
>
>> The reason was to provide for backward compatibility with the  
>> original API in which tag values were scalars, not objects. The  
>> idea was that if someone relied on that and treats the object as a  
>> scalar (comparison, printing, etc), the operator overloading would  
>> take care of that.
>>
>> So by going back to the original API the overloading should become  
>> obsolete, at least theoretically.
>>
>> The overloading can cause some very subtle issues that I pointed  
>> out in an earlier email. It's one of those really "clever" tricks  
>> that just make it very hard for newcomers to understand what's  
>> going on in their code.
>>
>> 	-hilmar
>>
>> On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:
>>
>>> I have finished rolling back most of the specific changes made prior
>>> to the 1.5 release and have relevant tests passing:
>>>
>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>>>
>>> Operator overloading of Bio::Annotation objects will be trickier to
>>> debug as tons of tests fail when the overloading is removed:
>>>
>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>>>
>>> I'll start looking into fixes.  I don't like overloads from a
>>> personal standpoint (problems w/ long-term code maintenance), but  
>>> was
>>> there a more specific reason for removing them?
>>>
>>> chris
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sun Aug 26 18:47:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 26 Aug 2007 17:47:41 -0500
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
	<B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
	<503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>
Message-ID: <E0A389DE-3399-4439-9AC2-76319CCD5B84@uiuc.edu>

Either way I implement, it would be used simply as a generic  
convenience method to replicate output via stringification  
overloading, using a common method name for all AnnotationI; there  
seem to be several instances where this is used for generating output  
(i.e. SeqIO::genbank).  So, for instance, when formatting output you  
could just call as_text('display') or display_text() and you would  
get the most common formatting for that particular annotation type.

chris

On Aug 26, 2007, at 3:57 PM, Hilmar Lapp wrote:

> The thing that I actually never quite understood (and predates the  
> API changes) is why $ann->as_text() needs to include explanatory  
> text such as 'Direct database link to blah in database foo.' I  
> would have said that "TSC:TSC0000030" is human readable enough,  
> unless you present it without any context so that one would have no  
> clue that it is a database cross-reference.
>
> The as_text() method shouldn't be meant for the sole purpose of
> debugging annotation collections. However, I'm not sure what
> else you could use it for, given that there are no guidelines for
> what to expect.
>
> In fact, I do use as_text() a lot for a real purpose, namely as a  
> surrogate unique key. For example, making a collection of dblinks  
> unique is quite simple using the as_text() method:
>
> 	my %dbhash = map { ($_->as_text(), $_) } $anncoll->remove_Annotations('dblink');
> 	$anncoll->add_Annotation('dblink',$_) foreach (values %dbhash);
>
> This is a common task when harvesting annotation from various  
> places and then integrating it. However, there is nothing in the  
> API documentation that suggests that this might be a reliable or  
> even expected property such that you could omit the 'dblink' tag  
> above.
>
> I agree that having a conceptual equivalent to
> $feature->display_name and $seq->display_id would be good, but these methods
> have no claim to returning something that's unique in any way.
>
> I guess I've now raised more questions than I answered (in fact I  
> didn't answer any). Sorry 'bout that.
>
> 	-hilmar
>
> On Aug 26, 2007, at 3:22 PM, Chris Fields wrote:
>
>> I managed to find your comments (as well as ones from Ewan, Jason,  
>> and a few others) on the mail list archives, so I'll link to  
>> them.  The problem will be fixing the several places where  
>> overloading is assumed but no longer exists (i.e. in write_*  
>> methods), but we can probably pinpoint those by throwing or  
>> warning when overloading is assumed.
>>
>> My thought is to either modify as_text() or add a new
>> display_text() method to all AnnotationI that explicitly does what the
>> overloading implied (print the annotation in a specified or  
>> assumed way).  We could then delegate to that in the  
>> stringification overload (with appropriate deprecation warnings)  
>> until 1.6, where we remove it completely.  Something like:
>>
>> my $link1 = Bio::Annotation::DBLink->new(-database => 'TSC',
>>                                         -primary_id => 'TSC0000030',
>>                                         -tagname => "tag2");
>>
>> # either
>> print $link1->display_text(),"\n";
>> # or ...
>> print $link1->as_text("display"),"\n";
>> # prints "TSC:TSC0000030"
>>
>> # default human readable
>> print $link1->as_text(),"\n";
>> # prints "Direct database link to TSC0000030 in database TSC"
>>
>> print "$link1\n";
>> # gets a deprecation warning for now, removed completely for 1.6
>>
>> chris
>>
>> On Aug 25, 2007, at 11:58 PM, Hilmar Lapp wrote:
>>
>>> The reason was to provide for backward compatibility with the  
>>> original API in which tag values were scalars, not objects. The  
>>> idea was that if someone relied on that and treats the object as  
>>> a scalar (comparison, printing, etc), the operator overloading  
>>> would take care of that.
>>>
>>> So by going back to the original API the overloading should  
>>> become obsolete, at least theoretically.
>>>
>>> The overloading can cause some very subtle issues that I pointed  
>>> out in an earlier email. It's one of those really "clever" tricks  
>>> that just make it very hard for newcomers to understand what's  
>>> going on in their code.
>>>
>>> 	-hilmar
>>>
>>> On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:
>>>
>>>> I have finished rolling back most of the specific changes made  
>>>> prior
>>>> to the 1.5 release and have relevant tests passing:
>>>>
>>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>>>>
>>>> Operator overloading of Bio::Annotation objects will be trickier to
>>>> debug as tons of tests fail when the overloading is removed:
>>>>
>>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>>>>
>>>> I'll start looking into fixes.  I don't like overloads from a
>>>> personal standpoint (problems w/ long-term code maintenance),  
>>>> but was
>>>> there a more specific reason for removing them?
>>>>
>>>> chris
>>>
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>>
>>>
>>>
>>>
>>>
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Sun Aug 26 19:01:03 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 26 Aug 2007 19:01:03 -0400
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <E0A389DE-3399-4439-9AC2-76319CCD5B84@uiuc.edu>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
	<B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
	<503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>
	<E0A389DE-3399-4439-9AC2-76319CCD5B84@uiuc.edu>
Message-ID: <35BBCF3B-BA1B-4C8D-8753-2A27AB3B423C@gmx.net>


On Aug 26, 2007, at 6:47 PM, Chris Fields wrote:

> just call as_text('display') or display_text()

The latter is more obvious, and can be better tested for presence and  
implementation, though in the world of perl that's of course not  
strictly true.
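
Testing for presence could look roughly like this.  Both classes are
invented for the example, and display_text() is still only the proposed
name:

```perl
use strict;
use warnings;

# Hypothetical annotation class that implements the proposed method.
package Demo::WithDisplay;
sub new          { return bless {}, shift }
sub display_text { return 'TSC:TSC0000030' }
sub as_text      { return 'Direct database link to TSC0000030 in database TSC' }

# Hypothetical annotation class that only has as_text().
package Demo::WithoutDisplay;
sub new     { return bless {}, shift }
sub as_text { return 'Value: example' }

package main;

# Prefer display_text() when the object provides it, else fall back.
sub text_for_output {
    my ($ann) = @_;
    return $ann->can('display_text') ? $ann->display_text : $ann->as_text;
}

print text_for_output($_->new), "\n"
    for qw(Demo::WithDisplay Demo::WithoutDisplay);
```

The caveat about Perl applies here: can() only sees methods that are
actually defined, so a class that answers display_text() through
AUTOLOAD would be reported as lacking it unless it also overrides can().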

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From zeroliu at 163.com  Mon Aug 27 07:49:53 2007
From: zeroliu at 163.com (zeroliu)
Date: Mon, 27 Aug 2007 19:49:53 +0800 (CST)
Subject: [Bioperl-l] Problems of parse emboss water result by Bio::AlignIO
Message-ID: <534546299.525411188215393753.JavaMail.coremail@bj163app118.163.com>

 Hello,
I'm trying to parse water (EMBOSS 5.0.0) results with Bio::AlignIO
(Bioperl-1.4) and have run into some problems.
1. What does Bio::AlignIO->next_aln() return?
Does it return a Bio::Align::AlignI or a Bio::SimpleAlign object?
Or does it depend on the alignment file format?
2. How can I get the "score" property from a water alignment result?
There is a score method in Bio::SimpleAlign but not in Bio::AlignIO.
In 2004, Jason mentioned:
Scores are set by the Alignment parser - we separate the 'running' from
the 'parsing'.
Bio::AlignIO::emboss had to be updated.
(http://article.gmane.org/gmane.comp.lang.perl.bio.general/7156/match=alignio+water)
How can I find out?
Thank you very much!


From cjfields at uiuc.edu  Mon Aug 27 13:13:13 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 12:13:13 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Annotated status
Message-ID: <6DC5ECA8-3DF1-4B84-914C-4F2B3B44E29A@uiuc.edu>

What is the current status on maintenance of  
Bio::SeqFeature::Annotated?  From what I gather (based on the code  
and past mail list posts) the intent of the module seems to be to  
store any SeqFeature-specific data (tags, score, source, primary_tag,  
etc) in a Bio::AnnotationCollectionI as strongly typed data.  However  
there are several inconsistencies, such as objects being returned  
when a string is expected (score(), source()).

Also, several methods appear half-implemented, aren't consistent with  
SeqFeatureI API or similar methods in other SeqFeatureI's, and there  
are no docs explaining what is expected.
If no one speaks up on it, I'll do my best with maintaining it  
myself, but don't expect the API to stay as it is.

chris


From cjfields at uiuc.edu  Mon Aug 27 18:31:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 17:31:01 -0500
Subject: [Bioperl-l] Bio::Ontology::Term (rollback question)
Message-ID: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>

This is related to the ongoing Feature/Annotation rollback.  I have  
found that a few Ontology-related modules are (either directly or  
indirectly) passing strings instead of Bio::Annotation::DBLinks to  
Bio::Ontology::Term::new(), add_dblink(), or add_dblink_context()  
(the last is where the error occurs).

If needed we could allow strings to be passed but this isn't  
consistent with the API.  Any thoughts on what to do here?

chris


From hlapp at gmx.net  Mon Aug 27 19:07:12 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 27 Aug 2007 19:07:12 -0400
Subject: [Bioperl-l] Bio::Ontology::Term (rollback question)
In-Reply-To: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>
References: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>
Message-ID: <01A56BFB-DE36-4C95-9BD3-DB35A706BD87@gmx.net>

The B::O::TermI interface actually says that get_dblinks() would  
return scalars. That's why the add_dblink methods accept strings. I  
also agree that this is inconsistent with the rest of BioPerl.

Oddly enough, Term::add_dblink_context() does ask for DBLink objects,  
though it doesn't seem to be enforced, even though  
Term::get_dblink_context() is advertised as returning scalars.

So it does seem this is messed up design-wise. It seems to me that to  
really fix this would inevitably break the API, and I don't see how  
you would make this backwards compatible w/o creating a lot of messy  
code, the sole purpose of which would be backwards compatibility.

One could fix just Term::add_dblink_context(), since it's not in the  
interface, but that wouldn't contribute anything to improving  
consistency.

So the alternative to breaking the API in a non-backwards compatible  
fashion would be to add to it, map the existing dblink methods onto  
the added ones, and start deprecating them. For example, you could  
add methods get_dbxrefs() (also on the interface), add_dbxref(),  
etc,   and build in a context argument so we don't need another set  
of methods for that. They would accept and return DBLink objects, and  
the get_dblink() methods could be changed to map those to scalars  
while also getting slated for deprecation.
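
A minimal sketch of that add-and-deprecate approach, under stated
assumptions: My::DBLink and My::Term are hypothetical stand-ins (not the
real Bio::Annotation::DBLink or Bio::Ontology::Term), and the -dbxrefs and
-context argument names and the "database:primary_id" string form are made
up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a DBLink-like object (not the BioPerl class).
package My::DBLink;

sub new {
    my ($class, %args) = @_;
    my $self = { database => $args{-database}, primary_id => $args{-primary_id} };
    return bless $self, $class;
}
sub database   { return $_[0]{database} }
sub primary_id { return $_[0]{primary_id} }

# Hypothetical term class (not Bio::Ontology::Term itself).
package My::Term;
use Carp qw(carp);

sub new {
    my ($class) = @_;
    my $self = { dbxrefs => {} };
    return bless $self, $class;
}

# New-style adder: stores objects, with an optional context argument.
sub add_dbxref {
    my ($self, %args) = @_;
    my $context = defined $args{-context} ? $args{-context} : '_default';
    push @{ $self->{dbxrefs}{$context} }, @{ $args{-dbxrefs} || [] };
    return;
}

# New-style getter: returns the stored objects for one context.
sub get_dbxrefs {
    my ($self, $context) = @_;
    $context = '_default' unless defined $context;
    return @{ $self->{dbxrefs}{$context} || [] };
}

# Old-style getter kept for compatibility: maps objects to scalars and warns.
sub get_dblinks {
    my ($self, $context) = @_;
    carp "get_dblinks() is deprecated; use get_dbxrefs() instead";
    return map { $_->database . ':' . $_->primary_id } $self->get_dbxrefs($context);
}

package main;

my $term = My::Term->new;
$term->add_dbxref(
    -dbxrefs => [ My::DBLink->new(-database => 'GO', -primary_id => '0008150') ],
);
my @objects = $term->get_dbxrefs;    # DBLink-like objects
my @strings = $term->get_dblinks;    # plain scalars, plus a deprecation warning
print "$strings[0]\n";               # prints "GO:0008150"
```

Storing only objects internally keeps a single code path; the old method
becomes a thin view that can be dropped once callers have migrated.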

Does this make sense?

	-hilmar

On Aug 27, 2007, at 6:31 PM, Chris Fields wrote:

> This is related to the ongoing Feature/Annotation rollback.  I have
> found that a few Ontology-related modules are (either directly or
> indirectly) passing strings instead of Bio::Annotation::DBLinks to
> Bio::Ontology::Term::new(), add_dblink(), or add_dblink_context()
> (the last is where the error occurs).
>
> If needed we could allow strings to be passed but this isn't
> consistent with the API.  Any thoughts on what to do here?
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Mon Aug 27 21:12:35 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 20:12:35 -0500
Subject: [Bioperl-l] Bio::Ontology::Term (rollback question)
In-Reply-To: <01A56BFB-DE36-4C95-9BD3-DB35A706BD87@gmx.net>
References: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>
	<01A56BFB-DE36-4C95-9BD3-DB35A706BD87@gmx.net>
Message-ID: <EF121F1E-BAA0-49BD-830F-1F3BC6FAC807@uiuc.edu>


On Aug 27, 2007, at 6:07 PM, Hilmar Lapp wrote:

> The B::O::TermI interface actually says that get_dblinks() would  
> return scalars. That's why the add_dblink methods accept strings. I  
> also agree that this is inconsistent with the rest of BioPerl.
>
> Oddly enough, Term::add_dblink_context() does ask for DBLink  
> objects, though it doesn't seem to be enforced, even though  
> Term::get_dblink_context() is advertised as returning scalars.

This happened b/c of stringification and 'eq' overloading.  Just  
removing the overloads didn't reveal this problem; I had to add  
exceptions to them to fish this out.
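
For anyone who hasn't followed the rollback, the trick can be sketched
roughly as below. My::Annotation is a hypothetical class and the messages
are invented; the actual Bio::Annotation code differs. The point is only
that an overload which throws exposes callers that silently relied on it.

```perl
use strict;
use warnings;

# Hypothetical annotation class, for illustration only.
package My::Annotation;

# Each overloaded operator throws instead of quietly behaving like a string.
use overload
    '""'   => sub { die ref($_[0]) . ": stringification is no longer supported\n" },
    'eq'   => sub { die ref($_[0]) . ": 'eq' comparison is no longer supported\n" },
    'bool' => sub { 1 };    # keep plain truth tests such as "if ($obj)" working

sub new {
    my ($class, $value) = @_;
    my $self = { value => $value };
    return bless $self, $class;
}

# Explicit accessor that callers should use instead of the overloads.
sub value { return $_[0]{value} }

package main;

my $ann = My::Annotation->new('GO:0008150');

# Code that still relies on the old overloads now fails loudly.
eval { my $same = ($ann eq 'GO:0008150') };
my $err_eq = $@;
eval { my $text = "$ann" };
my $err_str = $@;

print "eq failed: $err_eq";
print "stringify failed: $err_str";
print "accessor still works: ", $ann->value, "\n";
```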

> So it does seem this is messed up design-wise. It seems to me that  
> to really fix this would inevitably break the API, and I don't see  
> how you would make this backwards compatible w/o creating a lot of  
> messy code, the sole purpose of which would be backwards  
> compatibility.
>
> One could only fix Term::add_dblink_context() as it's not in the  
> interface but that wouldn't contribute anything to improving  
> consistency.

Agreed; in fact it may make it more confusing.

> So the alternative to breaking the API in a non-backwards  
> compatible fashion would be to add to it, map the existing dblink  
> methods onto the added ones, and start deprecating them. For  
> example, you could add methods get_dbxrefs() (also on the  
> interface), add_dbxref(), etc,   and build in a context argument so  
> we don't need another set of methods for that. They would accept  
> and return DBLink objects, and the get_dblink() methods could be  
> changed to map those to scalars while also getting slated for  
> deprecation.
>
> Does this make sense?
>
> 	-hilmar

I think so; I'll have to look over the code to see how we would  
implement this, though I'm guessing everything would be stored as  
DBLink objects by default.  Any changes will probably need to wait  
until after I fish out any remaining spots in the code where  
overloading is being used, but at least we have a direction on where  
to go.

chris


From cjfields at uiuc.edu  Tue Aug 28 00:18:19 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 23:18:19 -0500
Subject: [Bioperl-l] Feature/Annotation rollback (update #2)
Message-ID: <A91DD20B-841B-480A-A953-E811AD634AF0@uiuc.edu>

Okay, the planned rollback is pretty much complete with a few  
exceptions.  I'll probably merge back to bioperl-live within the next  
few days once the following issues are addressed:

1)  Bio::Ontology::Term - several classes are using  
Bio::Ontology::Term in ways inconsistent with one another; some are  
passing Bio::Annotation::DBLink instances and others are passing  
simple strings.  This was somewhat transparent with various operator  
overloads but now they have really come to the surface.  I'll  
probably work on Hilmar's suggestion on adding extra class methods to  
give it a more consistent interface and deprecate the older ones.  As  
one might guess this affects much of Bio::Ontology but also  
Bio::SeqFeature::Annotated; strangely enough FeatureIO tests pass  
(which may simply mean there isn't enough test coverage for FeatureIO).

2)  Bio::SeqFeature::Annotated - no word back on maintenance for this  
module.  It needs to implement Bio::SeqFeature::TypedSeqFeatureI  
(pretty easy) and needs documentation (not so easy).  It's apparently  
essential for FeatureIO.  I'll basically get it up-and-running and  
clean up the API.

There are a few odds and ends that need to be addressed with  
roundtripping, but these are already problems on the MAIN trunk so  
they will be addressed once code is merged back in.

chris


From Frigerio at pierroton.inra.fr  Tue Aug 28 03:12:22 2007
From: Frigerio at pierroton.inra.fr (Jean-Marc FRIGERIO)
Date: Tue, 28 Aug 2007 09:12:22 +0200
Subject: [Bioperl-l] Bio::SeqIO::phd_comment objet
Message-ID: <200708280912.22798.Frigerio@pierroton.inra.fr>

Hi,

The Bio::SeqIO::phd module says, speaking about the COMMENT section of a phd 
file:
 # this should be an actual object to assist in serialization
 # but I don't have time for this now.

The doc says ( http://www.bioperl.org/wiki/Core_1.5.1_1.5.2_delta)

   This really needs a "phred_comments" object of some sort so that it will be 
serializable. Then when java clients get this object they will be able to 
deserialize it. 

I volunteer to do this,  but need your opinion.

Do we really need an object (Bio::phd_comment? Bio::SeqIO::phd_comment?
Bio::phd_header? Something else?)

Or would adding a few Bio::Seq::SeqWithQuality subs in the Bio::SeqIO::phd
module suffice? What constraints do the Java clients place on
serialization/deserialization?
I was thinking of just adding getters/setters for all the comments:
chromat_file(), abi_thumbprint(), etc.

touch() for the timestamp
attribute() for new unknown comments
write_comment().

others ?
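
To make the proposal concrete, here is a rough sketch of what such an
object could look like. Everything in it is an assumption for discussion:
the class name My::PhdComment, the upper-case keys, the TIME format and the
BEGIN_COMMENT/END_COMMENT layout written by write_comment() would all have
to be checked against real phd files.

```perl
use strict;
use warnings;

# Hypothetical container for the COMMENT block of a phd file.
package My::PhdComment;

sub new {
    my ($class, %fields) = @_;
    my $self = { fields => { %fields } };
    return bless $self, $class;
}

# Generic get/setter, also usable for comment keys nobody anticipated.
sub attribute {
    my ($self, $key, $value) = @_;
    $self->{fields}{$key} = $value if @_ > 2;
    return $self->{fields}{$key};
}

# Named accessors for two of the known comments, built on attribute().
sub chromat_file   { my $self = shift; return $self->attribute('CHROMAT_FILE',   @_) }
sub abi_thumbprint { my $self = shift; return $self->attribute('ABI_THUMBPRINT', @_) }

# Refresh the timestamp (the format is whatever scalar(localtime) yields).
sub touch {
    my ($self) = @_;
    return $self->attribute('TIME', scalar(localtime));
}

# Serialize all comments as "KEY: value" lines, sorted by key.
sub write_comment {
    my ($self) = @_;
    my $fields = $self->{fields};
    my @lines  = map { "$_: $fields->{$_}" } sort keys %$fields;
    return join("\n", 'BEGIN_COMMENT', '', @lines, '', 'END_COMMENT') . "\n";
}

package main;

my $comment = My::PhdComment->new(CHROMAT_FILE => 'sample.ab1');
$comment->abi_thumbprint(0);
$comment->attribute('MY_NEW_KEY', 'some value');
$comment->touch;
print $comment->write_comment;
```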

		-- jmf

-- 
Jean-Marc Frigerio,
UMR BIOGECO   69, route d'Arcachon, 33612 CESTAS France
Tel : +33(0) 557 122 829   Fax : +33(0) 557 122 881
Frigerio at pierroton.inra.fr   http://www.pierroton.inra.fr/biogeco/index.html


From jay at jays.net  Tue Aug 28 07:14:37 2007
From: jay at jays.net (Jay Hannah)
Date: Tue, 28 Aug 2007 06:14:37 -0500
Subject: [Bioperl-l] Problems of parse emboss water result by
	Bio::AlignIO
In-Reply-To: <534546299.525411188215393753.JavaMail.coremail@bj163app118.163.com>
References: <534546299.525411188215393753.JavaMail.coremail@bj163app118.163.com>
Message-ID: <4CD8B5C2-3C87-495C-894E-17C3C67091DA@jays.net>

On Aug 27, 2007, at 6:49 AM, zeroliu wrote:
> I'm trying to parse water (EMBOSS 5.0.0) result by Bio::AlignIO
> (Bioperl-1.4) and encountered some problems.
> 1. What does the Bio::AlignIO->next_aln() return?
> Does it return a Bio::Align::AlignI or Bio::SimpleAlign object?
> Or it depends on the alignment file format?

http://doc.bioperl.org/bioperl-live/Bio/AlignIO.html
  Title   : next_aln
  Usage   : $aln = stream->next_aln
  Function: reads the next $aln object from the stream
  Returns : a Bio::Align::AlignI compliant object

> 2. How can I get the "score" property in a water alignment result?
> There is a score method in Bio::SimpleAlign but not in Bio::AlignIO.
> In 2004, Jason mentioned:
> Scores are set by the Alignment parser - we separate the 'running'  
> from
> the 'parsing'.
> Bio::AlignIO::emboss had to be updated.
> (http://article.gmane.org/gmane.comp.lang.perl.bio.general/7156/ 
> match=alignio+water)
> How could I know it?

Line 480 of t/AlignIO.t seems to walk you through? Here's the block,  
with the test overhead removed.

# EMBOSS water
use strict;
use warnings;
use Bio::AlignIO;

my $str = Bio::AlignIO->new('-format' => 'emboss',
                            '-file'   => 'cysprot.water');
my $aln = $str->next_aln();
# $aln is now a Bio::Align::AlignI-compliant object
print $aln->score, "\n";    # '501.50'

HTH,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From cjfields at uiuc.edu  Tue Aug 28 17:05:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 28 Aug 2007 16:05:10 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
Message-ID: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>

I'm now wrapping up the Feature/Annotation rollback.  I will probably  
start merging back to the main branch in the next day or two, as  
soon as interested parties (*cough*devs*cough*) look over the last  
batch of changes.

http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round

I have also added a small benchmark test which indicates a decrease  
in parsing time in SeqIO::genbank with all tests passing.  I expect  
this will translate over to any Bio::SeqFeature::Generic-using class  
(open mouth, prepare to insert foot....).

It is also possible there are still some instances where overloading  
is expected lurking about in the ~1000 or so modules, so I'll leave  
the exceptions I added to all Bio::AnnotationI; we can remove them  
down the line, maybe prior to rel1.6, after more tests are added or  
if they get particularly annoying.  My guess is I caught 99.99% of  
them (prepare to insert other foot....).

The key change in this last round is the addition of several class  
*dbxref* methods to Bio::Ontology::Term and  
Bio::Annotation::OntologyTerm, all of which are capable of working  
with either DBLink instances or simple scalars.  This was primarily  
done in order to clear up inconsistencies in the older *dblink*  
methods, which were ambiguous (some indicate simple scalar  
arguments, others DBLink objects); operator overloading was used  
extensively in these cases, which led to several issues.  I have  
added deprecation warnings to the older methods which now map to  
using the newer methods.  All tests pass with the exception of a few  
already failing on the MAIN branch; the single test which needs to be  
fixed is a round-tripping error in swiss.t (now a TODO), which can be  
fixed after merging back.

Please respond to this if there are any questions or if I need to  
clarify the changes I made a bit more.

chris


From hlapp at duke.edu  Tue Aug 28 18:13:32 2007
From: hlapp at duke.edu (Hilmar Lapp)
Date: Tue, 28 Aug 2007 18:13:32 -0400
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
References: <20070828070219.DE03668527@evol.biology.mcmaster.ca>
Message-ID: <1F006707-291C-4895-A178-33FDFBDE6AE6@duke.edu>

Is anyone thinking about adding support for this as an aligner  
option? I'm not sure whether aside from a Bio::Tools::Run module we'd  
also need a format parser - it sounds like it's emitting clustalw  
format?

	-hilmar

Begin forwarded message:

> From: evoldir at evol.biology.mcmaster.ca
> Date: August 28, 2007 3:02:19 AM EDT
> To: hlapp at duke.edu
> Subject: Other:  Announcing Ngila 1.2.1 Alignment Program
> Reply-To: racartwr at ncsu.edu
>
>
> Ngila is a global, pairwise alignment program that uses logarithmic  
> and
> affine gap costs, i.e. C(g) = a+b*g+c*ln(g).  These gap costs are more
> biologically realistic than the more popular (and efficient) affine  
> gap
> cost model.
>
> I have recently completed updating the program to version 1.2.1.  The
> new version includes two new, evolutionary alignment models based  
> on my
> current research.  These models allow you to find the maximum  
> alignment
> of two sequences based on biological, evolutionary parameters---no  
> more
> guessing at biological costs.  Additional changes are noted on the  
> website.
>
> Website & Manual:
>
> http://scit.us/projects/ngila/
>
> Windows Binary:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release-win32.zip
>
> Unix/Mac Source Code:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release.tar.gz
>
> I'll be happy to answer any questions users have about the new  
> models or
> the program.
>
> -- 
> *********************************************************
> Reed A. Cartwright, PhD     http://scit.us/
> Postdoctoral Researcher     http://www.dererumnatura.us/
> Department of Genetics      http://www.pandasthumb.org/
>
> Bioinformatics Research Center
> North Carolina State University
> Campus Box 7566
> Raleigh, NC 27695-7566
>
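
For reference, the gap cost quoted in the announcement above,
C(g) = a+b*g+c*ln(g), can be written as a small function. This is only a
sketch of the formula as stated; the argument names and the parameter
values in the example are made up, and nothing here reflects how Ngila
itself is implemented.

```perl
use strict;
use warnings;

# Gap cost for a gap of length $g >= 1: C(g) = a + b*g + c*ln(g).
# $const, $linear and $logcoef play the roles of a, b and c.
sub gap_cost {
    my ($g, $const, $linear, $logcoef) = @_;
    die "gap length must be >= 1\n" if $g < 1;
    return $const + $linear * $g + $logcoef * log($g);    # log() is the natural log
}

# With c = 0 the model reduces to the plain affine cost a + b*g.
for my $g (1, 10, 100) {
    printf "g=%3d  affine=%.2f  log-affine=%.2f\n",
        $g, gap_cost($g, 2, 0.5, 0), gap_cost($g, 2, 0.5, 1);
}
```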

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:- hlapp at duke dot edu :
===========================================================


From hlapp at duke.edu  Tue Aug 28 18:13:32 2007
From: hlapp at duke.edu (Hilmar Lapp)
Date: Tue, 28 Aug 2007 18:13:32 -0400
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
Message-ID: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>

Is anyone thinking about adding support for this as an aligner  
option? I'm not sure whether aside from a Bio::Tools::Run module we'd  
also need a format parser - it sounds like it's emitting clustalw  
format?

	-hilmar

Begin forwarded message:

> From: evoldir at evol.biology.mcmaster.ca
> Date: August 28, 2007 3:02:19 AM EDT
> Subject: Other:  Announcing Ngila 1.2.1 Alignment Program
> Reply-To: racartwr at ncsu.edu
>
>
> Ngila is a global, pairwise alignment program that uses logarithmic  
> and
> affine gap costs, i.e. C(g) = a+b*g+c*ln(g).  These gap costs are more
> biologically realistic than the more popular (and efficient) affine  
> gap
> cost model.
>
> I have recently completed updating the program to version 1.2.1.  The
> new version includes two new, evolutionary alignment models based  
> on my
> current research.  These models allow you to find the maximum  
> alignment
> of two sequences based on biological, evolutionary parameters---no  
> more
> guessing at biological costs.  Additional changes are noted on the  
> website.
>
> Website & Manual:
>
> http://scit.us/projects/ngila/
>
> Windows Binary:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release-win32.zip
>
> Unix/Mac Source Code:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release.tar.gz
>
> I'll be happy to answer any questions users have about the new  
> models or
> the program.
>
> -- 
> *********************************************************
> Reed A. Cartwright, PhD     http://scit.us/
> Postdoctoral Researcher     http://www.dererumnatura.us/
> Department of Genetics      http://www.pandasthumb.org/
>
> Bioinformatics Research Center
> North Carolina State University
> Campus Box 7566
> Raleigh, NC 27695-7566
>


From hlapp at gmx.net  Tue Aug 28 19:09:46 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 28 Aug 2007 19:09:46 -0400
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
In-Reply-To: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
References: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
Message-ID: <EF683AC3-F30C-49BC-9F16-7BA10C70F751@gmx.net>

Sorry for the double post, BTW. I had erroneously assumed that the  
first email would be held for post by non-member. -hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Aug 29 00:01:13 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 28 Aug 2007 23:01:13 -0500
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
In-Reply-To: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
References: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
Message-ID: <EDED724C-3219-45FF-BAF2-592EEEBCB634@uiuc.edu>

It probably wouldn't be hard to write one up, particularly if it  
already emits a parsable format.  We could probably base it off the  
current clustalw wrapper unless someone else thinks there is a better  
way.

chris

On Aug 28, 2007, at 5:13 PM, Hilmar Lapp wrote:

> Is anyone thinking about adding support for this as an aligner
> option? I'm not sure whether aside from a Bio::Tools::Run module we'd
> also need a format parser - it sounds like it's emitting clustalw
> format?
>
> 	-hilmar
>
> Begin forwarded message:
>
>> From: evoldir at evol.biology.mcmaster.ca
>> Date: August 28, 2007 3:02:19 AM EDT
>> Subject: Other:  Announcing Ngila 1.2.1 Alignment Program
>> Reply-To: racartwr at ncsu.edu
>>
>>
>> Ngila is a global, pairwise alignment program that uses logarithmic
>> and
>> affine gap costs, i.e. C(g) = a+b*g+c*ln(g).  These gap costs are  
>> more
>> biologically realistic than the more popular (and efficient) affine
>> gap
>> cost model.
>>
>> I have recently completed updating the program to version 1.2.1.  The
>> new version includes two new, evolutionary alignment models based
>> on my
>> current research.  These models allow you to find the maximum
>> alignment
>> of two sequences based on biological, evolutionary parameters---no
>> more
>> guessing at biological costs.  Additional changes are noted on the
>> website.
>>
>> Website & Manual:
>>
>> http://scit.us/projects/ngila/
>>
>> Windows Binary:
>>
>> http://scit.us/projects/files/ngila/Releases/ngila-release-win32.zip
>>
>> Unix/Mac Source Code:
>>
>> http://scit.us/projects/files/ngila/Releases/ngila-release.tar.gz
>>
>> I'll be happy to answer any questions users have about the new
>> models or
>> the program.
>>
>> -- 
>> *********************************************************
>> Reed A. Cartwright, PhD     http://scit.us/
>> Postdoctoral Researcher     http://www.dererumnatura.us/
>> Department of Genetics      http://www.pandasthumb.org/
>>
>> Bioinformatics Research Center
>> North Carolina State University
>> Campus Box 7566
>> Raleigh, NC 27695-7566
>>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Wed Aug 29 12:03:07 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 11:03:07 -0500
Subject: [Bioperl-l] remote SwissProt server problems
Message-ID: <6805F552-9947-4C28-B846-47B5501B31DF@uiuc.edu>

Just as a notice, DBFetch is currently retrieving only single records  
for the UniProtKB database (where Bio::DB::SwissProt fetches  
sequences).  If you run the remote server tests and DB.t in the test  
suite, you'll see a failure towards the end which indicates this.  
I've posted a notice to the server help desk and will follow up when I  
hear more.

chris


From cain.cshl at gmail.com  Wed Aug 29 15:45:48 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Wed, 29 Aug 2007 15:45:48 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
Message-ID: <1188416748.2567.36.camel@localhost.localdomain>

Hi Chris,

I just wanted to let you know that I was out of town for a few days, but
now I'm back and I'm doing testing of GMOD software based on the branch
you are working on.  I'll let you know how it goes, but don't let me
stop you if you are confident in your changes.  I'm sure whatever goes
wrong, it will just point out holes in the FeatureIO tests (I'm sure
there are plenty) and will require hopefully minimal changes on my end.

Thanks for your considerable efforts on this!  (Regardless of how much
work it makes for me :-)
Scott


On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
> I'm now wrapping up the Feature/Annotation rollback.  I will probably  
> start merging back to the main branch in the next day or two, as  
> soon as interested parties (*cough*devs*cough*) look over the last  
> batch of changes.
> 
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
> 
> I have also added a small benchmark test which indicates a decrease  
> in parsing time in SeqIO::genbank with all tests passing.  I expect  
> this will translate over to any Bio::SeqFeature::Generic-using class  
> (open mouth, prepare to insert foot....).
> 
> It is also possible there are still some instances where overloading  
> is expected lurking about in the ~1000 or so modules, so I'll leave  
> the exceptions I added to all Bio::AnnotationI; we can remove them  
> down the line, maybe prior to rel1.6, after more tests are added or  
> if they get particularly annoying.  My guess is I caught 99.99% of  
> them (prepare to insert other foot....).
> 
> The key change in this last round is the addition of several class  
> *dbxref* methods to Bio::Ontology::Term and  
> Bio::Annotation::OntologyTerm, all of which are capable of working  
> with either DBLink instances or simple scalars.  This was primarily  
> done in order to clear up inconsistencies in the older *dblink*  
> methods, which were ambiguous (some indicate simple scalar  
> arguments, others DBLink objects); operator overloading was used  
> extensively in these cases, which led to several issues.  I have  
> added deprecation warnings to the older methods which now map to  
> using the newer methods.  All tests pass with the exception of a few  
> already failing on the MAIN branch; the single test which needs to be  
> fixed is a round-tripping error in swiss.t (now a TODO), which can be  
> fixed after merging back.
> 
> Please respond to this if there are any questions or if I need to  
> clarify the changes I made a bit more.
> 
> chris
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From cjfields at uiuc.edu  Wed Aug 29 16:13:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 15:13:17 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188416748.2567.36.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
Message-ID: <8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>

I'll probably go ahead and start merging this stuff over to CVS HEAD  
then.  There haven't been any objections so far.

The page I posted outlines the more critical fixes, primarily the  
changes to Bio::Ontology::Term methods (along with relevant code) due  
to inconsistencies in the interface.  The Bio::Annotation classes  
also now throw if you attempt to use them in an overloaded context.   
I also split off SeqFeature::Annotated tests into its own test suite  
(SeqFeatAnnotated.t).

Let me know if there are any problems along the way!

chris

On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:

> Hi Chris,
>
> I just wanted to let you know that I was out of town for a few  
> days, but
> now I'm back and I'm doing testing of GMOD software based on the  
> branch
> you are working on.  I'll let you know how it goes, but don't let me
> stop you if you are confident in your changes.  I'm sure whatever goes
> wrong, it will just point out holes in the FeatureIO tests (I'm sure
> there are plenty) and will require hopefully minimal changes on my  
> end.
>
> Thanks for your considerable efforts on this!  (Regardless of how much
> work it makes for me :-)
> Scott
>
>
> On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
>> I'm now wrapping up the Feature/Annotation rollback.  I will probably
>> start merging back to the main branch in the next day or two, as
>> soon as interested parties (*cough*devs*cough*) look over the last
>> batch of changes.
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>>
>> I have also added a small benchmark test which indicates a decrease
>> in parsing time in SeqIO::genbank with all tests passing.  I expect
>> this will translate over to any Bio::SeqFeature::Generic-using class
>> (open mouth, prepare to insert foot....).
>>
>> It is also possible there are still some instances where overloading
>> is expected lurking about in the ~1000 or so modules, so I'll leave
>> the exceptions I added to all Bio::AnnotationI; we can remove them
>> down the line, maybe prior to rel1.6, after more tests are added or
>> if they get particularly annoying.  My guess is I caught 99.99% of
>> them (prepare to insert other foot....).
>>
>> The key change in this last round is the addition of several class
>> *dbxref* methods to Bio::Ontology::Term and
>> Bio::Annotation::OntologyTerm, all of which are capable of working
>> with either DBLink instances or simple scalars.  This was primarily
>> done in order to clear up inconsistencies in the older *dblink*
>> methods, which were ambiguous (some indicate simple scalar
>> arguments, others DBLink objects); operator overloading was used
>> extensively in these cases, which led to several issues.  I have
>> added deprecation warnings to the older methods which now map to
>> using the newer methods.  All tests pass with the exception of a few
>> already failing on the MAIN branch; the single test which needs to be
>> fixed is a round-tripping error in swiss.t (now a TODO), which can be
>> fixed after merging back.
>>
>> Please respond to this if there are any questions or if I need to
>> clarify the changes I made a bit more.
>>
>> chris
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                         cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jay at jays.net  Wed Aug 29 18:11:55 2007
From: jay at jays.net (Jay Hannah)
Date: Wed, 29 Aug 2007 17:11:55 -0500
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
Message-ID: <46D5EF2B.5000101@jays.net>

Please slap me if I'm hysterical.

I'm seeking a broad bioinformatics search engine platform. I want to 
take gobs of data in gobs of formats and allow people to search it on 
the web.

- Entrez is awesome. Unfortunately I don't see anything in the NCBI 
toolkit that helps me run my own version of it. Even a tiny one. After 
an initial "check out our toolkit" response from NCBI I don't seem to be 
getting anywhere. Maybe I'm not communicating enough or well enough.

- EB-eye Search is slick. I don't see any developer kit or source code 
of any kind and I've gotten no response to my emails to them.

- LuceGene is very cool. But it looks like no one has touched it in 2.5 
years and I've gotten no response from their contact email address. I'm 
especially intrigued by their

  src/LuceGene/src/org/eugenes/index/LuceneReadseqIndexer.java

which seems to use the rather popular(?) Java Readseq to populate Lucene 
with source data in all sorts of different formats.

I don't know Java.

- Solr is really neat. It's easy to install and gives a simple/powerful 
XML API to populate a Lucene index.

... so ...

I'm thinking BioPerl knows how to parse lots of formats into a Bio::Seq.

I'm thinking I could write Perl which would take a Bio::Seq object and 
convert it to an XML file which Solr would happily inject into Lucene 
for me.

If I could do that I'm thinking that any of the many formats that 
Bio::SeqIO can slurp could magically be sent into a Lucene index for 
searching.
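
Roughly what I have in mind, as a sketch only: build Solr's <add><doc>
update XML from a hash of fields. The field names (id, description,
sequence) and the sample record are assumptions and would have to match
whatever the Solr schema defines. To keep it runnable without BioPerl the
hash is filled by hand here; with a Bio::Seq object the values would come
from $seq->display_id, $seq->desc and $seq->seq.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build one Solr <add> message from a list of documents, each a hash
# reference of field name => value.
sub solr_add_xml {
    my (@docs) = @_;
    my $xml = "<add>\n";
    for my $doc (@docs) {
        $xml .= "  <doc>\n";
        for my $name (sort keys %$doc) {
            $xml .= sprintf qq{    <field name="%s">%s</field>\n},
                xml_escape($name), xml_escape($doc->{$name});
        }
        $xml .= "  </doc>\n";
    }
    return $xml . "</add>\n";
}

# Made-up record standing in for one parsed sequence.
print solr_add_xml({
    id          => 'TEST1',
    description => 'hypothetical protein <fragment>',
    sequence    => 'MKVLAAGIVGL',
});
```

The resulting text would then be POSTed to the Solr update handler, which
is the step I have left out here.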

I'm thinking that would be really cool and I'm going to write it.

Now's your chance to slap me.

Since I haven't started yet, what would I call this thing? 
Bio::SeqIO::Solr?  (and I wouldn't implement the I part?)

Thanks,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


More notes:
http://clab.ist.unomaha.edu/CLAB/index.php/RT11


From hlapp at gmx.net  Wed Aug 29 21:37:59 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 29 Aug 2007 21:37:59 -0400
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
In-Reply-To: <46D5EF2B.5000101@jays.net>
References: <46D5EF2B.5000101@jays.net>
Message-ID: <D202078D-8F88-4FAA-94EA-8C08CE653C41@gmx.net>


On Aug 29, 2007, at 6:11 PM, Jay Hannah wrote:

> [...]
>
> I'm thinking I could write Perl which would take a Bio::Seq object and
> convert it to an XML file which Solr would happily inject into Lucene
> for me.
>
> If I could do that I'm thinking that any of the many formats that
> Bio::SeqIO can slurp could magically be sent into a Lucene index for
> searching.
>
> [...]
> Since I haven't started yet, what would I call this thing?
> Bio::SeqIO::Solr?  (and I wouldn't implement the I part?)

Would this be a Solr-specific XML writer? Or could you use an  
existing XML format for sequences?

(as an aside, if you do need a Solr-specific format writer, my  
suggestion would be to name it solrxml [lowercase])

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Aug 29 22:01:45 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 21:01:45 -0500
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
In-Reply-To: <46D5EF2B.5000101@jays.net>
References: <46D5EF2B.5000101@jays.net>
Message-ID: <0FF63232-25DE-4676-8C06-B9B00BE28349@uiuc.edu>


On Aug 29, 2007, at 5:11 PM, Jay Hannah wrote:

> Please slap me if I'm hysterical.
>
> I'm seeking a broad bioinformatics search engine platform. I want to
> take gobs of data in gobs of formats and allow people to search it on
> the web.
>
> - Entrez is awesome. Unfortunately I don't see anything in the NCBI
> toolkit that helps me run my own version of it. Even a tiny one. After
> an initial "check out our toolkit" response from NCBI I don't seem to be
> getting anywhere. Maybe I'm not communicating enough or well enough.

No.  I have had non-responses before from NCBI; they may just be too  
busy.  Warnock probably applies.

> - EB-eye Search is slick. I don't see any developer kit or source code
> of any kind and I've gotten no response to my emails to them.

Not sure of this one personally.

> - LuceGene is very cool.
> ...
> I don't know Java.

...but you could write a (perl) wrapper around it.  You can try  
contacting Don Gilbert about it, though I think he's been trying out  
Chado.

> - Solr is really neat. It's easy to install and gives a simple/powerful
> XML API to populate a Lucene index.
> ... so ...
>
> I'm thinking BioPerl knows how to parse lots of formats into a Bio::Seq.
>
> ...
>
> I'm thinking that would be really cool and I'm going to write it.
>
> Now's your chance to slap me.

No need.

> Since I haven't started yet, what would I call this thing?
> Bio::SeqIO::Solr?  (and I wouldn't implement the I part?)
>
> Thanks,
>
> Jay Hannah
> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah
>
> More notes:
> http://clab.ist.unomaha.edu/CLAB/index.php/RT11

The way I would go about it is use an established XML schema as a  
starting point and implement a writer (if bioperl doesn't already  
support it).  It's better than reinventing (a constantly reinvented)  
wheel and starting up a brand-new schema of your own.  INSDSeq  
(http://www.insdc.org/page.php?page=xmlstatus) is one I've been  
wanting to add for a while but haven't had time to work on; there are  
several other examples.  Note that a few of the currently supported  
ones in bioperl, such as bsml and game, have had very little to no  
development over the years in favor of newer (better?) XML flavors,  
so it likely isn't worth working with those.

chris


From hlapp at gmx.net  Wed Aug 29 22:02:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 29 Aug 2007 22:02:45 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
Message-ID: <E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>


On Aug 28, 2007, at 5:05 PM, Chris Fields wrote:

> I'm now wrapping up the Feature/Annotation rollback.  I will probably
> start merging back to the main branch in the next day or two, as
> soon as interested parties (*cough*devs*cough*) look over the last
> batch of changes.
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>
> [...]
> It is also possible there are still some instances where overloading
> is expected lurking about in the ~1000 or so modules, so I'll leave
> the exceptions I added to all Bio::AnnotationI

Keep in mind that code such as

	if ($ann) { ... }

is mostly not b/c someone wanted to use overloading, but rather  
someone was lazy and really meant to say

	if (defined($ann)) { ... }

In the absence of eq overloading, these will behave identically. So  
if you leave the exceptions in it is sort-of policing lazy  
programmers, which I guess is fine in principle, but is guaranteed to  
trip up a lot of script code. I'd take it out if you're reasonably  
sure that at least within BioPerl itself those lazy programming  
incidents are removed.
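
To make the failure mode concrete, here is a minimal sketch. My::Ann is a hypothetical stand-in for an annotation class whose stringification overload now throws; it is not the actual BioPerl code. Because no separate bool overload is defined, Perl falls back to the '""' handler for a plain truth test, so 'if ($ann)' hits the exception while defined($ann) never touches the overload.

```perl
use strict;
use warnings;

package My::Ann;   # hypothetical stand-in for a Bio::AnnotationI class

# Stringification overload that throws, mimicking the added exceptions.
use overload '""' => sub { die "stringification overload is no longer supported\n" };

sub new { return bless {}, shift }

package main;

my $ann = My::Ann->new;

# defined() only checks for undef and does not invoke any overload.
my $defined_ok = defined($ann) ? 1 : 0;

# A truth test has no 'bool' handler to call, so Perl autogenerates it
# from the '""' handler, which dies.
my $bool_ok = eval { $ann ? 1 : 0 };
my $error   = $@;

print "defined(\$ann) worked: $defined_ok\n";
print "if (\$ann) died: $error" unless defined $bool_ok;
```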

> [...]
> The key change in this last round is the addition of several class
> *dbxref* methods to Bio::Ontology::Term and
> Bio::Annotation::OntologyTerm, all of which are capable of working
> with either DBLink instances or simple scalars.

I don't think you need the code here to deal with both scalars and
objects. I think it is fine to define the new methods from the outset
to consistently accept and return DBLink objects, period.

The backwards compatibility logic should rather be in the *_dblink*()  
methods; i.e., instead of simple aliases they should have the code to  
map to and from the new API. That way, once the deprecation cycle  
ends, they can be removed, and with them all the legacy code that now  
is no longer needed, whereas if you have that in the new methods, it  
keeps bothering the maintainers.
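
A sketch of that split, with hypothetical My::DBLink and My::Term classes standing in for the real ones (the method bodies are illustrative assumptions, not the code on the branch). The new methods deal only in objects; the deprecated ones carry all of the scalar-to-object mapping, so deleting them later removes the legacy logic in one go.

```perl
use strict;
use warnings;

package My::DBLink;   # hypothetical stand-in for Bio::Annotation::DBLink

sub new {
    my ($class, %args) = @_;
    return bless { id => $args{-primary_id} }, $class;
}
sub primary_id { return $_[0]{id} }

package My::Term;     # hypothetical stand-in for Bio::Ontology::Term

use Scalar::Util qw(blessed);
use Carp qw(carp croak);

sub new { return bless { dbxrefs => [] }, shift }

# New API: objects in, objects out, no scalar handling at all.
sub add_dbxref {
    my ($self, @links) = @_;
    for my $link (@links) {
        croak "add_dbxref() expects My::DBLink objects"
            unless blessed($link) && $link->isa('My::DBLink');
        push @{ $self->{dbxrefs} }, $link;
    }
    return;
}
sub get_dbxrefs { return @{ $_[0]{dbxrefs} } }

# Legacy API: the backwards-compatibility mapping lives only here.
sub add_dblink {
    my ($self, @items) = @_;
    carp "add_dblink() is deprecated; use add_dbxref()";
    my @links = map { blessed($_) ? $_ : My::DBLink->new(-primary_id => $_) } @items;
    $self->add_dbxref(@links);
    return;
}

package main;

my $term = My::Term->new;
$term->add_dbxref(My::DBLink->new(-primary_id => 'GO:0005634'));
{
    local $SIG{__WARN__} = sub { };   # silence the deprecation warning for the demo
    $term->add_dblink('EC:1.1.1.1');  # old-style scalar still works
}
my @ids = map { $_->primary_id } $term->get_dbxrefs;
print join(', ', @ids), "\n";
```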

You also mention an add_dbxref_context() on the wiki page - I'm not
sure why that would be needed given that you build in the -context
option to add_dbxref() from the outset. But maybe I've glossed over
some detail.

Once this is merged back to the main trunk, I guess we need to give  
Bio::SeqFeature::TypedSeqFeatureI a thorough look and make sure it  
makes real sense.

Thanks Chris for this effort, this clears a monumental roadblock.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Aug 29 23:23:14 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 22:23:14 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
Message-ID: <A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>


On Aug 29, 2007, at 9:02 PM, Hilmar Lapp wrote:

>
> On Aug 28, 2007, at 5:05 PM, Chris Fields wrote:
>
>> I'm now wrapping up the Feature/Annotation rollback.  I will probably
>> start merging back to the main branch in the next day or two, as
>> soon as interested parties (*cough*devs*cough*) look over the last
>> batch of changes.
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>>
>> [...]
>> It is also possible there are still some instances where overloading
>> is expected lurking about in the ~1000 or so modules, so I'll leave
>> the exceptions I added to all Bio::AnnotationI
>
> Keep in mind that code such as
>
> 	if ($ann) { ... }
>
> is mostly not b/c someone wanted to use overloading, but rather
> someone was lazy and really meant to say
>
> 	if (defined($ann)) { ... }

Agreed.

> In the absence of eq overloading, these will behave identically. So
> if you leave the exceptions in it is sort-of policing lazy
> programmers, which I guess is fine in principle, but is guaranteed to
> trip up a lot of script code. I'd take it out if you're reasonably
> sure that at least within BioPerl itself those lazy programming
> incidents are removed.

I agree the overload exceptions shouldn't be left in.  The problem is  
I'm not certain we have caught most implicit overload calls (just the  
ones tested for).  Scott's checking everything against GMOD, though,  
so we can remove them after that.

>> [...]
>> The key change in this last round is the addition of several class
>> *dbxref* methods to Bio::Ontology::Term and
>> Bio::Annotation::OntologyTerm, all of which are capable of working
>> with either DBLink instances or simple scalars.
>
> I don't think you need the code here to deal with both scalars and
> objects. It is fine I think to define the new methods from the outset
> to consistently accept and return DBLink objects, period.
>
> The backwards compatibility logic should rather be in the *_dblink*()
> methods; i.e., instead of simple aliases they should have the code to
> map to and from the new API. That way, once the deprecation cycle
> ends, they can be removed, and with them all the legacy code that now
> is no longer needed, whereas if you have that in the new methods, it
> keeps bothering the maintainers.

That should be easy enough to fix and would be more consistent.  I  
can look over the various calls to dbxref methods and see what needs  
to be done, then fix that in cvs.

> You also mention an add_dbxref_context() on the wiki page - I'm not
> sure why that would be needed given that you build in the -context
> option to add_dbxref() from the outset. But maybe I've glossed over
> some detail.

The -context parameter was in get_dbxref(), to grab those DBLinks in  
a particular context.  We could do the same with add_dbxref() (pass  
DBLinks in first arg as array ref, context as second arg).  That  
would then obviate the need for add_dbxref_context().

I'll also change the parameter passing in get_dbxref() to just accept
context as a single optional argument since we're dealing with only
DBLink instances now.
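
Roughly what I have in mind for the calling convention, as a sketch only. My::Term, the '_default' context name and the get_dbxrefs() spelling are assumptions for illustration, and plain strings stand in for DBLink instances to keep it short.

```perl
use strict;
use warnings;

package My::Term;   # hypothetical stand-in, not the real Bio::Ontology::Term

sub new { return bless { dbxrefs => {} }, shift }

# Links come first as an array ref, the context is an optional second argument.
sub add_dbxref {
    my ($self, $links, $context) = @_;
    $context = '_default' unless defined $context;   # assumed default bucket
    push @{ $self->{dbxrefs}{$context} }, @$links;
    return;
}

# A single optional argument: with a context, return only those links;
# without one, return the links from every context.
sub get_dbxrefs {
    my ($self, $context) = @_;
    my @contexts = defined $context ? ($context) : sort keys %{ $self->{dbxrefs} };
    my @links;
    for my $name (@contexts) {
        push @links, @{ $self->{dbxrefs}{$name} || [] };
    }
    return @links;
}

package main;

my $term = My::Term->new;
$term->add_dbxref([ 'xref-a', 'xref-b' ]);         # placeholder links, no context
$term->add_dbxref([ 'xref-c' ], 'secondary');      # placeholder link with context

my @secondary = $term->get_dbxrefs('secondary');
my @all       = $term->get_dbxrefs;
print "secondary: @secondary\n";
print "all: @all\n";
```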

> Once this is merged back to the main trunk, I guess we need to give
> Bio::SeqFeature::TypedSeqFeatureI a thorough look and make sure it
> makes real sense.

It describes one method, ontology_term(), which returns a
Bio::Ontology::TermI.  This is similar to
SeqFeature::Annotated::type(), which returns a
Bio::Annotation::OntologyTerm (a Bio::Ontology::TermI).  My thought is
to simply deprecate type() in favor of
TypedSeqFeatureI::ontology_term().

> Thanks Chris for this effort, this clears a monumental roadblock.
>
> 	-hilmar

No problem.  It just needed to be done.

chris


From florent.angly at gmail.com  Wed Aug 29 23:44:58 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 29 Aug 2007 20:44:58 -0700
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
Message-ID: <46D63D3A.6050308@gmail.com>

Hilmar Lapp wrote:
> Keep in mind that code such as
>
> 	if ($ann) { ... }
>
> is mostly not b/c someone wanted to use overloading, but rather  
> someone was lazy and really meant to say
>
> 	if (defined($ann)) { ... }
>
> In the absence of eq overloading, these will behave identically. So  
> if you leave the exceptions in it is sort-of policing lazy  
> programmers, which I guess is fine in principle, but is guaranteed to  
> trip up a lot of script code. I'd take it out if you're reasonably  
> sure that at least within BioPerl itself those lazy programming  
> incidents are removed.
	if ($ann) { ... }

and 

	if (defined($ann)) { ... }

are not the same.

	if ($ann)

is evaluated false for an empty string like

        $ann = '';

and for a value of zero, i.e.

	$ann = 0;

while

	defined($ann)

returns true in these 2 cases.
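
A small script that tabulates the difference, as a sketch. The My::Ann class name is just a placeholder for "some object"; a plain blessed reference is always true and always defined.

```perl
use strict;
use warnings;

my @cases = (
    [ 'undef'        => undef ],
    [ 'empty string' => '' ],
    [ 'zero'         => 0 ],
    [ 'object'       => bless({}, 'My::Ann') ],   # placeholder class name
);

my %result;
for my $case (@cases) {
    my ($label, $value) = @$case;
    $result{$label} = {
        true    => ($value ? 1 : 0),           # what if ($ann) sees
        defined => (defined $value ? 1 : 0),   # what if (defined($ann)) sees
    };
    printf "%-12s if(\$ann)=%d defined(\$ann)=%d\n",
        $label, $result{$label}{true}, $result{$label}{defined};
}
```

The two tests only disagree on the empty string and zero rows.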

Florent


From cjfields at uiuc.edu  Wed Aug 29 23:54:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 22:54:05 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <46D63D3A.6050308@gmail.com>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<46D63D3A.6050308@gmail.com>
Message-ID: <90C3DE31-12FD-4BF3-B9F7-0FB5E1DE2A28@uiuc.edu>


On Aug 29, 2007, at 10:44 PM, Florent Angly wrote:

> Hilmar Lapp wrote:
>> Keep in mind that code such as
>>
>> 	if ($ann) { ... }
>>
>> is mostly not b/c someone wanted to use overloading, but rather   
>> someone was lazy and really meant to say
>>
>> 	if (defined($ann)) { ... }
>>
>> In the absence of eq overloading, these will behave identically. So
>> if you leave the exceptions in it is sort-of policing lazy
>> programmers, which I guess is fine in principle, but is guaranteed to
>> trip up a lot of script code. I'd take it out if you're reasonably
>> sure that at least within BioPerl itself those lazy programming
>> incidents are removed.
> 	if ($ann) { ... }
>
> and
> 	if (defined($ann)) { ... }
>
> are not the same.
>
> 	if ($ann)
>
> is evaluated false for an empty string like
>
>        $ann = '';
>
> and for a value of zero, i.e.
>
> 	$ann = 0;
>
> while
>
> 	defined($ann)
>
> returns true in these 2 cases.
>
> Florent

I agree, but we're talking about the context in which this test is  
performed, where $ann is either an instance of a Bio::AnnotationI or  
undef (not a scalar value or '').  In this case it works both as 'if  
($ann)' or 'if (defined($ann))', though the latter is preferred.   
Never underestimate laziness!

chris


From cain.cshl at gmail.com  Wed Aug 29 23:59:11 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Wed, 29 Aug 2007 23:59:11 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <46D63D3A.6050308@gmail.com>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<46D63D3A.6050308@gmail.com>
Message-ID: <1188446351.2567.55.camel@localhost.localdomain>

Hi Florent,

Of course what you wrote below is true, but what Hilmar was writing
about was lazy programmers (like me) who assume that the empty string
and 0 value cases aren't going to happen (because we happen to know they
never should in certain contexts), and so use 'if ($ann)'.  Of course,
at the moment, I am in the process of de-lazifying my code (though I
tended to think of it as being efficient :-)

Scott


On Wed, 2007-08-29 at 20:44 -0700, Florent Angly wrote:
> Hilmar Lapp wrote:
> > Keep in mind that code such as
> >
> > 	if ($ann) { ... }
> >
> > is mostly not b/c someone wanted to use overloading, but rather  
> > someone was lazy and really meant to say
> >
> > 	if (defined($ann)) { ... }
> >
> > In the absence of eq overloading, these will behave identically. So  
> > if you leave the exceptions in it is sort-of policing lazy  
> > programmers, which I guess is fine in principle, but is guaranteed to  
> > trip up a lot of script code. I'd take it out if you're reasonably  
> > sure that at least within BioPerl itself those lazy programming  
> > incidents are removed.
> 	if ($ann) { ... }
> 
> and 
> 
> 	if (defined($ann)) { ... }
> 
> are not the same.
> 
> 	if ($ann)
> 
> is evaluated false for an empty string like
> 
>         $ann = '';
> 
> and for a value of zero, i.e.
> 
> 	$ann = 0;
> 
> while
> 
> 	defined($ann)
> 
> returns true in these 2 cases.
> 
> Florent
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From cain.cshl at gmail.com  Thu Aug 30 00:05:06 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 00:05:06 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
Message-ID: <1188446706.2567.59.camel@localhost.localdomain>

Hi Chris,

Is there a reason that the value method of
Bio::Annotation::SimpleValue (and possibly some of its siblings) is
returning "Value: $value"?  It didn't use to have the "Value: " before,
did it?

Thanks,
Scott


On Wed, 2007-08-29 at 15:13 -0500, Chris Fields wrote:
> I'll probably go ahead and start merging this stuff over to CVS HEAD  
> then.  There haven't been any objections so far.
> 
> The page I posted outlines the more critical fixes, primarily the  
> changes to Bio::Ontology::Term methods (along with relevant code) due  
> to inconsistencies in the interface.  The Bio::Annotation classes  
> also now throw if you attempt to use them in an overloaded context.   
> I also split off SeqFeature::Annotated tests into its own test suite
> (SeqFeatAnnotated.t).
> 
> Let me know if there are any problems along the way!
> 
> chris
> 
> On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:
> 
> > Hi Chris,
> >
> > I just wanted to let you know that I was out of town for a few days, but
> > now I'm back and I'm doing testing of GMOD software based on the branch
> > you are working on.  I'll let you know how it goes, but don't let me
> > stop you if you are confident of your changes.  I'm sure whatever goes
> > wrong, it will just point out holes in the FeatureIO tests (I'm sure
> > there are plenty) and will require hopefully minimal changes on my end.
> >
> > Thanks for your considerable efforts on this!  (Regardless of how much
> > work it makes for me :-)
> > Scott
> >
> >
> > On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
> >> I'm now wrapping up the Feature/Annotation rollback.  I will probably
> >> start merging back to the main branch in the next day or two, as
> >> soon as interested parties (*cough*devs*cough*) look over the last
> >> batch of changes.
> >>
> >> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
> >>
> >> I have also added a small benchmark test which indicates a decrease
> >> in parsing time in SeqIO::genbank with all tests passing.  I expect
> >> this will translate over to any Bio::SeqFeature::Generic-using class
> >> (open mouth, prepare to insert foot....).
> >>
> >> It is also possible there are still some instances where overloading
> >> is expected lurking about in the ~1000 or so modules, so I'll leave
> >> the exceptions I added to all Bio::AnnotationI; we can remove them
> >> down the line, maybe prior to rel1.6, after more tests are added or
> >> if they get particularly annoying.  My guess is I caught 99.99% of
> >> them (prepare to insert other foot....).
> >>
> >> The key change in this last round is the addition of several class
> >> *dbxref* methods to Bio::Ontology::Term and
> >> Bio::Annotation::OntologyTerm, all of which are capable of working
> >> with either DBLink instances or simple scalars.  This was primarily
> >> done in order to clear up inconsistencies in the older *dblink*
> >> methods, which were ambiguous (some indicate simple scalar
> >> arguments, others DBLink objects); operator overloading was used
> >> extensively in these cases, which led to several issues.  I have
> >> added deprecation warnings to the older methods which now map to
> >> using the newer methods.  All tests pass with the exception of a few
> >> already failing on the MAIN branch; the single test which needs to be
> >> fixed is a round-tripping error in swiss.t (now a TODO), which can be
> >> fixed after merging back.
> >>
> >> Please respond to this if there are any questions or if I need to
> >> clarify the changes I made a bit more.
> >>
> >> chris
> > -- 
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                         cain at cshl.edu
> > GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> > Cold Spring Harbor Laboratory
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cjfields at uiuc.edu  Thu Aug 30 00:17:18 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 23:17:18 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188446706.2567.59.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
Message-ID: <CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>

It shouldn't; that sounds like the output of as_text().  value()
should just return the scalar value.

As a note, I added a new method, display_text(), for all  
Bio::AnnotationI classes which by default replicates the same output  
that stringification overloads produced.  So you should be able to  
explicitly call $ann->display_text for any Bio::AnnotationI where you  
once used an implicit call:

# old
print "$ann\n";

# new
print $ann->display_text,"\n";
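
To spell out how the three accessors relate, here is a toy class. My::SimpleValue is a hypothetical stand-in, not the real Bio::Annotation::SimpleValue, and it assumes the old stringification returned the bare value; the "Value: " prefix on as_text() mirrors the output reported in this thread.

```perl
use strict;
use warnings;

package My::SimpleValue;   # hypothetical stand-in for a Bio::AnnotationI class

sub new {
    my ($class, %args) = @_;
    return bless { value => $args{-value} }, $class;
}

# The raw scalar, with no decoration.
sub value { return $_[0]{value} }

# Labelled form, as in the "Value: $value" output reported above.
sub as_text { return 'Value: ' . $_[0]->value }

# Explicit replacement for what "$ann" used to produce implicitly
# (assumed here to be the bare value).
sub display_text { return $_[0]->value }

package main;

my $ann      = My::SimpleValue->new(-value => 'kinase');
my $raw      = $ann->value;
my $labelled = $ann->as_text;
my $display  = $ann->display_text;

print "value:        $raw\n";
print "as_text:      $labelled\n";
print "display_text: $display\n";
```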

chris

On Aug 29, 2007, at 11:05 PM, Scott Cain wrote:

> Hi Chris,
>
> Is there a reason that the value method of
> Bio::Annotation::SimpleValue (and possibly some of its siblings) is
> returning "Value: $value"?  It didn't use to have the "Value: " before,
> did it?
>
> Thanks,
> Scott
>
>
> On Wed, 2007-08-29 at 15:13 -0500, Chris Fields wrote:
>> I'll probably go ahead and start merging this stuff over to CVS HEAD
>> then.  There haven't been any objections so far.
>>
>> The page I posted outlines the more critical fixes, primarily the
>> changes to Bio::Ontology::Term methods (along with relevant code) due
>> to inconsistencies in the interface.  The Bio::Annotation classes
>> also now throw if you attempt to use them in an overloaded context.
>> I also split off SeqFeature::Annotated tests into its own test suite
>> (SeqFeatAnnotated.t).
>>
>> Let me know if there are any problems along the way!
>>
>> chris
>>
>> On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:
>>
>>> Hi Chris,
>>>
>>> I just wanted to let you know that I was out of town for a few days, but
>>> now I'm back and I'm doing testing of GMOD software based on the branch
>>> you are working on.  I'll let you know how it goes, but don't let me
>>> stop you if you are confident of your changes.  I'm sure whatever goes
>>> wrong, it will just point out holes in the FeatureIO tests (I'm sure
>>> there are plenty) and will require hopefully minimal changes on my end.
>>>
>>> Thanks for your considerable efforts on this!  (Regardless of how much
>>> work it makes for me :-)
>>> Scott
>>>
>>>
>>> On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
>>>> I'm now wrapping up the Feature/Annotation rollback.  I will probably
>>>> start merging back to the main branch in the next day or two, as
>>>> soon as interested parties (*cough*devs*cough*) look over the last
>>>> batch of changes.
>>>>
>>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>>>>
>>>> I have also added a small benchmark test which indicates a decrease
>>>> in parsing time in SeqIO::genbank with all tests passing.  I expect
>>>> this will translate over to any Bio::SeqFeature::Generic-using class
>>>> (open mouth, prepare to insert foot....).
>>>>
>>>> It is also possible there are still some instances where overloading
>>>> is expected lurking about in the ~1000 or so modules, so I'll leave
>>>> the exceptions I added to all Bio::AnnotationI; we can remove them
>>>> down the line, maybe prior to rel1.6, after more tests are added or
>>>> if they get particularly annoying.  My guess is I caught 99.99% of
>>>> them (prepare to insert other foot....).
>>>>
>>>> The key change in this last round is the addition of several class
>>>> *dbxref* methods to Bio::Ontology::Term and
>>>> Bio::Annotation::OntologyTerm, all of which are capable of working
>>>> with either DBLink instances or simple scalars.  This was primarily
>>>> done in order to clear up inconsistencies in the older *dblink*
>>>> methods, which were ambiguous (some indicate simple scalar
>>>> arguments, others DBLink objects); operator overloading was used
>>>> extensively in these cases, which led to several issues.  I have
>>>> added deprecation warnings to the older methods which now map to
>>>> using the newer methods.  All tests pass with the exception of a few
>>>> already failing on the MAIN branch; the single test which needs to be
>>>> fixed is a round-tripping error in swiss.t (now a TODO), which can be
>>>> fixed after merging back.
>>>>
>>>> Please respond to this if there are any questions or if I need to
>>>> clarify the changes I made a bit more.
>>>>
>>>> chris
>>> -- 
>>> ------------------------------------------------------------------------
>>> Scott Cain, Ph. D.                                         cain at cshl.edu
>>> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
>>> Cold Spring Harbor Laboratory
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From neetisomaiya at gmail.com  Thu Aug 30 00:47:53 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 30 Aug 2007 10:17:53 +0530
Subject: [Bioperl-l] kegg xml parsing
Message-ID: <764978cf0708292147q4ead37b0i782b83ecda8ce3da@mail.gmail.com>

Hi,

Has anyone used XML::Twig for parsing of kegg xml data?
I was looking for some small example code of the same.

Thanks.
-- 
-Neeti
Even my blood says, B positive


From sdavis2 at mail.nih.gov  Thu Aug 30 06:16:54 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 30 Aug 2007 06:16:54 -0400
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
In-Reply-To: <0FF63232-25DE-4676-8C06-B9B00BE28349@uiuc.edu>
References: <46D5EF2B.5000101@jays.net>
	<0FF63232-25DE-4676-8C06-B9B00BE28349@uiuc.edu>
Message-ID: <46D69916.4060202@mail.nih.gov>

Chris Fields wrote:
> On Aug 29, 2007, at 5:11 PM, Jay Hannah wrote:
> 
>> Please slap me if I'm hysterical.
>>
>> I'm seeking a broad bioinformatics search engine platform. I want to
>> take gobs of data in gobs of formats and allow people to search it on
>> the web.

Not sure how it might or might not meet your needs, but have you looked
at SRS (Sequence Retrieval System)?  I have never tried to use it,
personally, though.

Sean


From cjfields at uiuc.edu  Thu Aug 30 09:17:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 08:17:17 -0500
Subject: [Bioperl-l] remote SwissProt server problems
In-Reply-To: <6805F552-9947-4C28-B846-47B5501B31DF@uiuc.edu>
References: <6805F552-9947-4C28-B846-47B5501B31DF@uiuc.edu>
Message-ID: <62B4DE62-C11E-4E75-837C-6C1005FB12A4@uiuc.edu>

This should be fixed now (DBFetch-related tests pass, though MeSH  
tests are now failing!).

chris

On Aug 29, 2007, at 11:03 AM, Chris Fields wrote:

> Just as a notice, DBFetch is currently retrieving only single records
> for the UniProtKB database (where Bio::DB::SwissProt fetches
> sequences).  If anyone runs remote server tests and DB.t in the test
> suite you'll see a failure towards the end which indicates this.
> I've posted a notice to the server help desk and will respond when I
> hear more.
>
> chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cain.cshl at gmail.com  Thu Aug 30 10:39:59 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 10:39:59 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
Message-ID: <1188484799.2567.84.camel@localhost.localdomain>

Hi Chris,

I see--I was using as_text and getting the "Value: $value"; there are
places in my code where I have always used ->value and I thought that
the way it was working had changed.

What is the use case for having the as_text method work the way it does?

Thanks,
Scott


On Wed, 2007-08-29 at 23:17 -0500, Chris Fields wrote:
> It shouldn't; that sounds like the output of as_text().  value()
> should just return the scalar value.
> 
> As a note, I added a new method, display_text(), for all  
> Bio::AnnotationI classes which by default replicates the same output  
> that stringification overloads produced.  So you should be able to  
> explicitly call $ann->display_text for any Bio::AnnotationI where you  
> once used an implicit call:
> 
> # old
> print "$ann\n";
> 
> # new
> print $ann->display_text,"\n";
> 
> chris
> 
> On Aug 29, 2007, at 11:05 PM, Scott Cain wrote:
> 
> > Hi Chris,
> >
> > Is there a reason that the value method of
> > Bio::Annotation::SimpleValue (and possibly some of its siblings) is
> > returning "Value: $value"?  It didn't use to have the "Value: " before,
> > did it?
> >
> > Thanks,
> > Scott
> >
> >
> > On Wed, 2007-08-29 at 15:13 -0500, Chris Fields wrote:
> >> I'll probably go ahead and start merging this stuff over to CVS HEAD
> >> then.  There haven't been any objections so far.
> >>
> >> The page I posted outlines the more critical fixes, primarily the
> >> changes to Bio::Ontology::Term methods (along with relevant code) due
> >> to inconsistencies in the interface.  The Bio::Annotation classes
> >> also now throw if you attempt to use them in an overloaded context.
> >> I also split off SeqFeature::Annotated tests into its own test suite
> >> (SeqFeatAnnotated.t).
> >>
> >> Let me know if there are any problems along the way!
> >>
> >> chris
> >>
> >> On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:
> >>
> >>> Hi Chris,
> >>>
> >>> I just wanted to let you know that I was out of town for a few days,
> >>> but now I'm back and I'm doing testing of GMOD software based on the
> >>> branch you are working on.  I'll let you know how it goes, but don't
> >>> let me stop you if you are confident of your changes.  I'm sure
> >>> whatever goes wrong, it will just point out holes in the FeatureIO
> >>> tests (I'm sure there are plenty) and will hopefully require minimal
> >>> changes on my end.
> >>>
> >>> Thanks for your considerable efforts on this!  (Regardless of how much
> >>> work it makes for me :-)
> >>> Scott
> >>>
> >>>
> >>> On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
> >>>> I'm now wrapping up the Feature/Annotation rollback.  I will probably
> >>>> start merging back to the main branch in the next day or two, as
> >>>> soon as interested parties (*cough*devs*cough*) look over the last
> >>>> batch of changes.
> >>>>
> >>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
> >>>>
> >>>> I have also added a small benchmark test which indicates a decrease
> >>>> in parsing time in SeqIO::genbank with all tests passing.  I expect
> >>>> this will translate over to any Bio::SeqFeature::Generic-using class
> >>>> (open mouth, prepare to insert foot....).
> >>>>
> >>>> It is also possible there are still some instances where overloading
> >>>> is expected lurking about in the ~1000 or so modules, so I'll leave
> >>>> the exceptions I added to all Bio::AnnotationI; we can remove them
> >>>> down the line, maybe prior to rel1.6, after more tests are added or
> >>>> if they get particularly annoying.  My guess is I caught 99.99% of
> >>>> them (prepare to insert other foot....).
> >>>>
> >>>> The key change in this last round is the addition of several class
> >>>> *dbxref* methods to Bio::Ontology::Term and
> >>>> Bio::Annotation::OntologyTerm, all of which are capable of working
> >>>> with either DBLink instances or simple scalars.  This was primarily
> >>>> done in order to clear up inconsistencies in the older *dblink*
> >>>> methods, which were ambiguous (some indicated simple scalar
> >>>> arguments, others DBLink objects); operator overloading was used
> >>>> extensively in these cases, which led to several issues.  I have
> >>>> added deprecation warnings to the older methods, which now map to
> >>>> the newer methods.  All tests pass with the exception of a few
> >>>> already failing on the MAIN branch; the single test which needs to be
> >>>> fixed is a round-tripping error in swiss.t (now a TODO), which can be
> >>>> fixed after merging back.
> >>>>
> >>>> Please respond to this if there are any questions or if I need to
> >>>> clarify the changes I made a bit more.
> >>>>
> >>>> chris
> >>>> _______________________________________________
> >>>> Bioperl-l mailing list
> >>>> Bioperl-l at lists.open-bio.org
> >>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>> -- 
> >>> ------------------------------------------------------------------------
> >>> Scott Cain, Ph. D.                                   cain at cshl.edu
> >>> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> >>> Cold Spring Harbor Laboratory
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >> Christopher Fields
> >> Postdoctoral Researcher
> >> Lab of Dr. Robert Switzer
> >> Dept of Biochemistry
> >> University of Illinois Urbana-Champaign
> >>
> >>
> >>
> > -- 
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                   cain.cshl at gmail.com
> > GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> > Cold Spring Harbor Laboratory
> >
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cain.cshl at gmail.com  Thu Aug 30 11:46:24 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 11:46:24 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
Message-ID: <1188488785.2567.93.camel@localhost.localdomain>

Hi Chris,

Good news!  I only had to add a few defineds and a few display_texts and
I was able to successfully create a database and load the yeast GFF3
file.  While I want to do more testing with GFF from other sources,
clearly, I am 95% of the way there with relatively little work.
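
For anyone following along, here is a rough sketch of why defined
checks can become necessary once stringification throws.  The thread
does not spell out the exact cause, so treat this as one plausible
scenario: My::Strict is a made-up class, not a BioPerl module, and the
error text is invented.

```perl
use strict;
use warnings;

# Hypothetical class whose string overload throws, loosely imitating the
# "throw in an overloaded context" behaviour described in this thread.
package My::Strict;

use overload
    '""'     => sub { die "stringification is no longer supported\n" },
    fallback => 1;

sub new   { my ($class, $v) = @_; return bless { value => $v }, $class }
sub value { return $_[0]->{value} }

package main;

my $ann = My::Strict->new('chrI');

# With no bool overload, a truth test is derived from the string
# overload, so a bare "if ($ann)" would die here.
my $ok = eval { $ann ? 1 : 0 };
print "truth test threw: $@" unless defined $ok;

# defined() never triggers overloading, and neither does an accessor.
print "defined\n" if defined $ann;
print $ann->value, "\n";
```

So code that used to test an annotation object for truth can switch to
defined(), and code that interpolated it can call a method instead.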

Nice job and Thanks!
Scott


-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From hlapp at gmx.net  Thu Aug 30 12:07:18 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:07:18 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188488785.2567.93.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
Message-ID: <0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Aug 30, 2007, at 11:46 AM, Scott Cain wrote:

> Good news!  I only had to add a few defineds and a few  
> display_texts and
> I was able to successfully create a database and load the yeast GFF3

Scott - I'm a little worried - what are you using the display_text()  
calls for? There is no method to set a property that would be  
returned here, so you only have control over that if you override the  
method in a custom AnnotationI class.

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFG1us5uV6N2JxL7qsRAicFAKCFCHPORyK9273X8u2/gbaZCNpEHgCeMovA
OtZghop1tET5iMqnwXzL+lk=
=NVrK
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Thu Aug 30 12:10:14 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:10:14 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188484799.2567.84.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188484799.2567.84.camel@localhost.localdomain>
Message-ID: <49824C75-3FA5-4E59-8F99-BC0E974E9652@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Aug 30, 2007, at 10:39 AM, Scott Cain wrote:

> What is the use case for having the as_text method work the way it  
> does?

That's a bit nebulous as I tried to point out the other day. It's  
just a textual representation of the annotation, but you don't really  
have control over what the particular Annotation class considers to  
fulfill that purpose.

So, it's fine to expect a printable meaningful string to be returned,  
but don't try to parse it or rely on exactly what it is going to look  
like.
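
A minimal sketch of that distinction, assuming a made-up class
(My::SimpleValue is a stand-in written for this mail, not the BioPerl
source; the "Value: " prefix is simply the one Scott reported seeing):

```perl
use strict;
use warnings;

# Hypothetical stand-in for a simple annotation class.
package My::SimpleValue;

sub new {
    my ($class, %args) = @_;
    return bless { value => $args{-value} }, $class;
}

# Accessor: returns the raw scalar, safe to store or compare.
sub value { return $_[0]->{value} }

# Human-readable form: fine to print, but its format is not a contract.
sub as_text { return 'Value: ' . $_[0]->value }

package main;

my $ann = My::SimpleValue->new(-value => 'YAL001C');

print $ann->as_text, "\n";    # for display only
my $id = $ann->value;         # for program logic
print "id=$id\n";
```

The point is only that program logic goes through the accessor, while
the textual form is reserved for output meant for people.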

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFG1uvnuV6N2JxL7qsRAn+dAKC9iLj93El38uv7kjprdZDo0sXC6wCgqwhm
0/tF89/FO1a4CWAf1bahd+8=
=I7SM
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Thu Aug 30 12:20:18 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:20:18 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>
Message-ID: <DF84C537-2860-48E1-9979-E1101C4D5826@gmx.net>


On Aug 29, 2007, at 11:23 PM, Chris Fields wrote:

>> Once this is merged back to the main trunk, I guess we need to give
>> Bio::SeqFeature::TypedSeqFeatureI a thorough look and make sure it
>> makes real sense.
>
> It describes one method, ontology_term(), which returns a  
> Bio::Ontology::TermI.  This is similar to  
> SeqFeature::Annotated::type(), which returns a  
> Bio::Annotation::OntologyTerm (a Bio::Ontology::TermI).  My thought  
> is to simply deprecate type() in favor of  
> TypedSeqFeatureI::ontology_term().

I think we'll want to think about that. type() gives me some  
indication of what the returned value might represent, whereas  
ontology_term() only tells me about the type of the returned object.

You could make ontology_term() accept a context argument, such as

	my $feature_type = $typedFeat->ontology_term(-context => -type);

Or you could name the method(s) more explicitly, such as

	my $feature_type = $typedFeat->type_term();
	my $feature_source = $typedFeat->source_term();
	my @annTerms = $typedFeat->get_Annotations('Gene Ontology');

Am I making sense?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cain.cshl at gmail.com  Thu Aug 30 12:28:47 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 12:28:47 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
Message-ID: <1188491327.2567.101.camel@localhost.localdomain>

Hi Hilmar,

I'm using it as Chris suggested: where I had been depending on ""
overloading.  I think in most places, I am using it on
Bio::Annotation::SimpleValue to get the string that is the simple value.
On more complex data types, I am using other methods built into those
classes to extract useful stuff for inserting into the database.

Scott


On Thu, 2007-08-30 at 12:07 -0400, Hilmar Lapp wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> 
> On Aug 30, 2007, at 11:46 AM, Scott Cain wrote:
> 
> > Good news!  I only had to add a few defineds and a few  
> > display_texts and
> > I was able to successfully create a database and load the yeast GFF3
> 
> Scott - I'm a little worried - what are you using the display_text()  
> calls for? There is no method to set a property that would be  
> returned here, so you only have control over that if you override the  
> method in a custom AnnotationI class.
> 
> 	-hilmar
> - --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
> 
> 
> 
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.3 (Darwin)
> 
> iD8DBQFG1us5uV6N2JxL7qsRAicFAKCFCHPORyK9273X8u2/gbaZCNpEHgCeMovA
> OtZghop1tET5iMqnwXzL+lk=
> =NVrK
> -----END PGP SIGNATURE-----
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From hlapp at gmx.net  Thu Aug 30 12:52:14 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:52:14 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188491327.2567.101.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
	<1188491327.2567.101.camel@localhost.localdomain>
Message-ID: <F03155D4-58CB-4C8D-9D52-C49036EB7F45@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Aug 30, 2007, at 12:28 PM, Scott Cain wrote:

> I think in most places, I am using it on
> Bio::Annotation::SimpleValue to get the string that is the simple  
> value.

You should be using $ann->value() for that, unless I'm missing  
something.

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFG1vXCuV6N2JxL7qsRAkcJAKCICRtOSlPLVYYKCbOTvDIf4idb3wCgkxYM
seeaNvSsFY/4bHLGZ9dum2Q=
=E35w
-----END PGP SIGNATURE-----


From cain.cshl at gmail.com  Thu Aug 30 13:16:09 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 13:16:09 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <F03155D4-58CB-4C8D-9D52-C49036EB7F45@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
	<1188491327.2567.101.camel@localhost.localdomain>
	<F03155D4-58CB-4C8D-9D52-C49036EB7F45@gmx.net>
Message-ID: <1188494169.2567.109.camel@localhost.localdomain>

Well, in the instances where I was using it, ->value seems to work
exactly the same, so I changed it to value to be more consistent with
other code I'd written.  I'd used display_text without really thinking
about it.

Thanks,
Scott


On Thu, 2007-08-30 at 12:52 -0400, Hilmar Lapp wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> 
> On Aug 30, 2007, at 12:28 PM, Scott Cain wrote:
> 
> > I think in most places, I am using it on
> > Bio::Annotation::SimpleValue to get the string that is the simple  
> > value.
> 
> You should be using $ann->value() for that, unless I'm missing  
> something.
> 
> 	-hilmar
> - --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
> 
> 
> 
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.3 (Darwin)
> 
> iD8DBQFG1vXCuV6N2JxL7qsRAkcJAKCICRtOSlPLVYYKCbOTvDIf4idb3wCgkxYM
> seeaNvSsFY/4bHLGZ9dum2Q=
> =E35w
> -----END PGP SIGNATURE-----
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cjfields at uiuc.edu  Thu Aug 30 13:27:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 12:27:46 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188491327.2567.101.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
	<1188491327.2567.101.camel@localhost.localdomain>
Message-ID: <6E9B07D0-AB37-4439-AA9D-9268AB5A38C0@uiuc.edu>

display_text() is really a hack for explicitly getting the same
output one would have expected from the stringification overload for
any Bio::AnnotationI (you can also use callbacks on it to customize the
output if needed, but that's not important here).  Whether it fits
depends on what you're trying to accomplish, but it is probably best to
use value() specifically in places where you expect only a
Bio::Annotation::SimpleValue.
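
As a rough illustration of the callback idea, here is a sketch using a
made-up class.  My::Annotation, its constructor arguments, and the
default formatter are assumptions for this example and are not copied
from the branch:

```perl
use strict;
use warnings;

# Hypothetical annotation class with an explicit display_text() method.
package My::Annotation;

# Default formatter, used when the caller supplies no callback.
my $DEFAULT_CB = sub { return $_[0]->value };

sub new {
    my ($class, %args) = @_;
    return bless { value => $args{-value}, tagname => $args{-tagname} },
        $class;
}

sub value   { return $_[0]->{value} }
sub tagname { return $_[0]->{tagname} }

# Explicit replacement for "$ann": an optional coderef customizes the text.
sub display_text {
    my ($self, $cb) = @_;
    $cb ||= $DEFAULT_CB;
    return $cb->($self);
}

package main;

my $ann = My::Annotation->new(-tagname => 'note',
                              -value   => 'putative kinase');

print $ann->display_text, "\n";
print $ann->display_text(sub { $_[0]->tagname . '=' . $_[0]->value }), "\n";
```

The first call uses the default text; the second shows a caller-supplied
format without touching the class itself.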

chris

On Aug 30, 2007, at 11:28 AM, Scott Cain wrote:

> Hi Hilmar,
>
> I'm using it as Chris suggested: where I had been depending on ""
> overloading.  I think in most places, I am using it on
> Bio::Annotation::SimpleValue to get the string that is the simple  
> value.
> On more complex data types, I am using other methods built into those
> classes to extract useful stuff for inserting into the database.
>
> Scott
>
>
>
> On Thu, 2007-08-30 at 12:07 -0400, Hilmar Lapp wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>>
>> On Aug 30, 2007, at 11:46 AM, Scott Cain wrote:
>>
>>> Good news!  I only had to add a few defineds and a few
>>> display_texts and
>>> I was able to successfully create a database and load the yeast GFF3
>>
>> Scott - I'm a little worried - what are you using the display_text()
>> calls for? There is no method to set a property that would be
>> returned here, so you only have control over that if you override the
>> method in a custom AnnotationI class.
>>
>> 	-hilmar
>> - --
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.3 (Darwin)
>>
>> iD8DBQFG1us5uV6N2JxL7qsRAicFAKCFCHPORyK9273X8u2/gbaZCNpEHgCeMovA
>> OtZghop1tET5iMqnwXzL+lk=
>> =NVrK
>> -----END PGP SIGNATURE-----
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug 30 13:45:44 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 12:45:44 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188488785.2567.93.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
Message-ID: <B81A709F-5081-4EB0-8778-2ABEDB02BA86@uiuc.edu>

Sounds good, but I have yet to commit some of the Ontology changes
Hilmar and I discussed (whereupon our brave heroes deprecate dblinks
methods in favor of dbxrefs).  These should be committed fairly soon
(within an hour or two).

My guess is the change will be fairly transparent so shouldn't affect  
anything unless you have scripts using those methods directly.
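
To show what "transparent" could look like, here is a sketch of the
usual deprecate-and-delegate pattern.  My::Term is a toy class written
for this mail; the method names follow the dblink/dbxref naming
discussed above, but the bodies and warning text are assumptions, not
the committed code:

```perl
use strict;
use warnings;

# Hypothetical term class illustrating deprecated methods that delegate.
package My::Term;

sub new {
    my ($class) = @_;
    return bless { dbxrefs => [] }, $class;
}

# Newer-style methods.
sub add_dbxref {
    my ($self, @refs) = @_;
    push @{ $self->{dbxrefs} }, @refs;
    return;
}

sub get_dbxrefs { return @{ $_[0]->{dbxrefs} } }

# Older-style names: warn, then hand off to the newer methods.
sub add_dblink {
    my ($self, @refs) = @_;
    warn "add_dblink() is deprecated; use add_dbxref()\n";
    return $self->add_dbxref(@refs);
}

sub get_dblinks {
    my ($self) = @_;
    warn "get_dblinks() is deprecated; use get_dbxrefs()\n";
    return $self->get_dbxrefs;
}

package main;

my $term = My::Term->new;
$term->add_dblink('GO:0008150');           # warns, but still works
$term->add_dbxref('InterPro:IPR000001');
print join(',', $term->get_dbxrefs), "\n";
```

Callers of the old names keep working and only see a warning, which is
why scripts that never call those methods directly should not notice.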

chris

On Aug 30, 2007, at 10:46 AM, Scott Cain wrote:

> Hi Chris,
>
> Good news!  I only had to add a few defineds and a few  
> display_texts and
> I was able to successfully create a database and load the yeast GFF3
> file.  While I want to do more testing with GFF from other sources,
> clearly, I am 95% of the way there with relatively little work.
>
> Nice job and Thanks!
> Scott
>
>
> -- 
> ---------------------------------------------------------------------- 
> --
> Scott Cain, Ph. D.                                    
> cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/)                      
> 216-392-3087
> Cold Spring Harbor Laboratory
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug 30 14:03:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 13:03:29 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <DF84C537-2860-48E1-9979-E1101C4D5826@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>
	<DF84C537-2860-48E1-9979-E1101C4D5826@gmx.net>
Message-ID: <D4E8E9D3-BB64-48C5-8273-5C6C04DC8DE9@uiuc.edu>


On Aug 30, 2007, at 11:20 AM, Hilmar Lapp wrote:

>> ...It describes one method, ontology_term(), which returns a  
>> Bio::Ontology::TermI.  This is similar to  
>> SeqFeature::Annotated::type(), which returns a  
>> Bio::Annotation::OntologyTerm (a Bio::Ontology::TermI).  My  
>> thought is to simply deprecate type() in favor of  
>> TypedSeqFeatureI::ontology_term().
>
> I think we'll want to think about that. type() gives me some  
> indication of what the returned value might represent, whereas  
> ontology_term() only tells me about the type of the returned object.
>
> You could make ontology_term() accept a context argument, such as
>
> 	my $feature_type = $typedFeat->ontology_term(-context => -type);
>
> Or you could name the method(s) more explicitly, such as
>
> 	my $feature_type = $typedFeat->type_term();
> 	my $feature_source = $typedFeat->source_term();
> 	my @annTerms = $typedFeat->get_Annotations('Gene Ontology');
>
> Am I making sense?
>
> 	-hilmar

I think so; I'll have to look at what is returned from type() in some  
more detail.

It appears that the two main culprits for passing strings off to  
Ontology::Term are the Bio::OntologyIO::obo and  
Bio::OntologyIO::dagflat parsers.  I can add some code in there to  
change those to DBLinks prior to creating Ontology::Term instances,  
which should clean that up.

chris


From cjfields at uiuc.edu  Thu Aug 30 20:57:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 19:57:15 -0500
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
	module/package name
In-Reply-To: <46CF27F4.8030608@arcor.de>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>	<46CEAD83.2050904@arcor.de>	<9824900.1187973171940.JavaMail.ngmail@webmail17>	<A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
	<BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>
	<46CF27F4.8030608@arcor.de>
Message-ID: <4ED2E2B0-8E36-4500-A4C9-B8C333E14614@uiuc.edu>


On Aug 24, 2007, at 1:48 PM, marian wrote:

> ...
> Bio::Microarray::Tools::MitoChip would be OK with me. I merely meant  
> that it
> isn't an expression chip, and you also won't/can't analyze expression  
> data with
> the tool I am talking about.
>
> Marian

Okay, I have everything working from bugzilla:

http://bugzilla.open-bio.org/show_bug.cgi?id=2332

I suppose what we need to do next is get a test script going.  I'll  
look at the script attached to see if we can get something going that  
is fairly quick.

chris


From avilella at gmail.com  Fri Aug 31 05:29:43 2007
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 31 Aug 2007 10:29:43 +0100
Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with
	exon boundaries
Message-ID: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com>

Hi,

Probably a bit of a long shot, but does anyone have code for
displaying protein or CDS multiple sequence alignments with the exon
boundaries of each gene in the alignment?

Something in the bioperl world without funky external dependencies. I
think it would be an awesome addition to the howtos.

Currently, the Bio::Graphics howto has cdna to genome mapping scripts or
blast output scripts, but I couldn't find code for dealing with multiple
sequence alignments.

Cheers,

    Albert.


From neetisomaiya at gmail.com  Fri Aug 31 05:41:51 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 31 Aug 2007 15:11:51 +0530
Subject: [Bioperl-l] need help
Message-ID: <764978cf0708310241i1baf6feeoc808c396125c078e@mail.gmail.com>

Hi,

I am trying to parse the compound
(ftp://ftp.genome.jp/pub/kegg/ligand/compound/compound) and glycan
(ftp://ftp.genome.jp/pub/kegg/ligand/glycan/glycan) files of KEGG using
bioperl. I just want the KEGG id of each compound/glycan and its names and
synonyms, if any. Bio::SeqIO is giving me problems: I am not able to fetch
the id and name. Can someone help me with this?

Thanks.

-- 
-Neeti
Even my blood says, B positive


From cjfields at uiuc.edu  Fri Aug 31 10:51:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 31 Aug 2007 09:51:51 -0500
Subject: [Bioperl-l] need help
In-Reply-To: <764978cf0708310241i1baf6feeoc808c396125c078e@mail.gmail.com>
References: <764978cf0708310241i1baf6feeoc808c396125c078e@mail.gmail.com>
Message-ID: <BD54A833-D2D3-4AE5-8517-BB060F3C132E@uiuc.edu>

I don't believe Bio::SeqIO::kegg will parse those files (they aren't  
sequence files).  The format it recognizes is:

http://www.bioperl.org/wiki/KEGG_sequence_format

for the files found in the subdirectories here:

ftp://ftp.genome.ad.jp/pub/kegg/genes/organisms

I would just build a custom parser if all you're interested in is id/ 
names/synonyms.  It'll be much faster.
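
A minimal sketch of such a custom parser, in plain Perl with no BioPerl dependency. The flat-file layout is an assumption that should be checked against the actual files: ENTRY carries the id as its first token, NAME values end in ';' and continue on indented lines, and '///' closes each record. The sub name parse_kegg_names and the inline sample are made up for illustration.

```perl
use strict;
use warnings;

# Read KEGG-style flat-file records from a filehandle and return a list of
# hashes, each with an 'id' and a list of 'names'.
# Assumed layout: field label in column 1, continuation lines indented,
# '///' terminating each record.
sub parse_kegg_names {
    my ($fh) = @_;
    my (@records, $rec);
    my $field = '';
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;                     # drop newline and trailing blanks
        if ($line =~ m{\A///}) {                # record terminator
            push @records, $rec if $rec;
            undef $rec;
            $field = '';
            next;
        }
        my $value;
        if ($line =~ /\A(\S+)\s*(.*)\z/) {      # new field: label starts in column 1
            ($field, $value) = ($1, $2);
        }
        elsif ($line =~ /\A\s+(.*)\z/) {        # continuation of the previous field
            $value = $1;
        }
        else {
            next;                               # blank line
        }
        if ($field eq 'ENTRY') {
            my ($id) = split ' ', $value;       # first token is the id
            $rec = { id => $id, names => [] };
        }
        elsif ($field eq 'NAME' && $rec) {
            $value =~ s/;\z//;                  # strip the ';' name separator
            push @{ $rec->{names} }, $value if length $value;
        }
    }
    push @records, $rec if $rec;                # tolerate a missing final '///'
    return @records;
}

# Abbreviated sample in the assumed layout, read through an in-memory handle.
my $sample = <<'END';
ENTRY       C00001                      Compound
NAME        H2O;
            Water
FORMULA     H2O
///
ENTRY       C00002                      Compound
NAME        ATP;
            Adenosine 5'-triphosphate
///
END

open my $fh, '<', \$sample or die "Cannot open sample: $!";
for my $rec (parse_kegg_names($fh)) {
    print join("\t", $rec->{id}, @{ $rec->{names} }), "\n";
}
close $fh;
```

For the real files, open the downloaded compound or glycan file instead of the in-memory sample. Names that wrap across several lines are not rejoined by this sketch.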

chris

On Aug 31, 2007, at 4:41 AM, neeti somaiya wrote:

> Hi,
>
> I am trying to parse the compound (
> ftp://ftp.genome.jp/pub/kegg/ligand/compound/compound) and glycan (
> ftp://ftp.genome.jp/pub/kegg/ligand/glycan/glycan) files of KEGG using
> bioperl.
> I just want the kegg id of the compound/glycan and its names and  
> synonyms if
> any.
> Bio::SeqIO is giving some problem, I am not able to fetch the id  
> and name.
> Can someone help me with this.
>
> Thanks.
>
> -- 
> -Neeti
> Even my blood says, B positive
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Wed Aug  1 02:15:45 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 31 Jul 2007 21:15:45 -0500
Subject: [Bioperl-l] Perl 3D OpenGL
In-Reply-To: <25A5F0A3-1CC3-46B5-8976-A24C451204E7@jays.net>
References: <152401c7d224$8e2455b0$6e4e7c0a@HPONE>
	<25A5F0A3-1CC3-46B5-8976-A24C451204E7@jays.net>
Message-ID: <04BCAD9E-CC25-4F0A-85B1-FBA91C64CE7D@uiuc.edu>


On Jul 31, 2007, at 7:00 AM, Jay Hannah wrote:

> On Jul 29, 2007, at 4:08 PM, Grafman Productions wrote:
>> If this posting is inappropriate, please let me know - my apologies.
>
> Not at all. AFAIK this is the perfect place to discuss any
> contributions you're motivated to make to the BioPerl project.
>
>> I recently came across an article on BioPerl, and it occurred to me
>> that
>> there might be some need for 3D rendering within your BioPerl  
>> project.
>>
>> I released a number of new/updated Perl OpenGL (POGL) modules this
>> year,
>> along with benchmarks that demonstrate that it performs comparably
>> to C.
>>
>> If there's a need for 3D features within BioPerl, and if I can be
>> of any
>> assistance in helping to add such features, I would enjoy the
>> opportunity.
>
> I know nothing about 3D modeling in biology, nor do I hang out with
> any protein structure folks, but 3D always sounds sexy. -grin-
>
> If you're new to bioinformatics (I certainly am) you might want to
> read this:
>
>    http://en.wikipedia.org/wiki/Protein_structure
>
> Because that's probably where your 3D work would be used. Especially
> note the "Software" section, where you'll find some of the
> "competition".  :)
>
> There's some cool stuff out there. I don't know what all would or
> wouldn't be time well spent in Perl / BioPerl.
>
> HTH,
>
> Jay Hannah
> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah

I agree that protein structure is the best place for something like  
this.

It's a wide open area as far as I'm concerned; in fact I would say  
that Bio::Structure is getting pretty dated, so if anyone wants to  
take it over, refactor the code, and so on I don't have a problem.

chris


From shameer at ncbs.res.in  Wed Aug  1 05:45:45 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Wed, 1 Aug 2007 11:15:45 +0530 (IST)
Subject: [Bioperl-l] Perl 3D OpenGL
In-Reply-To: <04BCAD9E-CC25-4F0A-85B1-FBA91C64CE7D@uiuc.edu>
References: <152401c7d224$8e2455b0$6e4e7c0a@HPONE>
	<25A5F0A3-1CC3-46B5-8976-A24C451204E7@jays.net>
	<04BCAD9E-CC25-4F0A-85B1-FBA91C64CE7D@uiuc.edu>
Message-ID: <49637.192.168.1.1.1185947145.squirrel@mail.ncbs.res.in>

Hi,
Open-GL/3D contributions are always welcome !!!
What about a Perl-OpenGL/3D implementation of a web-based 3D viewer like Jmol?

 http://jmol.sourceforge.net/

(So we don't need to worry about Java installation and such :) develop it
and deploy it in Perl - eternal happiness !!!)
-- 
SK
>
> On Jul 31, 2007, at 7:00 AM, Jay Hannah wrote:
>
>> On Jul 29, 2007, at 4:08 PM, Grafman Productions wrote:
>>> If this posting is inappropriate, please let me know - my apologies.
>>
>> Not at all. AFAIK this is the perfect place to discuss any
>> contributions you're motivated to make to the BioPerl project.
>>
>>> I recently came across an article on BioPerl, and it occurred to me
>>> that
>>> there might be some need for 3D rendering within your BioPerl
>>> project.
>>>
>>> I released a number of new/updated Perl OpenGL (POGL) modules this
>>> year,
>>> along with benchmarks that demonstrate that it performs comparably
>>> to C.
>>>
>>> If there's a need for 3D features within BioPerl, and if I can be
>>> of any
>>> assistance in helping to add such features, I would enjoy the
>>> opportunity.
>>
>> I know nothing about 3D modeling in biology, nor do I hang out with
>> any protein structure folks, but 3D always sounds sexy. -grin-
>>
>> If you're new to bioinformatics (I certainly am) you might want to
>> read this:
>>
>>    http://en.wikipedia.org/wiki/Protein_structure
>>
>> Because that's probably where your 3D work would be used. Especially
>> note the "Software" section, where you'll find some of the
>> "competition".  :)
>>
>> There's some cool stuff out there. I don't know what all would or
>> wouldn't be time well spent in Perl / BioPerl.
>>
>> HTH,
>>
>> Jay Hannah
>> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah
>
> I agree that protein structure is the best place for something like
> this.
>
> It's a wide open area as far as I'm concerned; in fact I would say
> that Bio::Structure is getting pretty dated, so if anyone wants to
> take it over, refactor the code, and so on I don't have a problem.
>
> chris
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From Alicia.Amadoz at uv.es  Wed Aug  1 07:13:11 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Wed, 1 Aug 2007 09:13:11 +0200 (CEST)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
Message-ID: <1664224328amadoz@uv.es>

Hi, I would like to save the hit sequences from a blast result in a fasta
file. I have tried a few things, but I am having problems using
Bio::SearchIO and Bio::SeqIO. I hope someone can help me with this. Here
is my current code:

# my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
"fasta");
my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
         my $hseq = $hsp->hit_string();
         # $seq_out->write_seq($hseq);
         $seq_out->write_result($hseq);
      }
   }
}

Here the error is,

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: ResultWriter not defined.

I couldn't find any kind of documentation about ResultWriter.
Thanks in advance,
Alicia


From xianranli78 at yahoo.com.cn  Wed Aug  1 08:11:53 2007
From: xianranli78 at yahoo.com.cn (Xianran Li)
Date: Wed, 1 Aug 2007 16:11:53 +0800
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
References: <1664224328amadoz@uv.es>
Message-ID: <001101c7d413$a0d79aa0$ed07a8c0@BGI.LOCAL>

$hsp->hit_string() returns the hit sequence as a plain string rather than as a Bio::Seq object. So you should construct an object first; then you can use $seq_out->write_seq($hseq_obj) to write the sequence to a fasta file.

# my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>"fasta");
  my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
         my $hseq = $hsp->hit_string(); 
            $hseq =~ s/-//g; #### remove the gap within the aligment
         my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 
         # $seq_out->write_seq($hseq);
         $seq_out->write_result($hseq_obj);
      }
   }
}

Xianran
----- Original Message ----- 
From: "Alicia Amadoz" <Alicia.Amadoz at uv.es>
To: <bioperl-l at lists.open-bio.org>
Sent: Wednesday, August 01, 2007 3:13 PM
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file


> Hi, I would like to save my hit sequences from a blast result in a fasta
> file. I am trying some things but I have problems using Bio::SearchIO
> and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> code:
> 
> # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> "fasta");
> my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> => "fasta");
> while(my $result = $blast_report->next_result()) {
>    while(my $hit = $result->next_hit()) {
>       while(my $hsp = $hit->next_hsp()) {
>          my $hseq = $hsp->hit_string();
>          # $seq_out->write_seq($hseq);
>          $seq_out->write_result($hseq);
>       }
>    }
> }
> 
> Here the error is,
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: ResultWriter not defined.
> 
> I couldn't find any kind of documentation about ResultWriter.
> Thanks in advance,
> Alicia
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From Alicia.Amadoz at uv.es  Wed Aug  1 10:25:29 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Wed, 1 Aug 2007 12:25:29 +0200 (CEST)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
Message-ID: <5927683277amadoz@uv.es>

Hi, I have tried what you suggested and I still get some errors.
With this code,

my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
	my $hseq = $hsp->hit_string(); 
        $hseq =~ s/-//g; #### remove the gap within the aligment
        my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 
        $seq_out->write_seq($hseq_obj);
      }
   }				
}

I have the following error:

Can't locate object method "write_seq" via package "Bio::SearchIO::fasta"

And using the write_result method with this code,

my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
=> "fasta");
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
	my $hseq = $hsp->hit_string(); 
        $hseq =~ s/-//g; #### remove the gap within the aligment
        my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 
        $seq_out->write_result($hseq_obj);
      }
   }				
}

I have again this kind of error:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: ResultWriter not defined.
STACK: Error::throw

So, what else can I try?? Thanks in advance,
Alicia


From neetisomaiya at gmail.com  Wed Aug  1 11:28:40 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Wed, 1 Aug 2007 16:58:40 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
Message-ID: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>

I have downloaded the omim.txt file from NCBI ftp site and I am running my
attached parser on this file, the parser run stops in between with this :-

------------- EXCEPTION  -------------
MSG: a part/organism must be assigned
STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
/usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
STACK toplevel parse_omim_original.pl:47

--------------------------------------

What is the reason for this?
Can anyone guide me please.

-- 
-Neeti
Even my blood says, B positive


From jay at jays.net  Wed Aug  1 13:30:50 2007
From: jay at jays.net (Jay Hannah)
Date: Wed, 1 Aug 2007 09:30:50 -0400 (EDT)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <5927683277amadoz@uv.es>
References: <5927683277amadoz@uv.es>
Message-ID: <Pine.LNX.4.64.0708010926370.3555@ferret.jays.net>

On Wed, 1 Aug 2007, Alicia Amadoz wrote:
> Hi, I have tried what you suggested and I get also some errors.
> With this code,
>
> my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> => "fasta");
> while(my $result = $blast_report->next_result()) {
>   while(my $hit = $result->next_hit()) {
>      while(my $hsp = $hit->next_hsp()) {
> 	my $hseq = $hsp->hit_string();
>        $hseq =~ s/-//g; #### remove the gap within the aligment
>        my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
>        $seq_out->write_seq($hseq_obj);
>      }
>   }
> }
>
> I have the following error:
>
> Can't locate object method "write_seq" via package "Bio::SearchIO::fasta"

You don't want to write_seq() to a SearchIO, you want to write_seq() to a 
SeqIO. Try this:

my $seq_out = Bio::SeqIO->new(-file => ">$fasfilename", -format => "fasta");
while(my $result = $blast_report->next_result()) {
    while(my $hit = $result->next_hit()) {
       while(my $hsp = $hit->next_hsp()) {
 	my $hseq = $hsp->hit_string();
         $hseq =~ s/-//g; #### remove the gap within the aligment
         my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
         $seq_out->write_seq($hseq_obj);
       }
    }
}

(Untested.)

HTH,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From cjfields at uiuc.edu  Wed Aug  1 15:02:07 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 1 Aug 2007 10:02:07 -0500
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
Message-ID: <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>

Neeti,

Only post to one list email address, namely the one I'm responding to  
and the one shown here:

http://bioperl.org/mailman/listinfo/bioperl-l

The others are aliases so you essentially posted three times.  As for  
your question: there was no attached script or any additional  
information (bioperl version would have also been nice), so we can't  
help you until we have something more to work with.

chris

On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:

> I have downloaded the omim.txt file from NCBI ftp site and I am  
> running my
> attached parser on this file, the parser run stops in between with  
> this :-
>
> ------------- EXCEPTION  -------------
> MSG: a part/organism must be assigned
> STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> STACK toplevel parse_omim_original.pl:47
>
> --------------------------------------
>
> What is the reason for this?
> Can anyone guide me please.
>
> -- 
> -Neeti
> Even my blood says, B positive
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From torsten.seemann at infotech.monash.edu.au  Thu Aug  2 00:50:06 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Thu, 2 Aug 2007 10:50:06 +1000
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <1664224328amadoz@uv.es>
References: <1664224328amadoz@uv.es>
Message-ID: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>

Alicia,

> Hi, I would like to save my hit sequences from a blast result in a fasta
> file. I am trying some things but I have problems using Bio::SearchIO
> and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> code:
> # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> "fasta");
> my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> => "fasta");
> ...
>        my $hseq = $hsp->hit_string();
>          # $seq_out->write_seq($hseq);
>          $seq_out->write_result($hseq);

You have encountered two common problems for BioPerl beginners:

1. "fasta" means two different things! In SearchIO it refers to the
output format of the "fasta" sequence alignment software. In SeqIO it
refers to a file format that stores just sequences. Confusing, I know.
You need SeqIO and write_seq, not SearchIO and write_result.

2. $hseq is a STRING which has the raw sequence letters in it.
However, the write_seq() method needs a Bio::Seq object (which has
extra details like the name and ID) not a raw string.

The example code Jay Hannah supplied in his reply looks pretty good,
you should try it.

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University


From Alicia.Amadoz at uv.es  Thu Aug  2 07:06:54 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Thu, 2 Aug 2007 09:06:54 +0200 (CEST)
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
Message-ID: <3579584634amadoz@uv.es>

Hi, thanks for your help and suggestions. I have tried the example code
of Jay Hannah and it works perfectly. But what I need to save in fasta
format is the whole sequence in the database that is similar to my query
sequence. I don't understand the difference between hit_string() and
query_string() very well: is hit_string() the whole database sequence
that is similar, or just the part of it that is aligned to my query
string?

With the previous code what I have are different sequences in length
with the same id as my query string, so I am not sure that I am doing
what I need to do. Any light on this point?

Thank you very much for your help.
Alicia

> Alicia,
> 
> > Hi, I would like to save my hit sequences from a blast result in a fasta
> > file. I am trying some things but I have problems using Bio::SearchIO
> > and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> > code:
> > # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> > "fasta");
> > my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> > => "fasta");
> > ...
> >        my $hseq = $hsp->hit_string();
> >          # $seq_out->write_seq($hseq);
> >          $seq_out->write_result($hseq);
> 
> You have encountered two common problems for BioPerl beginners:
> 
> 1. "fasta" means two different things! In SearchIO it refers to the
> output format of the "fasta" sequence alignment software. In SeqIO it
> refers to a file format that stores just sequences. Confusing, I know.
> You need SeqIO and write_seq, not SearchIO and write_result.
> 
> 2. $hseq is a STRING which has the raw sequence letters in it.
> However, the write_seq() method needs a Bio::Seq object (which has
> extra details like the name and ID) not a raw string.
> 
> The example code Jay Hannah supplied in his reply looks pretty good,
> you should try it.
> 
> -- 
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> 
> 


From xianranli78 at yahoo.com.cn  Thu Aug  2 08:56:04 2007
From: xianranli78 at yahoo.com.cn (Xianran Li)
Date: Thu, 2 Aug 2007 16:56:04 +0800
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
	<3579584634amadoz@uv.es>
Message-ID: <003701c7d4e2$f7a34bc0$ed07a8c0@BGI.LOCAL>

----- Original Message ----- 
From: "Alicia Amadoz" <Alicia.Amadoz at uv.es>
To: "Torsten Seemann" <torsten.seemann at infotech.monash.edu.au>; <bioperl-l at bioperl.org>
Cc: <jay at jays.net>
Sent: Thursday, August 02, 2007 3:06 PM
Subject: Re: [Bioperl-l] trying to save blast hit sequences to fasta file


> Hi, thanks for your help and suggestions. I have tried the example code
> of Jay Hannah and it works perfectly. But what I need to save in fasta
> format is the whole sequence in the database that is similar to my query
> sequence. I don't understand very well the difference between
> hit_string() and query_string(), are they the whole sequence that is
> similiar (about hit_string), a part of the whole sequence or just the
> part that is aligned to my query string? 

hit_string() returns the aligned portion of the subject (database) sequence, and query_string() returns the aligned portion of the query, so neither is the whole sequence. The two strings are identical unless the alignment contains mismatches and/or gaps.
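
To make this concrete, here is a small pure-Perl sketch that compares two aligned strings column by column. The two strings are made up for illustration; in a real script they would come from $hsp->query_string() and $hsp->hit_string(). The sub name compare_aligned is hypothetical, and '-' is assumed to be the gap character.

```perl
use strict;
use warnings;

# Count matches, mismatches and gap columns between two aligned strings
# of equal length ('-' is taken to be the gap character).
sub compare_aligned {
    my ($query, $hit) = @_;
    die "aligned strings differ in length\n"
        unless length($query) == length($hit);
    my %count = (match => 0, mismatch => 0, gap => 0);
    for my $i (0 .. length($query) - 1) {
        my $qc = substr($query, $i, 1);
        my $hc = substr($hit,   $i, 1);
        if    ($qc eq '-' || $hc eq '-') { $count{gap}++ }
        elsif (uc($qc) eq uc($hc))       { $count{match}++ }
        else                             { $count{mismatch}++ }
    }
    return \%count;
}

my $query_aligned = 'ATGGC-TTAGC';   # made-up example data
my $hit_aligned   = 'ATCGCATT-GC';   # made-up example data

my $c = compare_aligned($query_aligned, $hit_aligned);
printf "match=%d mismatch=%d gap=%d\n", @{$c}{qw(match mismatch gap)};

# Removing the gaps leaves only the residues of the aligned region,
# not the full-length database entry.
(my $hit_residues = $hit_aligned) =~ tr/-//d;
print "ungapped hit: $hit_residues\n";
```

If the full-length database sequence is what you need, the HSP strings are not enough; the entry has to be fetched from the database itself by its hit name or accession.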

> 
> With the previous code what I have are different sequences in length
> with the same id as my query string, so I am not sure that I am doing
> what I need to do. Any light on this point?

Did you specify the $id before 
  
my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq); 

If you didn't, then all the sequences retrieved will get the same id. The following is a simple way to avoid this problem.

my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>"fasta");
my $i = 0;
while(my $result = $blast_report->next_result()) {
   while(my $hit = $result->next_hit()) {
      while(my $hsp = $hit->next_hsp()) {
         $i++;
         my $hseq = $hsp->hit_string();
         $hseq =~ s/-//g; #### remove the gaps within the alignment
         my $id = $i; ###### specify the id
         my $hseq_obj = Bio::Seq->new(-display_id => $id, -seq => $hseq);
         $seq_out->write_seq($hseq_obj); # a SeqIO stream writes with write_seq()
      }
   }
}


Xianran 

> 
> Thank you very much for your help.
> Alicia
> 
> > Alicia,
> > 
> > > Hi, I would like to save my hit sequences from a blast result in a fasta
> > > file. I am trying some things but I have problems using Bio::SearchIO
> > > and Bio::SeqIO. Hope anyone could help me with this. Here is my current
> > > code:
> > > # my $seq_out = Bio::SeqIO->new("-file" => ">$fasfilename", "-format" =>
> > > "fasta");
> > > my $seq_out = Bio::SearchIO->new("-file" => ">$fasfilename", "-format"
> > > => "fasta");
> > > ...
> > >        my $hseq = $hsp->hit_string();
> > >          # $seq_out->write_seq($hseq);
> > >          $seq_out->write_result($hseq);
> > 
> > You have encountered two common problems for BioPerl beginners:
> > 
> > 1. "fasta" means two different things! In SearchIO it refers to the
> > output format of the "fasta" sequence alignment software. In SeqIO it
> > refers to a file format that stores just sequences. Confusing, I know.
> > You need SeqIO and write_seq, not SearchIO and write_result.
> > 
> > 2. $hseq is a STRING which has the raw sequence letters in it.
> > However, the write_seq() method needs a Bio::Seq object (which has
> > extra details like the name and ID) not a raw string.
> > 
> > The example code Jay Hannah supplied in his reply looks pretty good,
> > you should try it.
> > 
> > -- 
> > --Torsten Seemann
> > --Victorian Bioinformatics Consortium, Monash University
> > 
> > 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From neetisomaiya at gmail.com  Thu Aug  2 06:20:33 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 2 Aug 2007 11:50:33 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
Message-ID: <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>

Hi,

The script is attached with this mail.
I am using bioperl-1.4.

Regards,
Neeti.

On 8/1/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> Neeti,
>
> Only post to one list email address, namely the one I'm responding to
> and the one shown here:
>
> http://bioperl.org/mailman/listinfo/bioperl-l
>
> The others are aliases so you essentially posted three times.  As for
> your question: there was no attached script or any additional
> information (bioperl version would have also been nice), so we can't
> help you until we have something more to work with.
>
> chris
>
> On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:
>
> > I have downloaded the omim.txt file from NCBI ftp site and I am
> > running my
> > attached parser on this file, the parser run stops in between with
> > this :-
> >
> > ------------- EXCEPTION  -------------
> > MSG: a part/organism must be assigned
> > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> > STACK toplevel parse_omim_original.pl:47
> >
> > --------------------------------------
> >
> > What is the reason for this?
> > Can anyone guide me please.
> >
> > --
> > -Neeti
> > Even my blood says, B positive
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


-- 
-Neeti
Even my blood says, B positive
-------------- next part --------------
A non-text attachment was scrubbed...
Name: parse_omim_original.pl
Type: application/x-perl
Size: 5998 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070802/fbbee8db/attachment.pl>

From neetisomaiya at gmail.com  Thu Aug  2 13:00:33 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 2 Aug 2007 18:30:33 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
Message-ID: <764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com>

Also,
As per the following links we can fetch data from the genemap file as well
:-
http://search.cpan.org/~birney/bioperl-1.2.3/Bio/Phenotype/OMIM/OMIMparser.pm

But when I am trying to do so in the exact manner as given in the above
link, I get no data. As in there are OMIM ids which are present in both the
omim.txt and genemap files, and for such cases when I parse and fetch data,
data from both files should be obtained, but I aint getting it.

For eg. while running the attached script, for OMIM id 100790, I get all
data from omim.txt but the cytoposition, gene symbol etc from genemap is not
coming, though it is present in the genemap file.

Please help me find what could be going wrong.

On 8/2/07, neeti somaiya <neetisomaiya at gmail.com> wrote:
>
> Hi,
>
> The script is attached with this mail.
> I am using bioperl-1.4.
>
> Regards,
> Neeti.
>
> On 8/1/07, Chris Fields < cjfields at uiuc.edu> wrote:
> >
> > Neeti,
> >
> > Only post to one list email address, namely the one I'm responding to
> > and the one shown here:
> >
> > http://bioperl.org/mailman/listinfo/bioperl-l
> >
> > The others are aliases so you essentially posted three times.  As for
> > your question: there was no attached script or any additional
> > information (bioperl version would have also been nice), so we can't
> > help you until we have something more to work with.
> >
> > chris
> >
> > On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:
> >
> > > I have downloaded the omim.txt file from NCBI ftp site and I am
> > > running my
> > > attached parser on this file, the parser run stops in between with
> > > this :-
> > >
> > > ------------- EXCEPTION  -------------
> > > MSG: a part/organism must be assigned
> > > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> > > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> > > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> > > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> > > STACK toplevel parse_omim_original.pl:47
> > >
> > > --------------------------------------
> > >
> > > What is the reason for this?
> > > Can anyone guide me please.
> > >
> > > --
> > > -Neeti
> > > Even my blood says, B positive
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> >
>
>
> --
> -Neeti
> Even my blood says, B positive
>
>


-- 
-Neeti
Even my blood says, B positive
-------------- next part --------------
A non-text attachment was scrubbed...
Name: parse_omim_original.pl
Type: application/x-perl
Size: 8750 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070802/6bdb009c/attachment.pl>

From cjfields at uiuc.edu  Thu Aug  2 17:05:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 2 Aug 2007 12:05:55 -0500
Subject: [Bioperl-l] Fwd: nonstop repeated output from Remote_blast with xml
References: <38B65B2C-A36D-41FB-83C9-7D7B55156CCD@uiuc.edu>
Message-ID: <EF284983-9A37-4F0F-BF92-04C7804275A0@uiuc.edu>

For archiving purposes; of course I forgot to cc the list!

-c

Begin forwarded message:

> From: Chris Fields <cjfields at uiuc.edu>
> Date: August 2, 2007 12:04:59 PM CDT
> To: gyang at plantbio.uga.edu
> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
> with xml
>
> Guojun,
>
> Make sure to keep this on the mail list for archiving purposes.
>
> It could be that the RID is not being removed properly (if it isn't  
> removed then you will repeatedly retrieve your BLAST report).  The  
> new error you are seeing may be coming from whatever XML::SAX  
> backend parser is being used (XML::SAX::ExpatXS, XML::SAX::Expat,  
> etc); it doesn't look bioperl-related and there is an eval which  
> catches this stuff in SearchIO::blastxml.  Does text parsing work?
>
> Could you directly send me your script or add it to a new bug  
> report as an attachment?
>
> http://www.bioperl.org/wiki/Bugs
>
> chris
>
> On Aug 2, 2007, at 11:07 AM, Guojun Yang wrote:
>
>> Hi,Chris,
>> I installed the latest version of bioperl, in addition to the  
>> repeated output problem, there are new problems with parsing:
>>
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>  No close tag marker [Ln: 4126, Col: 0]
>>
>> ---------------------------------------------------
>>
>> Would you please kindly give me a hint on this,
>> Thanks a lot,
>> Guojun
>>
>>
>> ----- Original Message -----
>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>> To: gyang at plantbio.uga.edu
>> Cc: bioperl-l List [mailto:bioperl-l at lists.open-bio.org]
>> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
>> with xml
>>
>>
>>> Make sure to keep responses on the mail list.
>>>> You might want to run a full install, just in case.  If I remember
>>> correctly Sendu made some changes a while back in the BLAST-related
>>> modules which may be related to this.  At the very least install/
>>> upgrade all modules in Bio::Tools::Run.
>>>> chris
>>>> On Jul 31, 2007, at 9:40 AM, Guojun Yang wrote:
>>>>> Thanks, Chris,
>>>> But when I replaced the old RemoteBlast.pm with the new one, I got
>>>> "can't locate the object method "retrieve_parameter"". Does this
>>>> mean I need to install something else?
>>>> Guojun
>>>>
>>>> ----- Original Message -----
>>>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>>>> To: gyang at plantbio.uga.edu
>>>> Cc: bioperl-l at bioperl.org
>>>> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast
>>>> with xml
>>>>
>>>>
>>>>>> On Jul 30, 2007, at 3:58 PM, Guojun Yang wrote:
>>>>>>> I am running remoteblast and using readmethod "xml", I  
>>>>>>> noticed that
>>>>>> it is printing the output repeatedly nonstop. It's like in a  
>>>>>> loop.
>>>>>> Did anybody notice this before? Can anybody help me getting  
>>>>>> out of
>>>>>> this?
>>>>>> Thanks a lot,
>>>>>>
>>>>>>
>>>>>> Guojun Yang
>>>>>> University of Georgia
>>>>>> Not seeing that using bioperl-live; you may need to update
>>>>> RemoteBlast.pm as this sounds similar to an issue that popped up
>>>>> earlier in the spring.
>>>>>> chris
>>>>>
>>>> Christopher Fields
>>> Postdoctoral Researcher
>>> Lab of Dr. Robert Switzer
>>> Dept of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>>>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug  2 17:51:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 2 Aug 2007 12:51:27 -0500
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
	<764978cf0708020600v551b917ck9acdd443268b85fa@mail.gmail.com>
Message-ID: <921F31D6-3CA9-483A-8AFF-B3555E9768C4@uiuc.edu>

Neeti,

The genemap wasn't loaded in all cases; don't know what the reasoning  
for it was, but it is fixed in CVS now  
(Bio::Phenotype::OMIM::OMIMparser, specifically).  I would recommend  
that you install a full upgrade to at least bioperl 1.5.2 before  
using this; I can't guarantee it will work with bioperl 1.4.
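For reference, a minimal sketch of reading the genemap-derived fields once the parser is given both files. It follows the usage shown in the Bio::Phenotype::OMIM::OMIMparser documentation; the sub name print_genemap_fields and the tab-separated output are my own choices, and I am assuming each CytoPosition object reports its band through value().

```perl
use strict;
use warnings;
use Bio::Phenotype::OMIM::OMIMparser;

# Print MIM number, gene symbols and cytogenetic positions for each entry.
# $omimtext and $genemap are paths to the omim.txt and genemap files.
sub print_genemap_fields {
    my ($omimtext, $genemap) = @_;
    my $parser = Bio::Phenotype::OMIM::OMIMparser->new(
        -omimtext => $omimtext,
        -genemap  => $genemap,
    );
    while (my $entry = $parser->next_phenotype) {
        my @symbols   = $entry->each_gene_symbol;      # from genemap
        my @positions = grep { defined }
                        map  { $_->value } $entry->each_CytoPosition;
        print join("\t",
                   $entry->MIM_number,
                   join(',', @symbols),
                   join(',', @positions)), "\n";
    }
    return;
}

# Example call (paths are placeholders):
# print_genemap_fields('/path/to/omim.txt', '/path/to/genemap');
```

If the last two columns stay empty for an entry that is present in genemap (for example 100790), the genemap data is not being attached to that entry.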

chris

On Aug 2, 2007, at 8:00 AM, neeti somaiya wrote:

> Also,
> As per the following links we can fetch data from the genemap file  
> as well
> :-
> http://search.cpan.org/~birney/bioperl-1.2.3/Bio/Phenotype/OMIM/ 
> OMIMparser.pm
>
> But when I am trying to do so in the exact manner as given in the  
> above
> link, I get no data. As in there are OMIM ids which are present in  
> both the
> omim.txt and genemap files, and for such cases when I parse and  
> fetch data,
> data from both files should be obtained, but I aint getting it.
>
> For eg. while running the attached script, for OMIM id 100790, I  
> get all
> data from omim.txt but the cytoposition, gene symbol etc from  
> genemap is not
> coming, though it is present in the genemap file.
>
> Please help me find what could be going wrong.
>
> On 8/2/07, neeti somaiya <neetisomaiya at gmail.com> wrote:
>>
>> Hi,
>>
>> The script is attached with this mail.
>> I am using bioperl-1.4.
>>
>> Regards,
>> Neeti.
>>
>> On 8/1/07, Chris Fields < cjfields at uiuc.edu> wrote:
>>>
>>> Neeti,
>>>
>>> Only post to one list email address, namely the one I'm  
>>> responding to
>>> and the one shown here:
>>>
>>> http://bioperl.org/mailman/listinfo/bioperl-l
>>>
>>> The others are aliases so you essentially posted three times.  As  
>>> for
>>> your question: there was no attached script or any additional
>>> information (bioperl version would have also been nice), so we can't
>>> help you until we have something more to work with.
>>>
>>> chris
>>>
>>> On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:
>>>
>>>> I have downloaded the omim.txt file from NCBI ftp site and I am
>>>> running my
>>>> attached parser on this file, the parser run stops in between with
>>>> this :-
>>>>
>>>> ------------- EXCEPTION  -------------
>>>> MSG: a part/organism must be assigned
>>>> STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
>>>> STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
>>>> STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
>>>> STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
>>>> /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
>>>> STACK toplevel parse_omim_original.pl:47
>>>>
>>>> --------------------------------------
>>>>
>>>> What is the reason for this?
>>>> Can anyone guide me please.
>>>>
>>>> --
>>>> -Neeti
>>>> Even my blood says, B positive
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>> Christopher Fields
>>> Postdoctoral Researcher
>>> Lab of Dr. Robert Switzer
>>> Dept of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>
>>>
>>>
>>>
>>
>>
>> --
>> -Neeti
>> Even my blood says, B positive
>>
>>
>
>
> -- 
> -Neeti
> Even my blood says, B positive
> <parse_omim_original.pl>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug  2 18:16:56 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 2 Aug 2007 13:16:56 -0500
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
	<764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com>
Message-ID: <9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu>

Neeti,

Keep this on the list please.  I am unable to reproduce this using  
your script with or without using the optional genemap file.  You  
really should upgrade bioperl to 1.5.2 and try the fix first; this is  
something that may have been fixed post-bioperl 1.4.

chris

On Aug 2, 2007, at 12:57 PM, neeti somaiya wrote:

> Waiting for your reply on the exception I had mentioned in my first  
> mail.
>
> Thanks.
>
> ---------- Forwarded message ----------
> From: neeti somaiya < neetisomaiya at gmail.com>
> Date: Aug 2, 2007 11:50 AM
> Subject: Re: [Bioperl-l] URGENT : Problem in OMIM parser
> To: bioperl-l at lists.open-bio.org
>
> Hi,
>
> The script is attached with this mail.
> I am using bioperl-1.4.
>
> Regards,
> Neeti.
>
>
> On 8/1/07, Chris Fields < cjfields at uiuc.edu> wrote:Neeti,
>
> Only post to one list email address, namely the one I'm responding to
> and the one shown here:
>
> http://bioperl.org/mailman/listinfo/bioperl-l
>
> The others are aliases so you essentially posted three times.  As for
> your question: there was no attached script or any additional
> information (bioperl version would have also been nice), so we can't
> help you until we have something more to work with.
>
> chris
>
> On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:
>
> > I have downloaded the omim.txt file from NCBI ftp site and I am
> > running my
> > attached parser on this file, the parser run stops in between with
> > this :-
> >
> > ------------- EXCEPTION  -------------
> > MSG: a part/organism must be assigned
> > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> > STACK toplevel parse_omim_original.pl:47
> >
> > --------------------------------------
> >
> > What is the reason for this?
> > Can anyone guide me please.
> >
> > --
> > -Neeti
> > Even my blood says, B positive
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>
>
>
> -- 
> -Neeti
> Even my blood says, B positive
>
>
>
> -- 
> -Neeti
> Even my blood says, B positive
> <parse_omim_original.pl>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From torsten.seemann at infotech.monash.edu.au  Fri Aug  3 01:03:36 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Fri, 3 Aug 2007 11:03:36 +1000
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <3579584634amadoz@uv.es>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>
	<3579584634amadoz@uv.es>
Message-ID: <a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>

Alicia,

> Hi, thanks for your help and suggestions. I have tried the example code
> of Jay Hannah and it works perfectly. But what I need to save in fasta
> format is the whole sequence in the database that is similar to my query
> sequence.

Unfortunately the hit_string is only that part of the sequence in the
database that was similar enough to your query sequence. The BLAST
report does not have the whole hit sequence in it, only the locally
aligned part. SearchIO can only give you what it can get from the
BLAST report.

You will need to record the IDs of the database sequences you are
interested in, and write extra code to retrieve the WHOLE hit sequence
from your database.
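A rough sketch of that "extra code", assuming the BLAST database was built from a FASTA file you still have and that the hit names in the report match the IDs in that file. The sub names hit_ids() and write_whole_sequences() are made up for this example; the BioPerl classes used are Bio::SearchIO, Bio::DB::Fasta and Bio::SeqIO.

```perl
use strict;
use warnings;
use Bio::SearchIO;
use Bio::DB::Fasta;
use Bio::SeqIO;

# Collect the name of every hit in a BLAST report, once each, in report order.
sub hit_ids {
    my ($report_file) = @_;
    my $in = Bio::SearchIO->new(-format => 'blast', -file => $report_file);
    my (%seen, @ids);
    while (my $result = $in->next_result) {
        while (my $hit = $result->next_hit) {
            my $id = $hit->name;
            push @ids, $id unless $seen{$id}++;
        }
    }
    return @ids;
}

# Look up each ID in the FASTA file and write the full-length records.
# Returns the number of sequences actually written.
sub write_whole_sequences {
    my ($db_fasta, $out_file, @ids) = @_;
    my $db  = Bio::DB::Fasta->new($db_fasta);   # builds an index on first use
    my $out = Bio::SeqIO->new(-format => 'fasta', -file => ">$out_file");
    my $written = 0;
    for my $id (@ids) {
        my $seq = $db->get_Seq_by_id($id);
        if (!$seq) {
            warn "no sequence named '$id' in $db_fasta\n";
            next;
        }
        $out->write_seq($seq);
        $written++;
    }
    $out->close;
    return $written;
}

# Example call (file names are placeholders):
# write_whole_sequences('mydb.fasta', 'hits.fasta', hit_ids('report.blast'));
```

Bio::DB::Fasta writes an index file next to the FASTA file, so the directory has to be writable.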

--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University


From neetisomaiya at gmail.com  Fri Aug  3 05:46:32 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 3 Aug 2007 11:16:32 +0530
Subject: [Bioperl-l] URGENT : Problem in OMIM parser
In-Reply-To: <9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu>
References: <764978cf0708010428q5b51d27dw13f73dc5ca8d6dc9@mail.gmail.com>
	<0458A649-AD66-4495-8E25-78D556D51AD9@uiuc.edu>
	<764978cf0708012320v1f30c7a7tfc3a2e524b72093@mail.gmail.com>
	<764978cf0708021057g435539d2yd7168274589ec55f@mail.gmail.com>
	<9D5F428F-D091-4815-A438-B3357D88212C@uiuc.edu>
Message-ID: <764978cf0708022246v98abed6ue41233f6b27c5674@mail.gmail.com>

Hi,

Thanks a lot.
The exception is not coming after upgrade to bioperl-1.5.2
But the genemap data is still a problem.

You had mentioned that I should take Bio::Phenotype::OMIM::OMIMparser,
specifically from cvs. Where exactly can I get it?

Thanks,
Neeti.

On 8/2/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
> Neeti,
>
> Keep this on the list please.  I am unable to reproduce this using
> your script with or without using the optional genemap file.  You
> really should upgrade bioperl to 1.5.2 and try the fix first; this is
> something that may have been fixed post-bioperl 1.4.
>
> chris
>
> On Aug 2, 2007, at 12:57 PM, neeti somaiya wrote:
>
> > Waiting for your reply on the exception I had mentioned in my first
> > mail.
> >
> > Thanks.
> >
> > ---------- Forwarded message ----------
> > From: neeti somaiya < neetisomaiya at gmail.com>
> > Date: Aug 2, 2007 11:50 AM
> > Subject: Re: [Bioperl-l] URGENT : Problem in OMIM parser
> > To: bioperl-l at lists.open-bio.org
> >
> > Hi,
> >
> > The script is attached with this mail.
> > I am using bioperl-1.4.
> >
> > Regards,
> > Neeti.
> >
> >
> > On 8/1/07, Chris Fields < cjfields at uiuc.edu> wrote:Neeti,
> >
> > Only post to one list email address, namely the one I'm responding to
> > and the one shown here:
> >
> > http://bioperl.org/mailman/listinfo/bioperl-l
> >
> > The others are aliases so you essentially posted three times.  As for
> > your question: there was no attached script or any additional
> > information (bioperl version would have also been nice), so we can't
> > help you until we have something more to work with.
> >
> > chris
> >
> > On Aug 1, 2007, at 6:28 AM, neeti somaiya wrote:
> >
> > > I have downloaded the omim.txt file from NCBI ftp site and I am
> > > running my
> > > attached parser on this file, the parser run stops in between with
> > > this :-
> > >
> > > ------------- EXCEPTION  -------------
> > > MSG: a part/organism must be assigned
> > > STACK Bio::Phenotype::OMIM::OMIMentry::add_clinical_symptoms
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMentry.pm:566
> > > STACK Bio::Phenotype::OMIM::OMIMparser::_finer_parse_symptoms
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:555
> > > STACK Bio::Phenotype::OMIM::OMIMparser::_createOMIMentry
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:536
> > > STACK Bio::Phenotype::OMIM::OMIMparser::next_phenotype
> > > /usr/lib/perl5/site_perl/5.8.8/Bio/Phenotype/OMIM/OMIMparser.pm:272
> > > STACK toplevel parse_omim_original.pl:47
> > >
> > > --------------------------------------
> > >
> > > What is the reason for this?
> > > Can anyone guide me please.
> > >
> > > --
> > > -Neeti
> > > Even my blood says, B positive
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> >
> >
> >
> > --
> > -Neeti
> > Even my blood says, B positive
> >
> >
> >
> > --
> > -Neeti
> > Even my blood says, B positive
> > <parse_omim_original.pl>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


-- 
-Neeti
Even my blood says, B positive


From jay at jays.net  Fri Aug  3 14:23:11 2007
From: jay at jays.net (Jay Hannah)
Date: Fri, 03 Aug 2007 09:23:11 -0500
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>	<3579584634amadoz@uv.es>
	<a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>
Message-ID: <46B33A4F.2010403@jays.net>

Torsten Seemann wrote:
>> Hi, thanks for your help and suggestions. I have tried the example code
>> of Jay Hannah and it works perfectly. But what I need to save in fasta
>> format is the whole sequence in the database that is similar to my query
>> sequence.
>>     
>
> Unfortunately the hit_string is only that part of the sequence in the
> database that was similar enough to your query sequence. The BLAST
> report does not have the whole hit sequence in it, only the locally
> aligned part. SearchIO can only give you what it can get from the
> BLAST report.
>
> You will need to record the IDs of the database sequences you are
> interested in, and write extra code to retrieve the WHOLE hit sequence
> from your database.
>   
This probably won't help, but my (extremely poorly documented) 
"SeqLab.net" project

   http://seqlab.net

is a framework that sits on top of BioPerl. The current cross_blast() 
stuff (http://seqlab.net/pods2html/tutorial.html) does this:

   GenBank -> FASTA -> formatdb -> "stand alone" NCBI BLAST -> reports

When the reports run they have simultaneous access to both the original 
Bio::Seq objects from the GenBank file and the Bio::SearchIO objects 
from the BLAST results, so it can kick out reports that understand the 
relationships between (and details of) the original sequences and HSPs 
simultaneously...

If you get stuck trying to do what Torsten suggests and have questions 
about SeqLab.net you could open a ticket with my group

   http://clab.ist.unomaha.edu/CLAB/index.php/RT

and I'll try to help.

Cheers,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From mbasu at mail.nih.gov  Fri Aug  3 18:55:57 2007
From: mbasu at mail.nih.gov (Malay)
Date: Fri, 03 Aug 2007 14:55:57 -0400
Subject: [Bioperl-l] trying to save blast hit sequences to fasta file
In-Reply-To: <46B33A4F.2010403@jays.net>
References: <a79f6a4b0708011750r6ec60098occe3d2a24f9ad66f@mail.gmail.com>	<3579584634amadoz@uv.es>	<a79f6a4b0708021803o2f998117i9817ae94d42b884e@mail.gmail.com>
	<46B33A4F.2010403@jays.net>
Message-ID: <46B37A3D.4070606@mail.nih.gov>

Jay Hannah wrote:
> Torsten Seemann wrote:
>>> Hi, thanks for your help and suggestions. I have tried the example code
>>> of Jay Hannah and it works perfectly. But what I need to save in fasta
>>> format is the whole sequence in the database that is similar to my query
>>> sequence.
>>>     
>> Unfortunately the hit_string is only that part of the sequence in the
>> database that was similar enough to your query sequence. The BLAST
>> report does not have the whole hit sequence in it, only the locally
>> aligned part. SearchIO can only give you what it can get from the
>> BLAST report.
>>
>> You will need to record the IDs of the database sequences you are
>> interested in, and write extra code to retrieve the WHOLE hit sequence
>> from your database.

I am not sure whether it has already been suggested, but you can 
retrieve the full sequence from any BLAST database using "fastacmd", 
which is part of the NCBI toolbox. Parse the "description" string from 
the BLAST report and run:

fastacmd -d <database file> -s <description>

where, the argument of -s can be any unique string for the database.
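If you want to call it from a Perl script, something along these lines should do. It is only a sketch: it assumes fastacmd is on your PATH, and the sub names fastacmd_args() and fetch_with_fastacmd() are invented here. Only the -d and -s options mentioned above are used.

```perl
use strict;
use warnings;

# Build the argument list for: fastacmd -d <database> -s <id>
sub fastacmd_args {
    my ($database, $id) = @_;
    return ('fastacmd', '-d', $database, '-s', $id);
}

# Run fastacmd and return its FASTA output as one string.
# The list form of open() bypasses the shell, so IDs that contain
# characters such as '|' need no quoting.
sub fetch_with_fastacmd {
    my ($database, $id) = @_;
    open(my $fh, '-|', fastacmd_args($database, $id))
        or die "cannot run fastacmd: $!\n";
    my $fasta = do { local $/; <$fh> };
    close($fh) or die "fastacmd failed for '$id'\n";
    return $fasta;
}

# Example call (database name and ID are placeholders):
# print fetch_with_fastacmd('mydb', 'gi|12345');
```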

-Malay


From cjfields at uiuc.edu  Mon Aug  6 17:49:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 6 Aug 2007 12:49:08 -0500
Subject: [Bioperl-l] Fwd: nonstop repeated output from Remote_blast with xml
References: <1FE846F1-CB20-41FD-929D-8D14E5695B59@uiuc.edu>
Message-ID: <B97BD1F9-05FE-4225-810F-5EA10AB2728B@uiuc.edu>

Wasn't paying attention! Forwarding this to the mail list in case  
anyone wanted the answer...

chris

Begin forwarded message:

> From: Chris Fields <cjfields at uiuc.edu>
> Date: August 6, 2007 12:10:37 PM CDT
> To: gyang at plantbio.uga.edu
> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
> with xml
>
> Guojun,
>
> Sorry about the long wait on this.  At this time RemoteBlast  
> doesn't automatically set the retrieval header to return XML when  
> setting the -reporttype parameter to 'xml' or 'blastxml'.  The  
> default is text output, so you are retrieving regular text BLAST  
> reports instead of XML, hence the reported XML parser failure (BTW,  
> you can see the plain text being returned in the debugging  
> output).  I'll look into a fix for that.
>
> In the meantime, you can do this manually by setting the following  
> key prior to submitting the BLAST run:
>
> $Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML';
>
> When I run your example with the above line added it works fine.   
> As an additional note, the CVS version of Bio::SearchIO::blastxml  
> now supports newer versions of XML::SAX::Expat; the problem there  
> was a bug in XML::SAX::Expat that killed parsing.
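Roughly how the workaround fits into a RemoteBlast script, as a sketch only. The program, database and E-value below are placeholder choices, -readmethod 'xml' is taken from the original report in this thread, and the sub name remote_blast_xml() is invented. Calling the sub contacts NCBI.

```perl
use strict;
use warnings;
use Bio::Tools::Run::RemoteBlast;

# Workaround from the message above: ask NCBI for XML when retrieving.
$Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML';

# Submit one query and return the parsed result objects.
# $query may be a FASTA file name or a Bio::Seq object.
sub remote_blast_xml {
    my ($query) = @_;
    my $factory = Bio::Tools::Run::RemoteBlast->new(
        -prog       => 'blastn',   # placeholder choice
        -data       => 'nr',       # placeholder choice
        -expect     => '1e-3',     # placeholder choice
        -readmethod => 'xml',
    );
    $factory->submit_blast($query);

    my @results;
    while (my @rids = $factory->each_rid) {
        for my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if (!ref $rc) {
                # A negative code means the request failed, so drop it;
                # otherwise the report is not ready yet and we poll again.
                $factory->remove_rid($rid) if $rc < 0;
                sleep 5;
                next;
            }
            while (my $result = $rc->next_result) {
                push @results, $result;
            }
            # Without this the same report is retrieved again and again.
            $factory->remove_rid($rid);
        }
    }
    return @results;
}
```

The remove_rid() call after a successful retrieval is what keeps the loop from printing the same report repeatedly.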
>
> Additional rant before I go back to work (you can skip this if  
> needed):  RemoteBlast is one of the most used modules in BioPerl,  
> but it is also the most problematic as NCBI keeps changing things  
> on their end (BLAST text output, prompts when returning RIDs,  
> etc).  It frankly isn't as well-maintained as we would like; this  
> is partly due to plans we have (but unfortunately haven't acted  
> upon) to merge RemoteBlast/StandAloneBlast so they have a similar  
> API and can be used for any BLAST program, including netblast.  If  
> someone wants to take this on at some point then they are more than  
> welcome!
>
> chris
>
> On Aug 3, 2007, at 10:08 AM, Guojun Yang wrote:
>
>> Thanks, Chris,
>> Attached are my script and the query file. I suspected that we
>> need to add "remove RID..." in the code. I tried removing the
>> RID at the end of the parsing code, but it seemed to be removed
>> even before the output was processed. I then installed
>> XML::SAX::Expat, and the error became "XML::SAX::Expat is no longer
>> supported...", so I installed ExpatXS, and the error message became:
>>
>> -------------------- WARNING ---------------------
>> MSG: error in parsing a report:
>>  no element found at line 4126, column 1, byte 186628 at /usr/lib/ 
>> perl5/site_perl/5.8.3/Bio/SearchIO/blastxml.pm line 304
>>
>>
>> Would you please try the script with the query file with the  
>> following input parameters, to see what happens on your machine (I  
>> want to make sure there is no installation problem on my machine).  
>> The search subroutine is where blast is performed; I did not
>> include a "remove RID" call there. Thanks again!
>>
>> master:/home/guojun # perl makcgi07.txt
>> Query file name:
>> kiddo.txt
>> Select a function: 1.member;2.RES; 3, long; 4.Anchor; 5.Associator.
>> 1
>> Type in the name of an organism, e.g. Oryza sativa.
>> Oryza sativa
>> Type in the organism to search for RES:
>> Your E_value:
>> 0.001
>> Size limit for ancestor element:
>> 4000
>> Flanking size for retrieved members:
>> 50
>> Tolerance for end mismatch:
>> 0
>>
>>
>>
>> Guojun
>>
>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>> To: gyang at plantbio.uga.edu
>> Sent: Thu, 02 Aug 2007 13:04:59 -0400
>> Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast  
>> with xml
>>
>> Guojun,
>>
>> Make sure to keep this on the mail list for archiving purposes.
>>
>> It could be that the RID is not being removed properly (if it isn't
>> removed then you will repeatedly retrieve your BLAST report). The
>> new error you are seeing may be coming from whatever XML::SAX backend
>> parser is being used (XML::SAX::ExpatXS, XML::SAX::Expat, etc); it
>> doesn't look bioperl-related and there is an eval which catches this
>> stuff in SearchIO::blastxml. Does text parsing work?
>>
>> Could you directly send me your script or add it to a new bug report
>> as an attachment?
>>
>> http://www.bioperl.org/wiki/Bugs
>>
>> chris
>>
>> On Aug 2, 2007, at 11:07 AM, Guojun Yang wrote:
>>
>> > Hi,Chris,
>> > I installed the latest version of bioperl, in addition to the
>> > repeated output problem, there are new problems with parsing:
>> >
>> >
>> > -------------------- WARNING ---------------------
>> > MSG: error in parsing a report:
>> > No close tag marker [Ln: 4126, Col: 0]
>> >
>> > ---------------------------------------------------
>> >
>> > Would you please kindly give me a hint on this,
>> > Thanks a lot,
>> > Guojun
>> >
>> >
>> > ----- Original Message -----
>> > From: Chris Fields [mailto:cjfields at uiuc.edu]
>> > To: gyang at plantbio.uga.edu
>> > Cc: bioperl-l List [mailto:bioperl-l at lists.open-bio.org]
>> > Subject: Re: [Bioperl-l] nonstop repeated output from Remote_blast
>> > with xml
>> >
>> >
>> >> Make sure to keep responses on the mail list.
>> >>
>> >> You might want to run a full install, just in case. If I remember
>> >> correctly Sendu made some changes a while back in the BLAST-related
>> >> modules which may be related to this. At the very least install/upgrade
>> >> all modules in Bio::Tools::Run.
>> >>> chris
>> >>> On Jul 31, 2007, at 9:40 AM, Guojun Yang wrote:
>> >>>> Thanks, Chris,
>> >>> But when I replaced the old RemoteBlast.pm with the new one, I got
>> >>> "can't locate the object method "retrieve_parameter"". Does this
>> >>> mean I need to install something else?
>> >>> Guojun
>> >>>
>> >>> ----- Original Message -----
>> >>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>> >>> To: gyang at plantbio.uga.edu
>> >>> Cc: bioperl-l at bioperl.org
>> >>> Subject: Re: [Bioperl-l] nonstop repeated output from  
>> Remote_blast
>> >>> with xml
>> >>>
>> >>>
>> >>>>> On Jul 30, 2007, at 3:58 PM, Guojun Yang wrote:
>> >>>>>> I am running remoteblast and using readmethod "xml", I noticed
>> >>>>>> that
>> >>>>> it is printing the output repeatedly nonstop. It's like in a  
>> loop.
>> >>>>> Did anybody notice this before? Can anybody help me getting  
>> out of
>> >>>>> this?
>> >>>>> Thanks a lot,
>> >>>>>
>> >>>>>
>> >>>>> Guojun Yang
>> >>>>> University of Georgia
>> >>>>> Not seeing that using bioperl-live; you may need to update
>> >>>> RemoteBlast.pm as this sounds similar to an issue that popped up
>> >>>> earlier in the spring.
>> >>>>> chris
>> >>>>
>> >>> Christopher Fields
>> >> Postdoctoral Researcher
>> >> Lab of Dr. Robert Switzer
>> >> Dept of Biochemistry
>> >> University of Illinois Urbana-Champaign
>> >>>>>
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>>
>>
>> <makcgi07.txt>
>> <kiddo.txt>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
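
The forwarded reply above traces the XML parser failure to plain-text BLAST reports coming back where XML was expected. A quick check on the retrieved report before parsing can make that visible. The following is only a sketch in core Perl: report_format and both sample strings are made up for illustration and are not part of BioPerl.

```perl
use strict;
use warnings;

# Guess the format of a retrieved report from its first non-blank line.
# Hypothetical helper, not a BioPerl function.
sub report_format {
    my ($text) = @_;
    return 'empty' unless defined $text && $text =~ /\S/;
    my ($first) = $text =~ /^\s*(\S[^\n]*)/;
    return 'xml' if $first =~ /^<\?xml\b/ || $first =~ /^<BlastOutput\b/;
    return 'text';
}

# Made-up report beginnings, one per format.
print report_format(qq{<?xml version="1.0"?>\n<BlastOutput>\n}), "\n";
print report_format("BLASTN 2.2.16 [Mar-25-2007]\n\nQuery= test\n"), "\n";
```

If such a check reports 'text' although 'xml' or 'blastxml' was requested, the RETRIEVALHEADER line quoted in the message is the workaround to try.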


From Alicia.Amadoz at uv.es  Tue Aug  7 08:20:12 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Tue, 7 Aug 2007 10:20:12 +0200 (CEST)
Subject: [Bioperl-l] error using standaloneblast through webserver, part II
Message-ID: <1387114447amadoz@uv.es>

Hi again, I'm trying to run a BioPerl script on Linux with
StandAloneBlast from a web server, but I now have another error. It is the
following:

[blastall] WARNING: Unable to open outfile_allseq.nin
[blastall] WARNING: 101: Unable to open outfile_allseq.nin

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: blastall call crashed: 256 /usr/local/blast-2.2.16/bin/blastall -d
 "/outfile_allseq"  -e  10  -i 
/tmp//alicia_2007_07_20/result_search_alicia_12_03_40.fasta  -o 
/tmp//alicia_2007_08_07/101_result_Local_Blast_alicia_09_56_47.out  -p 
blastn

My perl code is the following:

my $blastdatadir = $ARGV[9]; # here the value of the variable is OK

BEGIN {
	$ENV{PATH} .= ':/usr/local/blast-2.2.16/bin'; # path where the blastall binary is located
	$ENV{BLASTDIR} = '/usr/local/blast-2.2.16/bin'; # path where the blastall binary is located
	$ENV{BLASTDATADIR} = $blastdatadir; # path where formatted local databases are located; here the value is empty
}

I have tried without BEGIN { }, so that the $ENV variable has the correct
value of $blastdatadir, but I get the same error. I have checked that
formatdb was run and all the files are correct.

Any idea or help to solve this problem? 

Thanks in advance. Regards,
Alicia
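
A note on the "value is empty" observation in the code above: BEGIN blocks run while the script is being compiled, before an ordinary assignment such as "my $blastdatadir = $ARGV[9];" has executed. The sketch below shows this ordering in core Perl. It uses $ARGV[0] and a made-up directory instead of the poster's $ARGV[9], and it explains only the empty value, not the blastall crash itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real argument, filled in at compile time
# only so that this sketch runs without command-line arguments.
BEGIN { @ARGV = ('/hypothetical/blast/db') unless @ARGV }

my $blastdatadir = $ARGV[0];   # this assignment happens at run time

my $value_in_begin;
BEGIN {
    # Runs during compilation, before the assignment above.
    $value_in_begin = defined $blastdatadir ? $blastdatadir : '';
}

# Reading the argument inside BEGIN itself avoids the ordering problem.
BEGIN { $ENV{BLASTDATADIR} = $ARGV[0] }

print "seen inside BEGIN: '$value_in_begin'\n";
print "BLASTDATADIR:      '$ENV{BLASTDATADIR}'\n";
```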


From mheusel at gmail.com  Tue Aug  7 08:45:33 2007
From: mheusel at gmail.com (Martin Heusel)
Date: Tue, 7 Aug 2007 10:45:33 +0200
Subject: [Bioperl-l] error using standaloneblast through webserver,
	part II
In-Reply-To: <1387114447amadoz@uv.es>
References: <1387114447amadoz@uv.es>
Message-ID: <6127fc200708070145keb750acycce8a43edd0f724d@mail.gmail.com>

> MSG: blastall call crashed: 256 /usr/local/blast-2.2.16/bin/blastall -d
>  "/outfile_allseq"  -e  10  -i

I'm not familiar with all this, but it seems your script tries to
write in the system's root directory /

-d "/outfile_allseq"

that is normally not writable for normal users

is this the problem?

cu

Martin

-- 
+ openid: http://mhe.myopenid.com/
+ gpg   : http://user.cs.tu-berlin.de/~mhe/pub/martin.gpg
+ gpg fp: 4844 71B5 B4E4 3892 69CA  6EA5 6598 61BE 0021 94A2


From Alicia.Amadoz at uv.es  Tue Aug  7 11:08:12 2007
From: Alicia.Amadoz at uv.es (Alicia Amadoz)
Date: Tue, 7 Aug 2007 13:08:12 +0200 (CEST)
Subject: [Bioperl-l] error using standaloneblast through webserver,
	part II
In-Reply-To: <1387114447amadoz@uv.es>
References: <1387114447amadoz@uv.es>
Message-ID: <5825345446amadoz@uv.es>

Hi, I thought that setting $ENV{BLASTDATADIR} was enough for
StandAloneBlast to find the database. I have changed it, setting the
-database option of the params to path_to_database+name_of_database, and it
works OK.

Thanks for your help. Regards,
Alicia
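
The fix described above, passing directory plus database name, can be sketched in core Perl as follows. This is an illustration only: blast_db_path is a made-up helper, and the three extensions are those formatdb writes for a single-volume nucleotide database (the failing file in the error was outfile_allseq.nin); protein or multi-volume databases use other files.

```perl
use strict;
use warnings;
use File::Spec;

# Returns the full database path to hand to -database, or nothing when the
# nucleotide index files written by formatdb are not all present.
# Hypothetical helper, not part of BioPerl.
sub blast_db_path {
    my ($dir, $name) = @_;
    my $path = File::Spec->catfile($dir, $name);
    for my $ext (qw(nhr nin nsq)) {
        return unless -e "$path.$ext";
    }
    return $path;
}

# The directory here is made up; the database name is the one from the error.
my $db = blast_db_path('/hypothetical/blastdb', 'outfile_allseq');
print defined $db ? "use -database => '$db'\n" : "index files not found\n";
```

Checking for the index files before calling blastall turns a crash inside the BLAST run into an early, readable message.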


From jason at bioperl.org  Wed Aug  8 19:16:07 2007
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 8 Aug 2007 14:16:07 -0500
Subject: [Bioperl-l] Fwd: Question regarding Bio::GenBank module
References: <7a93dad10708081148w74dfede3sd05799a651ebcb80@mail.gmail.com>
Message-ID: <24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>

Young -
I'm forwarding to the list for more help.

Begin forwarded message:

> From: "Young Song" <youngcsong at gmail.com>
> Date: August 8, 2007 1:48:29 PM CDT
> To: jason at bioperl.org
> Subject: Question regarding Bio::GenBank module
>
> Hello,
>
>    I am currently located in Vancouver, Canada, and I have a question
> about the Bio::DB::GenBank module for BioPerl.  I read in the online
> documentation for the module
> (http://search.cpan.org/dist/bioperl/Bio/DB/GenBank.pm) that we are not
> supposed to spam NCBI with multiple requests, which led me to think
> about the script that I wrote.  I am trying to extract some information
> based on the FASTA protein files located in NCBI's database.  The script
> reads each '.faa' (FASTA protein) file, takes the 'gi' ID of each
> sequence, and extracts several pieces of information, which look like the
> following output (please note that there are a lot more gi's than I am
> showing you right now):
>
> 10954456
> accesstion number: NP_047185.1
> dbsource: GenBank: NC_001911.1
> NP_047185.1
> starting pos. at genomic seq: 1488
> ending pos. at genomic seq: 1991
> strand: +
> description: putative membrane-associated protein
> organism: Buchnera aphidicola
> MERIIEKAIYASRWLMFPVYVGLSFGFILLTLKFFQQIVFIIPDILAMSESGLVLVVLSLIDIALVGGLL 
> VMVMFLGYENFISKMDIQDNEKRLGWMGTMDVNSIKNKVASSIVAISSVHLLRLFMEAEKILDDKIMLCV 
> IIHLTFVLSAFGMAYIDKMSKKKHVLH
> ************************************************
> 10954457
> accesstion number: NP_047186.1
> dbsource: GenBank: NC_001911.1
> NP_047186.1
> starting pos. at genomic seq: 2158
> ending pos. at genomic seq: 2913
> strand: +
> description: putative replication-associated protein
> organism: Buchnera aphidicola
> MPRKNYIYNPKPVFNPPKNKRKISTFICYAMKKASEIDVARSNLNYTLLLIDPKTGNILPRFRRLNEHRA 
> CAMRAIVLAMLYYFDIHSNLVEASIEKLADECGLSTFSDSGNKSITRVSRLINDFLEPMGFVRCKKIKRK 
> FVSNYIPKKIFLTPMFFMLFNISQSKINRYLFKSKKMSQNLKITEKKIFISFSDIKVMSRLDEKSIRKKI 
> LNALINYYTASELTKIGPKGLKKRIDIEYNNLCKLFKKIKK
>
>
>
>   Because there are a lot of sequences I am dealing with here, I am a
> little bit worried that I may be causing harm to the NCBI server.  I just
> need to know if this is the right approach to take, or if there is another
> solution (I am a little bit confused about what you mean by "multiple
> requests" in the documentation).
> Your reply would be very much appreciated.  Thank you in advance.
>
>   Sincerely,
>
>      Young C. Song

--
Jason Stajich
jason at bioperl.org


From cjfields at uiuc.edu  Wed Aug  8 19:41:34 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 8 Aug 2007 14:41:34 -0500
Subject: [Bioperl-l] Fwd: Question regarding Bio::GenBank module
In-Reply-To: <24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>
References: <7a93dad10708081148w74dfede3sd05799a651ebcb80@mail.gmail.com>
	<24F7DCFE-7047-43BA-BD92-E2238C05DAE1@bioperl.org>
Message-ID: <FD7D1694-604A-4C8B-AC47-B31F306EA5B0@uiuc.edu>

NCBI eUtils (which Bio::DB::GenBank uses to get sequence data) has a  
list of user requirements:

http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements

The most important one is the 3 second delay between requests, but
the module already implements that policy, so there isn't a real issue
unless you deliberately mess with that setting.  NCBI has been known
to block IPs which don't follow that particular rule.  Also, if you
are planning on making hundreds of requests, you should consider running
the script during low-traffic times as indicated in the above link.

chris
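
For code that talks to NCBI without going through the module (which, as noted above, already enforces the delay), the same policy can be sketched in core Perl. This is an illustration under assumptions, not BioPerl code: make_throttle is a made-up helper, and no request is actually sent.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Returns a closure that sleeps as needed so that successive calls are at
# least $min_gap seconds apart. Hypothetical helper.
sub make_throttle {
    my ($min_gap) = @_;
    my $last;
    return sub {
        if (defined $last) {
            my $wait = $min_gap - (time() - $last);
            sleep($wait) if $wait > 0;
        }
        $last = time();
        return;
    };
}

my $wait_for_turn = make_throttle(3);   # 3 seconds, the figure quoted above
for my $gi (qw(10954456 10954457)) {    # the two gi numbers from the post
    $wait_for_turn->();
    # A real request for $gi would go here.
    print "would request gi $gi now\n";
}
```

The first call returns immediately; every later call waits out the rest of the interval.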

On Aug 8, 2007, at 2:16 PM, Jason Stajich wrote:

> Young -
> I'm forwarding to the list for more help.
>
> Begin forwarded message:
>
>> From: "Young Song" <youngcsong at gmail.com>
>> Date: August 8, 2007 1:48:29 PM CDT
>> To: jason at bioperl.org
>> Subject: Question regarding Bio::GenBank module
>>
>> Hello,
>>
>>    I am currently located in Vancouver, Canada, and I have a question
>> about the Bio::DB::GenBank module for BioPerl.  I read in the online
>> documentation for the module
>> (http://search.cpan.org/dist/bioperl/Bio/DB/GenBank.pm) that we are not
>> supposed to spam NCBI with multiple requests, which led me to think
>> about the script that I wrote.  I am trying to extract some information
>> based on the FASTA protein files located in NCBI's database.  The script
>> reads each '.faa' (FASTA protein) file, takes the 'gi' ID of each
>> sequence, and extracts several pieces of information, which look like the
>> following output (please note that there are a lot more gi's than I am
>> showing you right now):
>>
>> 10954456
>> accesstion number: NP_047185.1
>> dbsource: GenBank: NC_001911.1
>> NP_047185.1
>> starting pos. at genomic seq: 1488
>> ending pos. at genomic seq: 1991
>> strand: +
>> description: putative membrane-associated protein
>> organism: Buchnera aphidicola
>> MERIIEKAIYASRWLMFPVYVGLSFGFILLTLKFFQQIVFIIPDILAMSESGLVLVVLSLIDIALVGGLL
>> VMVMFLGYENFISKMDIQDNEKRLGWMGTMDVNSIKNKVASSIVAISSVHLLRLFMEAEKILDDKIMLCV
>> IIHLTFVLSAFGMAYIDKMSKKKHVLH
>> ************************************************
>> 10954457
>> accesstion number: NP_047186.1
>> dbsource: GenBank: NC_001911.1
>> NP_047186.1
>> starting pos. at genomic seq: 2158
>> ending pos. at genomic seq: 2913
>> strand: +
>> description: putative replication-associated protein
>> organism: Buchnera aphidicola
>> MPRKNYIYNPKPVFNPPKNKRKISTFICYAMKKASEIDVARSNLNYTLLLIDPKTGNILPRFRRLNEHRA
>> CAMRAIVLAMLYYFDIHSNLVEASIEKLADECGLSTFSDSGNKSITRVSRLINDFLEPMGFVRCKKIKRK
>> FVSNYIPKKIFLTPMFFMLFNISQSKINRYLFKSKKMSQNLKITEKKIFISFSDIKVMSRLDEKSIRKKI
>> LNALINYYTASELTKIGPKGLKKRIDIEYNNLCKLFKKIKK
>>
>>
>>
>>   Because there are a lot of sequences I am dealing with here, I am a
>> little bit worried that I may be causing harm to the NCBI server.  I just
>> need to know if this is the right approach to take, or if there is another
>> solution (I am a little bit confused about what you mean by "multiple
>> requests" in the documentation).
>> Your reply would be very much appreciated.  Thank you in advance.
>>
>>   Sincerely,
>>
>>      Young C. Song
>
> --
> Jason Stajich
> jason at bioperl.org
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From gyang at plantbio.uga.edu  Thu Aug  9 19:03:21 2007
From: gyang at plantbio.uga.edu (Guojun Yang)
Date: Thu, 09 Aug 2007 15:03:21 -0400
Subject: [Bioperl-l] standalone blastall call crashed, please help
In-Reply-To: 1FE846F1-CB20-41FD-929D-8D14E5695B59@uiuc.edu
Message-ID: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu>

Hi, Chris,  
Thanks a lot for your efforts. With your help, I am gaining more
confidence to fix the CGI code. While the RemoteBlast problem is fixed
now, I am caught in a local BLAST problem (see the error message and
subroutine below). The line starting with * is line 593 in the error
message. Running blastall from the command line with the same sequence
and database works fine. I set permissions on all the BLAST folders and
files, but it did not help much. I used the seq object ref $query as the
query, and the error message shows "-i /tmp/..."; does this look like an
input problem? The subroutine was working before early 2006 (on a
different machine), so I am wondering whether this is due to changes in
StandAloneBlast.pm.  Best, Guojun
   
I set the blast env variables:  
   
BEGIN {$ENV{BLASTDIR} = '/usr/blast-2.2.10/bin'; }
BEGIN {$ENV{BLASTDB}='/usr/blast-2.2.10/data';}
BEGIN {$ENV{BLASTMAT}='/usr/blast-2.2.10/data';}
$PROGRAMDIR = $ENV{'BLASTDIR'} || '';
......  
   
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: blastall call crashed: -1 /usr/blast-2.2.10/bin/blastall -d  "/usr/blast-2.2.10/data/swissprot"  -e  0.001  -i  /tmp/3cjvQyodxg  -o  /tmp/4qSSO16EZP  -p  blastx   
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.3/Bio/Root/Root.pm:359
STACK: Bio::Tools::Run::StandAloneBlast::_runblast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:813
STACK: Bio::Tools::Run::StandAloneBlast::_generic_local_blast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:760
STACK: Bio::Tools::Run::StandAloneBlast::blastall /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:570
STACK: main::ancestor makcgi07.txt:593
STACK: makcgi07.txt:208
  

sub ancestor {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  

my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"test");
print $query->seq();
my $len=$query->length();
my $long_name=$_[1];
my $long_start=$_[2];
my $long_end=$_[3];
@db=('swissprot');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
                                                        -database => "$db",
                                                        -e => 1e-3,
                                                        );
*    my $blast_report = $factory->blastall($query);
    while (my $result = $blast_report->next_result) {
            while( my $hit = $result->next_hit()) {
                $hit_name=$hit->name;
                $hit_name =~ /\S+[|](\S+)[.]\d+[|].*/;
                $name=$1;
                $desc = $hit->description();
                if ($desc =~ /.*{|\btransposon\b|\btransposase\b|}.*/i){
                     $AN=0;
                     $replica=0;
                     while ($ancestor_name[$AN]) {
                        $replica=1 if (($ancestor_name[$AN] eq $long_name) && ($hitname[$AN] eq $name));
                         $AN+=1;
                     }
                        if ($replica==0) {
                        push @ancestor_name, $long_name;
                        push @ancestor_start, $long_start;
                        push @ancestor_end, $long_end;
                        push @desc, $desc;
                        push @hitname,$name;
                        }
                }
               }
              }}
return @ancestor_name, @ancestor_start, @ancestor_end, @desc;
}
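
A note on the number in "blastall call crashed: -1": judging from the throw line quoted elsewhere in this thread, it is the value of $? after the external call. Per the perlfunc entry for system, -1 means the program could not be started or the wait call failed, with $! holding the reason. The sketch below decodes such values in core Perl; describe_status is a made-up helper and the error text passed for the -1 case is invented.

```perl
use strict;
use warnings;

# Turns a wait status (the value of $? after system()) into a description.
# Hypothetical helper, not part of BioPerl.
sub describe_status {
    my ($status, $os_error) = @_;
    return "failed to start: $os_error" if $status == -1;
    return sprintf('killed by signal %d', $status & 127) if $status & 127;
    return sprintf('exited with status %d', $status >> 8);
}

print describe_status(256, ''), "\n";   # a status of 256 is exit code 1
print describe_status(-1, 'No such file or directory'), "\n";   # invented error text

# Live use: run a child perl that exits with status 3.
system($^X, '-e', 'exit 3');
print describe_status($?, "$!"), "\n";
```

One possibility worth ruling out, offered only as a guess for this report: on many Unix systems system() also returns -1 when $SIG{CHLD} is set to 'IGNORE', even though the child program ran.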


From harijay at gmail.com  Thu Aug  9 21:47:50 2007
From: harijay at gmail.com (hari jayaram)
Date: Thu, 9 Aug 2007 17:47:50 -0400
Subject: [Bioperl-l] newbie wants install help
Message-ID: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>

Hi, I am trying to install BioPerl as a non-root user since I don't have root
access on the machine.

I was following the instructions given on the wiki at
http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix
I started from scratch using perl version v5.8.5 and used cpan to install
the BioPerl prerequisites bundle, Bundle::BioPerl, since I thought it
was needed. Everything worked just fine.
I could use cpan as a non-root user by following the instructions given at
http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html

But when I try to install BioPerl using the instructions for non-root users,
I get an error when Module::Build is built, because I am not root.
I get the same Module::Build error when I try to install without CPAN, using
the command line "perl Build.PL" with the --install_base option as given on
the wiki.

Is there a way out?

Thanks for your help in advance
harijay
Brandeis University


Installing /usr/share/man/man3/Module::Build::Platform::VMS.3pm
Installing /usr/share/man/man3/Module::Build::Base.3pm
Installing /usr/share/man/man3/Module::Build::Authoring.3pm
Installing /usr/share/man/man3/Module::Build::Compat.3pm
mkdir /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi/auto/Module:
Permission denied at /usr/lib/perl5/5.8.5/ExtUtils/Install.pm line 207
Installing /usr/bin/config_data
make: *** [install] Error 255
  /usr/bin/make install  -- NOT OK
    You may have to su to root to install the package
Couldn't install Module::Build, giving up.
make: *** No targets specified and no makefile found.  Stop.
  /usr/bin/make  -- NOT OK
Running make test
  Can't test without successful make
Running make install
  make had returned bad status, install seems impossible


From bix at sendu.me.uk  Thu Aug  9 22:23:24 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 09 Aug 2007 23:23:24 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
Message-ID: <46BB93DC.9010608@sendu.me.uk>

hari jayaram wrote:
> Hi I am trying to install bioperl as a non root user since I dont have root
> access on the machine.
> 
> I was following the instructions as given on the wiki at
> http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix
> I started from scratch using perl version v5.8.5 and used cpan to install
> the bioperl module prerequisites bundle Bundle::BioPerl since I thought it
> was needed. Everything worked just fine
> I could use cpan as a non root user following instructions given at
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html
> 
> But when I try to install bioperl using the instructions for non-root I get
> an error when I build Module::Build because I am not root.
> Iget the same Module::Build error when I try to install without CPAN using
> command line script perl Build.PL --install_base option as given on the
> wiki.

Follow the cpan instructions you found to install as non-root:

Bundle::CPAN

Failing that, you require at least:
Module::Build

Failing that:
http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#INSTALLING_BIOPERL_MODULES_THE_HARD_WAY
(it's actually the easiest way, go figure)


From bix at sendu.me.uk  Fri Aug 10 07:41:29 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 10 Aug 2007 08:41:29 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>	
	<46BB93DC.9010608@sendu.me.uk>
	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
Message-ID: <46BC16A9.7090709@sendu.me.uk>

hari jayaram wrote:
> Hi Sendu ,

Hi, please post back to the list as well, so others can benefit.


> Well, after a few attempts at installing Bundle::CPAN I
> gave up.
> It always had weird timeout issues, and it kept re-installing everything
> on restarting the CPAN shell.
> After a while I thought it did complete, since it returned me to the shell.
> 
> I tried the CPAN install of bioperl at that point.
> 
> And bingo, I got booted out at the exact same point, when the BioPerl
> install tried to re-install(?) Module::Build, which failed as non-root.

Did you follow steps 7 and 8 of 
http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html ?

If you managed to install Bundle::CPAN, when you now run 'cpan' it 
should start up and tell you its version number, which should be v1.9102 
or higher. If it's lower, you didn't manage to install the latest CPAN, 
or you haven't managed to tell Perl where your newly installed modules are.
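
"Telling Perl where your newly installed modules are" can be checked from a script. The following is a minimal sketch in core Perl, assuming the private modules live under a made-up directory; find_module_file is a hypothetical helper that reports which file in @INC, if any, would be loaded for a module name.

```perl
use strict;
use warnings;

# Hypothetical private install location; replace with the real one.
use lib '/hypothetical/home/perl5/lib';

# Returns the first file in @INC that would satisfy "use $module",
# or nothing if there is none.
sub find_module_file {
    my ($module) = @_;
    (my $rel = "$module.pm") =~ s{::}{/}g;
    for my $dir (@INC) {
        next if ref $dir;    # skip code hooks in @INC
        my $candidate = "$dir/$rel";
        return $candidate if -e $candidate;
    }
    return;
}

for my $module (qw(strict CPAN Module::Build)) {
    my $file = find_module_file($module);
    print "$module: ", (defined $file ? $file : 'not found in @INC'), "\n";
}
```

Setting the PERL5LIB environment variable has the same effect on @INC without editing each script.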


> I guess for all future modules I will adopt the option 3 you detailed , 
> i.e just have the modules sitting somewhere and use them from there
> 
> But I am still interested in getting it done right via CPAN.


From n.haigh at sheffield.ac.uk  Fri Aug 10 10:14:06 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Fri, 10 Aug 2007 11:14:06 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <46BC16A9.7090709@sendu.me.uk>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>		<46BB93DC.9010608@sendu.me.uk>	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
	<46BC16A9.7090709@sendu.me.uk>
Message-ID: <46BC3A6E.80302@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sendu Bala wrote:
> hari jayaram wrote:
>> Hi Sendu ,
> 
> Hi, please post back to the list as well, so others can benefit.
> 
> 
>> Well after going through a few attempts at installing Bundle::CPAN I 
>> gave up.
>> It always had weird timeout issues . ANd kept re-installing everything 
>> on restarting the CPAN shell
>> After a while I thought it did complete - since it retunred me to the shell
>>
>> I tried the CPAN install of bioperl at that point
>>
>> ANd bingo I got booted out at the exact same point when the Bioperl 
>> install tried to re-install(?) Module:Build which failed as non root
> 
> Did you follow steps 7 and 8 of 
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html ?
> 
> If you managed to install Bundle::CPAN, when you now run 'cpan' it 
> should start up and tell you its version number, which should be v1.9102 
> or higher. If its lower, you didn't manage to install the latest CPAN, 
> or you haven't managed to tell Perl where your newly installed modules are.
> 
> 
>> I guess for all future modules I will adopt the option 3 you detailed , 
>> i.e just have the modules sitting somewhere and use them from there
>>
>> But I am still interested in getting it done right via CPAN.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

It will probably also help, if you post the commands you have run and
any output (truncated if it's really long), then we can follow what you
have tried and make some better suggestions.

Cheers
Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGvDpuczuW2jkwy2gRAjFjAJ0eG90cMfHrrIh7LbKWx1JN94kbXgCdGSbi
tMjQrZ/8EPc0wLiNAhYTr4Y=
=kXZ2
-----END PGP SIGNATURE-----


From mbasu at mail.nih.gov  Fri Aug 10 15:25:35 2007
From: mbasu at mail.nih.gov (Malay)
Date: Fri, 10 Aug 2007 11:25:35 -0400
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
Message-ID: <46BC836F.7010906@mail.nih.gov>

hari jayaram wrote:
> Hi I am trying to install bioperl as a non root user since I dont have root
> access on the machine.
> 
> I was following the instructions as given on the wiki at
> http://bioperl.open-bio.org/wiki/Installing_Bioperl_for_Unix
> I started from scratch using perl version v5.8.5 and used cpan to install
> the bioperl module prerequisites bundle Bundle::BioPerl since I thought it
> was needed. Everything worked just fine
> I could use cpan as a non root user following instructions given at
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html
> 
> But when I try to install bioperl using the instructions for non-root I get
> an error when I build Module::Build because I am not root.
> Iget the same Module::Build error when I try to install without CPAN using
> command line script perl Build.PL --install_base option as given on the
> wiki.
> 
> Is there a way out
> 
> Thanks for your help in advance
> harijay
> Brandeis University
> 

This is related to your situation and broadly applicable to all Perl users
without root access. From my own experience, the best way to handle your
situation, if you are a dedicated Perl developer, is to use your own Perl.
Just compile and install your own perl in any directory of your choice, put
its "bin" directory at the front of your path, and off you go. The
advantages are several. First, you get a very optimized, fast perl; the
sysadmin might have just installed a run-of-the-mill binary perl. Second,
you get the freedom to install the very latest updates of all the modules;
the sysadmins may be too busy to update perl frequently. Third, a very
common problem with production machines is that they strictly follow the
perl installation instructions and avoid threaded perl, which clips your
wings, particularly when almost all machines contain multiple processors.

The drawbacks are related to finding "/usr/bin/perl" in the shebang 
line. If you follow the perl way of installing any script, it will take 
care of it. When you develop, use the more portable way:

#!/usr/bin/env perl
BEGIN { $^W = 1 } # switches on warnings at compile time (like -w)

All the best,

Malay


-- 
Malay K Basu
www.malaybasu.net


From gyang at plantbio.uga.edu  Fri Aug 10 15:23:36 2007
From: gyang at plantbio.uga.edu (Guojun Yang)
Date: Fri, 10 Aug 2007 11:23:36 -0400
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
 from StandAloneBlast
In-Reply-To: 20070809190321.191d0d4a@dogwood.plantbio.uga.edu
Message-ID: <20070810152336.898c3979@dogwood.plantbio.uga.edu>

Hi, Chris,  
Interestingly, I found the message in bioperl-l from Matthew Laird 2005,
"Blastall & StandAloneBlast": "...the Odd thing is, Blast DOES run.  If one
comments out this line in StandAloneBlast.pm, the execution succeeds
perfectly fine". It seemed to be mysterious: when I uncommented the
" $self->throw("$executable call crashed: $? $! $commandstring\n") unless ($status==0) ;"
line, the blastall runs. The only difference from what Matthew saw is that,
when I did not uncomment the line, blastall DID NOT run.
Thanks,  
Guojun  
       _____  

  From: Guojun Yang [mailto:gyang at plantbio.uga.edu]
To: Chris Fields [mailto:cjfields at uiuc.edu]
Cc: bioperl-l at lists.open-bio.org
Sent: Thu, 09 Aug 2007 15:03:21 -0400
Subject: standalone blastall call crashed, please help

  
Hi, Chris,  
Thanks a lot for your efforts. With your help, I am gaining more confidence to fix the cgi code. While the remoteblast problem is fixed now, I am caught in a local blast problem (see the error message and subroutine). The line starting with * is line 593 in the error message. I tried command line blastall, it works fine. I set the permission to all the blast folders and files, it did not help much. The same sequence and database works OK if I use command line blastall. I used the seq object ref $query as query, the error message gives "-i /tmp/...", does this look like an input problem? The subroutine was working before early 2006 (on a different machine), I am wondering whether this is due to changes in the StandAloneBlast.pm?  Best, Guojun  
   
I set the blast env variables:  
   
BEGIN {$ENV{BLASTDIR} = '/usr/blast-2.2.10/bin'; }
BEGIN {$ENV{BLASTDB}='/usr/blast-2.2.10/data';}
BEGIN {$ENV{BLASTMAT}='/usr/blast-2.2.10/data';}
$PROGRAMDIR = $ENV{'BLASTDIR'} || '';
......  
   
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: blastall call crashed: -1 /usr/blast-2.2.10/bin/blastall -d  "/usr/blast-2.2.10/data/swissprot"  -e  0.001  -i  /tmp/3cjvQyodxg  -o  /tmp/4qSSO16EZP  -p  blastx   
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.3/Bio/Root/Root.pm:359
STACK: Bio::Tools::Run::StandAloneBlast::_runblast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:813
STACK: Bio::Tools::Run::StandAloneBlast::_generic_local_blast /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:760
STACK: Bio::Tools::Run::StandAloneBlast::blastall /usr/lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:570
STACK: main::ancestor makcgi07.txt:593
STACK: makcgi07.txt:208
  

sub ancestor {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  

my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"test");
print $query->seq();
my $len=$query->length();
my $long_name=$_[1];
my $long_start=$_[2];
my $long_end=$_[3];
@db=('swissprot');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
                                                        -database => "$db",
                                                        -e => 1e-3,
                                                        );
*    my $blast_report = $factory->blastall($query);
    while (my $result = $blast_report->next_result) {
            while( my $hit = $result->next_hit()) {
                $hit_name=$hit->name;
                $hit_name =~ /\S+[|](\S+)[.]\d+[|].*/;
                $name=$1;
                $desc = $hit->description();
                if ($desc =~ /.*{|\btransposon\b|\btransposase\b|}.*/i){
                     $AN=0;
                     $replica=0;
                     while ($ancestor_name[$AN]) {
                        $replica=1 if (($ancestor_name[$AN] eq $long_name) && ($hitname[$AN] eq $name));
                         $AN+=1;
                     }
                        if ($replica==0) {
                        push @ancestor_name, $long_name;
                        push @ancestor_start, $long_start;
                        push @ancestor_end, $long_end;
                        push @desc, $desc;
                        push @hitname,$name;
                        }
                }
               }
              }}
return @ancestor_name,@ancestor_start,@ancestor_end,@desc;
}
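
A note on reading the error above: the "-1" printed after "blastall call crashed:" is the value of $? left by Perl's system(). According to perldoc -f system, -1 means the command could not be started or the wait() on it failed, which is a different situation from blastall running and exiting with an error code. The sketch below decodes such a status; describe_status is a hypothetical helper name used for illustration and is not part of StandAloneBlast.pm.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): turn the $? value left by
# system() into a readable description, following perldoc -f system.
sub describe_status {
    my ($status, $os_error) = @_;
    if ($status == -1) {
        # The child was never started, or wait() on it failed.
        return "failed to execute or wait: $os_error";
    }
    if ($status & 127) {
        # The low 7 bits hold the signal number that killed the child.
        return sprintf "died with signal %d", $status & 127;
    }
    # The high byte holds the child's exit code.
    return sprintf "exited with value %d", $status >> 8;
}

# Example: run a trivial child process and report how it ended.
system($^X, '-e', 'exit 3');
print describe_status($?, $!), "\n";
```

One situation that can produce -1 even though the child ran is $SIG{CHLD} being set to 'IGNORE' on some platforms, which may be worth checking in a CGI environment. Whether that is the cause here is only an assumption.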


From cjfields at uiuc.edu  Fri Aug 10 16:17:38 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 10 Aug 2007 11:17:38 -0500
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
	from StandAloneBlast
In-Reply-To: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
References: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
Message-ID: <56186844-3CB9-4968-B16F-FD5EE72865A2@uiuc.edu>

This should be filed as a bug if possible; could you do that?

http://www.bioperl.org/wiki/Bugs

Suggestions have been made many times previously that  
StandAloneBlast, RemoteBlast, etc be combined to use a common API,  
incorporate other BLAST implementations (e.g. WU-BLAST, NCBI's  
netblast, etc), and maybe utilize other cross-platform compatible  
means of running programs and passing off reports to parsers.  In  
fact, Jason, Roger Hall, Torsten, and I discussed tentative plans for  
plugin-able BLAST wrappers:

http://www.bioperl.org/wiki/Module:Bio::Tools::Run::RemoteBlast

These plans have never been acted upon, though.  If I get time towards the  
end of fall and manage to finish up some other projects I may try  
taking this on, maybe using the wiki to track progress.

chris

On Aug 10, 2007, at 10:23 AM, Guojun Yang wrote:

> Hi, Chris,
> Interestingly, I found the message in bioperl-l from Matthew Laird  
> 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast DOES  
> run.  If one comments out this line in StandAloneBlast.pm, the  
> execution succeeds perfectly fine". It seemed to be mysterious when  
> I uncommented the " $self->throw("$executable call crashed: $? $!  
> $commandstring\n") unless ($status==0) ;" line, the blastall runs.  
> The only difference from what Matthew saw is that, when I did not  
> uncomment the line, blastall DID NOT run.
> Thanks,
> Guojun
>
> From: Guojun Yang [mailto:gyang at plantbio.uga.edu]
> To: Chris Fields [mailto:cjfields at uiuc.edu]
> Cc: bioperl-l at lists.open-bio.org
> Sent: Thu, 09 Aug 2007 15:03:21 -0400
> Subject: standalone blastall call crashed, please help
>
> Hi, Chris,
> Thanks a lot for your efforts. With your help, I am gaining more  
> confidence to fix the cgi code. While the remoteblast problem is  
> fixed now, I am caught in a local blast problem (see the error  
> message and subroutine). The line starting with * is line 593 in  
> the error message. I tried command line blastall, it works fine. I  
> set the permission to all the blast folders and files, it did not  
> help much. The same sequence and database works OK if I use command  
> line blastall. I used the seq object ref $query as query, the error  
> message gives "-i /tmp/...", does this look like an input problem?  
> The subroutine was working before early 2006 (on a different  
> machine), I am wondering whether this is due to changes in the  
> StandAloneBlast.pm?  Best, Guojun
>
> I set the blast env variables:
>
> BEGIN {$ENV{BLASTDIR} = '/usr/blast-2.2.10/bin'; }
> BEGIN {$ENV{BLASTDB}='/usr/blast-2.2.10/data';}
> BEGIN {$ENV{BLASTMAT}='/usr/blast-2.2.10/data';}
> $PROGRAMDIR = $ENV{'BLASTDIR'} || '';
> ......
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: blastall call crashed: -1 /usr/blast-2.2.10/bin/blastall -d  "/ 
> usr/blast-2.2.10/data/swissprot"  -e  0.001  -i  /tmp/3cjvQyodxg  - 
> o  /tmp/4qSSO16EZP  -p  blastx
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/lib/perl5/site_perl/5.8.3/Bio/ 
> Root/Root.pm:359
> STACK: Bio::Tools::Run::StandAloneBlast::_runblast /usr/lib/perl5/ 
> site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:813
> STACK: Bio::Tools::Run::StandAloneBlast::_generic_local_blast /usr/ 
> lib/perl5/site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:760
> STACK: Bio::Tools::Run::StandAloneBlast::blastall /usr/lib/perl5/ 
> site_perl/5.8.3/Bio/Tools/Run/StandAloneBlast.pm:570
> STACK: main::ancestor makcgi07.txt:593
> STACK: makcgi07.txt:208
> sub ancestor {
>     use Bio::Tools::Run::StandAloneBlast;
>     use Bio::SearchIO::blast;
>
> my $query = Bio::Seq -> new ( -seq=>"$_[0]",
>                               -id=>"test");
> print $query->seq();
> my $len=$query->length();
> my $long_name=$_[1];
> my $long_start=$_[2];
> my $long_end=$_[3];
> @db=('swissprot');
> foreach my $db (@db) {
>     my $factory = Bio::Tools::Run::StandAloneBlast->new(-program =>  
> "blastx",
>                                                         -database  
> => "$db",
>                                                         -e => 1e-3,
>                                                         );
> *    my $blast_report = $factory->blastall($query);
>     while (my $result = $blast_report->next_result) {
>             while( my $hit = $result->next_hit()) {
>                 $hit_name=$hit->name;
>                 $hit_name =~ /\S+[|](\S+)[.]\d+[|].*/;
>                 $name=$1;
>                 $desc = $hit->description();
>                 if ($desc =~ /.*{|\btransposon\b|\btransposase 
> \b|}.*/i){
>                      $AN=0;
>                      $replica=0;
>                      while ($ancestor_name[$AN]) {
>                         $replica=1 if (($ancestor_name[$AN] eq  
> $long_name) && ($hitname[$AN] eq $name));
>                          $AN+=1;
>                      }
>                         if ($replica==0) {
>                         push @ancestor_name, $long_name;
>                         push @ancestor_start, $long_start;
>                         push @ancestor_end, $long_end;
>                         push @desc, $desc;
>                         push @hitname,$name;
>                         }
>                 }
>                }
>               }}
> return @ancestor_name,@ancestor_start,@ancestor_end,@desc;
> }

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From harijay at gmail.com  Fri Aug 10 17:09:32 2007
From: harijay at gmail.com (hari jayaram)
Date: Fri, 10 Aug 2007 13:09:32 -0400
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <46BC16A9.7090709@sendu.me.uk>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>
	<46BB93DC.9010608@sendu.me.uk>
	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>
	<46BC16A9.7090709@sendu.me.uk>
Message-ID: <aad3caa30708101009k4734fe45i1dcd29a5e20af834@mail.gmail.com>

Hey all,
Thanks for your help. It's working really well now.

It turns out I had not set my PERL5LIB environment variable correctly, so Perl
was not finding the installed modules (thanks Sendu).

So the steps I followed were:
1) Installed CPAN as myself, as detailed at
http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html
The important part is the line which tells CPAN what prefix to use for all
module installs:
PREFIX=~/perl5lib/ LIB=~/perl5lib/lib INSTALLMAN1DIR=~/perl5lib/man1
INSTALLMAN3DIR=~/perl5lib/man3

2) Set PERL5LIB to /home/hari/perl5lib/lib (and not just /home/hari/perl5lib)
in the shell. I use csh, so I edited .cshrc:
setenv PERL5LIB /home/hari/perl5lib/lib
setenv MANPATH ${MANPATH}:/home/hari/perl5lib

3) Updated the system CPAN to the latest version. This worked very well once
the perl5lib was installed, only it took a while and sometimes stalled with
messages like "done 31/34". But a CTRL-C got it going again.

4) Made sure I was using the new CPAN v1.9102.

5) Installed Bioperl with the command
install S/SE/SENDU/bioperl-1.5.2_102.tar.gz

AND I was good to go..
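
A quick way to confirm that PERL5LIB is taking effect is to ask Perl where it would load a module from. The sketch below walks @INC (which includes the PERL5LIB directories) and reports the first match; find_module_in_inc is a hypothetical helper name, and Bio::Perl is used only as a convenient probe module.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the first file in @INC that would satisfy
# "use $module", or undef if no directory in @INC contains it.
sub find_module_in_inc {
    my ($module) = @_;
    my @parts = split /::/, $module;
    $parts[-1] .= '.pm';
    for my $dir (@INC) {
        next if ref $dir;    # skip code-ref hooks in @INC
        my $candidate = File::Spec->catfile($dir, @parts);
        return $candidate if -f $candidate;
    }
    return;
}

# Bio::Perl is used as the probe here; any module name works.
my $path = find_module_in_inc('Bio::Perl');
print defined $path
    ? "Bio::Perl found at $path\n"
    : "Bio::Perl not found; check PERL5LIB\n";
```

If the reported path is not under the private lib directory, PERL5LIB is probably pointing one level too high or too low.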

I am thinking of screencasting this process for everyone's benefit and putting
it up on bioscreencast.com, if that would be useful for others.
Thanks to everyone on the group. Now the journey begins.

Hari Jayaram


On 8/10/07, Sendu Bala <bix at sendu.me.uk> wrote:
> hari jayaram wrote:
> > Hi Sendu ,
>
> Hi, please post back to the list as well, so others can benefit.
>
>
> > Well after going through a few attempts at installing Bundle::CPAN I
> > gave up.
> > It always had weird timeout issues . ANd kept re-installing everything
> > on restarting the CPAN shell
> > After a while I thought it did complete - since it returned me to the
> > shell
> >
> > I tried the CPAN install of bioperl at that point
> >
> > ANd bingo I got booted out at the exact same point when the Bioperl
> > install tried to re-install(?) Module:Build which failed as non root
>
> Did you follow steps 7 and 8 of
> http://www.dcc.fc.up.pt/~pbrandao/aulas/0203/AR/modules_inst_cpan.html ?
>
> If you managed to install Bundle::CPAN, when you now run 'cpan' it
> should start up and tell you its version number, which should be v1.9102
> or higher. If its lower, you didn't manage to install the latest CPAN,
> or you haven't managed to tell Perl where your newly installed modules
> are.
>
>
> > I guess for all future modules I will adopt the option 3 you detailed ,
> > i.e just have the modules sitting somewhere and use them from there
> >
> > But I am still interested in getting it done right via CPAN.
>


From torsten.seemann at infotech.monash.edu.au  Fri Aug 10 21:48:56 2007
From: torsten.seemann at infotech.monash.edu.au (Torsten Seemann)
Date: Sat, 11 Aug 2007 07:48:56 +1000
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
	from StandAloneBlast
In-Reply-To: <20070810152336.898c3979@dogwood.plantbio.uga.edu>
References: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu>
	<20070810152336.898c3979@dogwood.plantbio.uga.edu>
Message-ID: <a79f6a4b0708101448x421736c1m6f3f5ff6d851a68c@mail.gmail.com>

> Interestingly, I found the message in bioperl-l from Matthew Laird 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast DOES run.  If one comments out this line in StandAloneBlast.pm, the execution succeeds perfectly fine". It seemed to be mysterious when I uncommented the " $self->throw("$executable call crashed: $? $! $commandstring\n") unless ($status==0) ;" line, the blastall runs. The only difference from what Matthew saw is that, when I did not uncomment the line, blastall DID NOT run.

Yes, Matthew is one of the authors of PSORTB and I spent a bit of time
last year trying to fix this problem (unsuccessfully). The PSORTB docs
http://www.psort.org/downloads/index.html
explain how to get around this problem just as Guojun describes. I use
a custom BioPerl installation just for PSORTB!

 I was under the impression it was already filed as a bug, but my
searching indicates this is not so.

-- 
--Torsten Seemann
--Victorian Bioinformatics Consortium, Monash University


From cjfields at uiuc.edu  Fri Aug 10 22:04:20 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 10 Aug 2007 17:04:20 -0500
Subject: [Bioperl-l] ATTN: Matthew Laird & Elia----blastall call crashed
	from StandAloneBlast
In-Reply-To: <a79f6a4b0708101448x421736c1m6f3f5ff6d851a68c@mail.gmail.com>
References: <20070809190321.191d0d4a@dogwood.plantbio.uga.edu>
	<20070810152336.898c3979@dogwood.plantbio.uga.edu>
	<a79f6a4b0708101448x421736c1m6f3f5ff6d851a68c@mail.gmail.com>
Message-ID: <41A08079-6EEC-4B62-8104-C41E70C03083@uiuc.edu>


On Aug 10, 2007, at 4:48 PM, Torsten Seemann wrote:

>> Interestingly, I found the message in bioperl-l from Matthew Laird  
>> 2005 "Blastall & StandAloneBlast". "...the Odd thing is, Blast  
>> DOES run.  If one comments out this line in StandAloneBlast.pm,  
>> the execution succeeds perfectly fine". It seemed to be mysterious  
>> when I uncommented the " $self->throw("$executable call crashed:  
>> $? $! $commandstring\n") unless ($status==0) ;" line, the blastall  
>> runs. The only difference from what Matthew saw is that, when I  
>> did not uncomment the line, blastall DID NOT run.
>
> Yes, Matthew is one of the authors of PSORTB and I spent a bit of time
> last year trying to fix this problem (unsuccessfully). The PSORTB docs
> http://www.psort.org/downloads/index.html
> explain how to get around this problem just as Guojun describes. I use
> a custom BioPerl installation just for PSORTB!
>
>  I was under the impression it was already filed as a bug, but my
> searching indicates this is not so.
>
> -- 
> --Torsten Seemann
> --Victorian Bioinformatics Consortium, Monash University
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Might be wise to go ahead and add it to bugzilla so we can track it,  
along with the workaround.

chris


From neetisomaiya at gmail.com  Mon Aug 13 10:29:39 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 13 Aug 2007 15:59:39 +0530
Subject: [Bioperl-l] Homologene parser?
Message-ID: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>

Hi,

Does anyone know of any Homologene parser, if available?
Please let me know.

Thanks and Regards,
Neeti.


-- 
-Neeti
Even my blood says, B positive


From shameer at ncbs.res.in  Mon Aug 13 11:07:45 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 16:37:45 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and add
 direction to SeqFeature
In-Reply-To: <6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
Message-ID: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>

Dear All,

I am generating images based on transcription factor binding site data
using the Bio::Graphics module.
I created my images using the program 'version-2'
[http://stein.cshl.org/genome_informatics/BioGraphics/] (courtesy: L.
Stein). I am attaching one of the images to this mail.

I need to make 3 changes to this image:

1. To color the 'scale'
Color the scale in two different colors, i.e. from start 1.0k - color blue,
from 101 till the end of the scale - green (I thoroughly checked the
Bio::Graphics documentation, but I couldn't find an option to do this).

2. To sort the transcription factors based on the z_score

3. To give a forward/reverse [> or <] direction to the black boxes

I would appreciate it if anyone could give me some clues/links to accomplish
this :).
Thanks in advance,
Shameer

-- 
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in
-------------- next part --------------
A non-text attachment was scrubbed...
Name: TF_top3.png
Type: image/png
Size: 2188 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070813/6a4423bd/attachment-0004.png>

From bix at sendu.me.uk  Mon Aug 13 13:11:50 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 14:11:50 +0100
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>	<1178028249.2644.13.camel@localhost.localdomain>	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
Message-ID: <46C05896.1010002@sendu.me.uk>

Shameer Khadar wrote:
> Dear All,
> 
> I am generating images based on Transcription Factor binding site data
> using bio::graphics module.
> I created my images using program : version-2 
> [http://stein.cshl.org/genome_informatics/BioGraphics/] (Courtsey : L.
> Stein ). I attaching one of the image with this mail.
> 
> I need to make 3 changes to this image
> 
> 1. to color the 'scale'
> Color the scale in two different colors ie, from start 1.0k - color blue
> from 101 - till end of the scale green (I thoroghly checked the
> Bio::Graphics document, I couldnt find an option to do this )

The scale is just a scale and shouldn't need colouring. You can do what 
you want by having a blue 'upstream' feature and a green 'gene' feature 
in the first row.


> 2. to sort the Transcription factors based on the z_score

I don't know Bio::Graphics well enough, but am interested in the answer...


> 3. to give forward/reverse [> or < ]direction for the black boxes

Presumably you just change the glyph type of your binding sites to 
something that shows direction, like 'processed_transcript'. Someone 
else may have a more appropriate suggestion.

However, do your binding sites really have a direction? That is, do you 
really know which strand your transcription factor bound to?


From cjfields at uiuc.edu  Mon Aug 13 14:39:11 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 09:39:11 -0500
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
	add direction to SeqFeature
In-Reply-To: <51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
Message-ID: <871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>


On Aug 13, 2007, at 6:07 AM, Shameer Khadar wrote:

> Dear All,
>
> I am generating images based on Transcription Factor binding site data
> using bio::graphics module.
> I created my images using program : version-2
> [http://stein.cshl.org/genome_informatics/BioGraphics/] (Courtsey : L.
> Stein ). I attaching one of the image with this mail.
>
> I need to make 3 changes to this image
>
> 1. to color the 'scale'
> Color the scale in two different colors ie, from start 1.0k - color  
> blue
> from 101 - till end of the scale green (I thoroghly checked the
> Bio::Graphics document, I couldnt find an option to do this )

Much of the documentation you need is available via 'perldoc  
Bio::Graphics::Panel' and the various Bio::Graphics::Glyph classes.   
The above may be possible using two seqfeatures instead of one or  
maybe a split location with a callback (not sure, haven't tried  
either, mileage may vary, batteries not included, warranty void if  
packaging is opened, etc).  Might be worth checking out the POD for  
the arrow glyph to see what's possible.

> 2. to sort the Transcription factors based on the z_score

In Bio::Graphics::Panel POD under 'Glyph Options', there is  
documentation for 'sort_order' which accepts callbacks.  According to  
the docs you would basically do something like the following (the  
prototype is required; note the score):

   -sort_order => sub ($$) {
     my ($glyph1,$glyph2) = @_;
     my $a = $glyph1->feature;
     my $b = $glyph2->feature;
     ( $b->score/log($b->length)
           <=>
       $a->score/log($a->length) )
           ||
     ( $a->start <=> $b->start )
   }

Again, haven't tried.
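
To adapt the callback above to sort by z_score alone, the comparator can be tried outside Bio::Graphics first. The sketch below assumes the z_score was stored as each feature's score; StubFeature and StubGlyph are hypothetical stand-ins so the example runs without Bio::Graphics installed. Only the calling convention (two glyphs, each answering ->feature) is taken from the POD example quoted above.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for a feature and its glyph, so the comparator
# can be run without Bio::Graphics installed.
package StubFeature;
sub new   { my ($class, %args) = @_; return bless {%args}, $class }
sub score { $_[0]{score} }
sub start { $_[0]{start} }
sub name  { $_[0]{name} }

package StubGlyph;
sub new     { my ($class, $feature) = @_; return bless { feature => $feature }, $class }
sub feature { $_[0]{feature} }

package main;

# Highest score first; ties are broken by leftmost start.
# Assumes the z_score was stored as the feature's score.
my $by_z_score = sub ($$) {
    my ($glyph1, $glyph2) = @_;
    my $f1 = $glyph1->feature;
    my $f2 = $glyph2->feature;
    ( $f2->score <=> $f1->score )
        ||
    ( $f1->start <=> $f2->start );
};

my @glyphs = map { StubGlyph->new( StubFeature->new(%$_) ) } (
    { name => 'TF_A', score => 2.1, start => 40 },
    { name => 'TF_B', score => 5.7, start => 10 },
    { name => 'TF_C', score => 5.7, start => 5  },
);

my @sorted = sort $by_z_score @glyphs;
print join( ' ', map { $_->feature->name } @sorted ), "\n";
```

In a real track the same sub would be passed as -sort_order => $by_z_score when adding the track; that part is not exercised here.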

> 3. to give forward/reverse [> or < ]direction for the black boxes

I think you first need to ensure the glyph will accept strandedness,  
though I think most do.  Then you would set either the 'strand_arrow'  
or 'stranded' option to 1 (they are synonyms).  Again, see  
Bio::Graphics::Panel POD under Glyph Options, specifically the  
parameter 'stranded' or 'strand_arrow'.

> I would appreaciate if any one can give me some clues/link to  
> accomplish
> this :).
> thanks in advance ,
> Shameer

No problem!

chris

> -- 
> Shameer Khadar
> Lab (# 25) The Computational Biology Group
> National Centre for Biological Sciences (TIFR)
> GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
> T - 91-080-23666001 EXT - 6251
> W - http://www.ncbs.res.in
> <TF_top3.png>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From shameer at ncbs.res.in  Mon Aug 13 14:47:35 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 20:17:35 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <46C05896.1010002@sendu.me.uk>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
Message-ID: <59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>

Dear Sendu,

Thanks for your reply.

>> I need to make 3 changes to this image
>>
>> 1. to color the 'scale'
>> Color the scale in two different colors ie, from start 1.0k - color blue
>> from 101 - till end of the scale green (I thoroghly checked the
>> Bio::Graphics document, I couldnt find an option to do this )
>
> The scale is just a scale and shouldn't need colouring. You can do what
> you want by having a blue 'upstream' feature and a green 'gene' feature
> in the first row.
Thanks for the point: 'The scale is just a scale...'.
But my idea is to divide the scale into three parts to differentiate
between the 100bp upstream region, the UTR and the gene start site. From the
starting point of the scale till 0k is the 100bp upstream region. From 0k till
the end of the current scale is the UTR, and the gene starts at the end of the
scale. Since this is a bit tough to distinguish, we thought of this coloring
option. Adding an extra track is an alternative (I tried to convince our
experimental team by adding an extra track, but they want it this way :(..)

>
>> 2. to sort the Transcription factors based on the z_score
> I don't know Bio::Graphics well enough, but am interested in the answer...
>
It should be possible, since a sort_order option is available. I tried it a
couple of times but it is not working.

>
>> 3. to give forward/reverse [> or < ]direction for the black boxes
>
> Presumably you just change the glyph type of your binding sites to
> something that shows direction, like 'processed_transcript'. Someone
> else may have a more appropriate suggestion.
Thanks, I will look in to it.

>
> However, do your binding sites really have a direction? That is, do you
> really know which strand your transcription factor bound to?
Yes, we collated this info from various experimental datasets.

-- 
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From bix at sendu.me.uk  Mon Aug 13 15:01:43 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 16:01:43 +0100
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
Message-ID: <46C07257.1000308@sendu.me.uk>

Shameer Khadar wrote:
>> However, do your binding sites really have a direction? That is, do you
>> really know which strand your transcription factor bound to?
 >
> Yes, these info we collated from various experimental datasets.

Well, those datasets I'd like to see... What I was getting at is the 
strand probably isn't known at the experimental level, but to describe 
the site a strand has to be arbitrarily picked so you can write the 
sequence of the site down as a single string. It's probably the case that 
the strand information you have is just the way it happened to be 
reported in the literature and has no biological meaning.


From shameer at ncbs.res.in  Mon Aug 13 15:16:33 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Mon, 13 Aug 2007 20:46:33 +0530 (IST)
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
 add direction to SeqFeature
In-Reply-To: <871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<871544DF-19F0-4C6A-849E-514D8B7BAA12@uiuc.edu>
Message-ID: <42833.192.168.1.1.1187018193.squirrel@mail.ncbs.res.in>

Chris,

Thanks for your detailed reply.
I will read up on the docs and try different options using your code snippets
as starting point. I will get back to the list with my results.

Thanks
-- 
Shameer

>
> On Aug 13, 2007, at 6:07 AM, Shameer Khadar wrote:
>
>> Dear All,
>>
>> I am generating images based on Transcription Factor binding site data
>> using bio::graphics module.
>> I created my images using program : version-2
>> [http://stein.cshl.org/genome_informatics/BioGraphics/] (Courtsey : L.
>> Stein ). I attaching one of the image with this mail.
>>
>> I need to make 3 changes to this image
>>
>> 1. to color the 'scale'
>> Color the scale in two different colors ie, from start 1.0k - color
>> blue
>> from 101 - till end of the scale green (I thoroghly checked the
>> Bio::Graphics document, I couldnt find an option to do this )
>
> Much of the documentation you need is available via 'perldoc
> Bio::Graphics::Panel' and the various Bio::Graphics::Glyph classes.
> The above may be possible using two seqfeatures instead of one or
> maybe a split location with a callback (not sure, haven't tried
> either, mileage may vary, batteries not included, warranty void if
> packaging is opened, etc).  Might be worth checking out the POD for
> the arrow glyph to see what's possible.
>
>> 2. to sort the Transcription factors based on the z_score
>
> In Bio::Graphics::Panel POD under 'Glyph Options', there is
> documentation for 'sort_order' which accepts callbacks.  According to
> the docs you would basically do something like the following (the
> prototype is required; note the score):
>
>    -sort_order => sub ($$) {
>      my ($glyph1,$glyph2) = @_;
>      my $a = $glyph1->feature;
>      my $b = $glyph2->feature;
>      ( $b->score/log($b->length)
>            <=>
>        $a->score/log($a->length) )
>            ||
>      ( $a->start <=> $b->start )
>    }
>
> Again, haven't tried.
>
>> 3. to give forward/reverse [> or < ]direction for the black boxes
>
> I think you first need to ensure the glyph will accept strandedness,
> though I think most do.  Then you would set either the 'strand_arrow'
> or 'stranded' option to 1 (they are synonyms).  Again, see
> Bio::Graphics::Panel POD under Glyph Options, specifically the
> parameter 'stranded' or 'strand_arrow'.
>


-- 
Shameer Khadar
Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From bix at sendu.me.uk  Mon Aug 13 15:47:10 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 13 Aug 2007 16:47:10 +0100
Subject: [Bioperl-l] newbie wants install help
In-Reply-To: <aad3caa30708101009k4734fe45i1dcd29a5e20af834@mail.gmail.com>
References: <aad3caa30708091447oc54effbke55c84fa0ddf637b@mail.gmail.com>	
	<46BB93DC.9010608@sendu.me.uk>	
	<aad3caa30708092342g3521c663p8296bcd11218d232@mail.gmail.com>	
	<46BC16A9.7090709@sendu.me.uk>
	<aad3caa30708101009k4734fe45i1dcd29a5e20af834@mail.gmail.com>
Message-ID: <46C07CFE.7020105@sendu.me.uk>

hari jayaram wrote:
> Hey all ,
> Thanks for your help. Its working real well now.
[snip]
> I am thinking I will screencast this process for everyones benefit and 
> put it up on bioscreencast.com <http://bioscreencast.com> . If that will 
> be useful for others.

I'm certain it will. That's a very interesting website. Thanks for 
taking the time, and I hope you find Bioperl useful.


From cjfields at uiuc.edu  Mon Aug 13 16:24:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 11:24:15 -0500
Subject: [Bioperl-l] Bio::Graphics : To change scale color & sort and
	add direction to SeqFeature
In-Reply-To: <46C07257.1000308@sendu.me.uk>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
Message-ID: <A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>


On Aug 13, 2007, at 10:01 AM, Sendu Bala wrote:

> Shameer Khadar wrote:
>>> However, do your binding sites really have a direction? That is,  
>>> do you
>>> really know which strand your transcription factor bound to?
>>
>> Yes, these info we collated from various experimental datasets.
>
> Well, those datasets I'd like to see... What I was getting at is the
> strand probably isn't known at the experimental level, but to describe
> the site a strand has to be arbitrarily picked so you can write the
> sequence of the site down as a single string. Its probably the case  
> that
> the strand information you have is just the way it happened to be
> reported in the literature and has no biological meaning.

It's subjective.  I can think of several cases where strandedness  
does matter and has meaning, for instance when the motif is related to  
how the gene is transcribed or post-transcriptionally regulated:  
elements which indicate start of transcription (-10/-35 or any  
sigma-factor-related promoter element in prokaryotes), end of  
transcription (poly-A signal, transcription terminators), modulation of  
translation (SECIS, IRES), or conserved DNA motifs which are transcribed  
prior to regulation by RNA-binding proteins (like the IRE).

chris


From amacgregor at ccg.murdoch.edu.au  Tue Aug 14 00:52:10 2007
From: amacgregor at ccg.murdoch.edu.au (Andrew Macgregor)
Date: Tue, 14 Aug 2007 08:52:10 +0800
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
Message-ID: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>

On 13/08/2007, at 6:29 PM, neeti somaiya wrote:

> Hi,
>
> Does anyone know of any Homologene parser, if available?
> Please let me know.
>
> Thanks and Regards,
> Neeti.

Hi Neeti,

Quite a long time ago now I wrote an Homologene parser and posted it  
to the mailing list:

<http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>

I don't know if this still works but you could use it as a starting  
point. There may also be something newer out there too, I don't know.  
If you search the mailing list archives you'll get a few messages  
around the topic.

Cheers, Andrew.


Andrew Macgregor
Centre for Comparative Genomics, Murdoch University
Email: amacgregor at ccg.murdoch.edu.au
Tel: (08) 9360 2961


From cjfields at uiuc.edu  Tue Aug 14 03:21:54 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 13 Aug 2007 22:21:54 -0500
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
Message-ID: <4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>

It looks like Heikki responded and thought a good place for it would  
be Bio::SeqIO, but it didn't go anywhere I suppose.  I see that a few  
other posts suggest it could be placed in Bio::Cluster as well which  
I'm not familiar with.  We could add it in if you were still  
interested, just need to find a good place for it; might be nice to  
have a Parse::RecDescent-based parser.

chris

On Aug 13, 2007, at 7:52 PM, Andrew Macgregor wrote:

> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>
>> Hi,
>>
>> Does anyone know of any Homologene parser, if available?
>> Please let me know.
>>
>> Thanks and Regards,
>> Neeti.
>
> Hi Neeti,
>
> Quite a long time ago now I wrote an Homologene parser and posted it
> to the mailing list:
>
> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
>
> I don't know if this still works but you could use it as a starting
> point. There may also be something newer out there too, I don't know.
> If you search the mailing list archives you'll get a few messages
> around the topic.
>
> Cheers, Andrew.
>
>
> Andrew Macgregor
> Centre for Comparative Genomics, Murdoch University
> Email: amacgregor at ccg.murdoch.edu.au
> Tel: (08) 9360 2961
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Tue Aug 14 07:46:19 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Tue, 14 Aug 2007 08:46:19 +0100
Subject: [Bioperl-l] Warnings/errors generated by Eclipse
Message-ID: <46C15DCB.80603@sheffield.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I've just been setting up Eclipse with the EPIC plugin, and it's
generating some errors and warnings about bioperl-live that I'd like to
pass by you.

I think most of the errors are along the lines of:
"Can't find 'build_params' in _build in
/usr/local/share/perl/5.8.8/Module/Build/Base.pm line 1011"

This occurs with files like:
t/Biblio_biofetch.t
t/seqread_fail.t

I think it's to do with the parameters passed to test_begin() or it
could be my setup of Eclipse?

Other highlighted problems are some of the scripts in the examples dir.
Some require modules that reside in the bioperl-run package. Would it be
wise to move these to the bioperl-run examples dir?

There may also be some problems with XML files in t/data e.g.
t/data/interpro_ebi.xml
There appears to be a typo on line 2. However, I'm not sure this is
up-to-date? I can comment on the others later if required.

Cheers
Nath
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGwV3KczuW2jkwy2gRApM/AJ9abWl02CAJqDK2sEXEUEg8nGRC4ACdHcAb
nZmh+1dmtc1W9mThkUVKitw=
=5eXZ
-----END PGP SIGNATURE-----


From amacgregor at ccg.murdoch.edu.au  Tue Aug 14 05:14:58 2007
From: amacgregor at ccg.murdoch.edu.au (Andrew Macgregor)
Date: Tue, 14 Aug 2007 13:14:58 +0800
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
Message-ID: <C762C291-D3D2-4CBC-B5EC-6B6E4935A004@ccg.murdoch.edu.au>

On 14/08/2007, at 11:21 AM, Chris Fields wrote:

> It looks like Heikki responded and thought a good place for it  
> would be Bio::SeqIO, but it didn't go anywhere I suppose.  I see  
> that a few other posts suggest it could be placed in Bio::Cluster  
> as well which I'm not familiar with.  We could add it in if you  
> were still interested, just need to find a good place for it; might  
> be nice to have a Parse::RecDescent-based parser.
>
> chris
>

Hi Chris,

I was also doing some parsing of UniGene at the time but found
RecDescent was too slow and went back to regexes. That code found
its way into Bio::Cluster. Occasionally I see a message from someone
looking for a Homologene parser, but not very often, so I'm not sure
it is worth the effort of moving the code into bioperl.

Cheers, Andrew.


From neetisomaiya at gmail.com  Tue Aug 14 13:24:07 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Tue, 14 Aug 2007 18:54:07 +0530
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
Message-ID: <764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>

Hi Andrew,

I think the homologene data files on the ftp have changed from what you
had used.
They are now homologene.data and homologene.xml.
I tried using your parser, but because it was written for the file
hmlg.trip.ftp, it doesn't work anymore.

I came across a parser:
http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
I am looking at it to see if it works for me. Not sure if it will.

~Neeti.

On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
>
> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>
> > Hi,
> >
> > Does anyone know of any Homologene parser, if available?
> > Please let me know.
> >
> > Thanks and Regards,
> > Neeti.
>
> Hi Neeti,
>
> Quite a long time ago now I wrote an Homologene parser and posted it
> to the mailing list:
>
> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
>
> I don't know if this still works but you could use it as a starting
> point. There may also be something newer out there too, I don't know.
> If you search the mailing list archives you'll get a few messages
> around the topic.
>
> Cheers, Andrew.
>
>
> Andrew Macgregor
> Centre for Comparative Genomics, Murdoch University
> Email: amacgregor at ccg.murdoch.edu.au
> Tel: (08) 9360 2961
>
>
>
>


-- 
-Neeti
Even my blood says, B positive


From bix at sendu.me.uk  Tue Aug 14 14:57:29 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 14 Aug 2007 15:57:29 +0100
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
	columns?
Message-ID: <46C1C2D9.6050409@sendu.me.uk>

I'm looking at what looks like a pretty major bug in Bio::SimpleAlign, 
but before I commit the fix I wanted to check my sanity/understanding.

My understanding is that an alignment may be built from just sub-parts 
of a number of sequences. So you give each sequence in the alignment a 
start and stop so you can later map back the aligned region to the 
original sequence. So, for example, the following should all pass:

diff -r1.56 SimpleAlign.t
459a460,540
 >
 >
 > # is _remove_col really working correctly?
 > my $a = Bio::LocatableSeq->new(-id => 'a', -seq => 'atcgatcgatcgatcg', -start => 5, -end => 20);
 > my $b = Bio::LocatableSeq->new(-id => 'b', -seq => '-tcgatc-atcgatcg', -start => 30, -end => 43);
 > my $c = Bio::LocatableSeq->new(-id => 'c', -seq => 'atcgatcgatc-atc-', -start => 50, -end => 63);
 > my $d = Bio::LocatableSeq->new(-id => 'd', -seq => '--cgatcgatcgat--', -start => 80, -end => 91);
 > my $e = Bio::LocatableSeq->new(-id => 'e', -seq => '-t-gatcgatcga-c-', -start => 100, -end => 111);
 > $aln = Bio::SimpleAlign->new();
 > $aln->add_seq($a);
 > $aln->add_seq($b);
 > $aln->add_seq($c);
 >
 > my $gapless = $aln->remove_gaps();
 > foreach my $seq ($gapless->each_seq) {
 >       if ($seq->id eq 'a') {
 >               is $seq->start, 6;
 >               is $seq->end, 19;
 >               is $seq->seq, 'tcgatcatcatc';
 >       }
 >       elsif ($seq->id eq 'b') {
 >               is $seq->start, 30;
 >               is $seq->end, 42;
 >               is $seq->seq, 'tcgatcatcatc';
 >       }
 >       elsif ($seq->id eq 'c') {
 >               is $seq->start, 51;
 >               is $seq->end, 63;
 >               is $seq->seq, 'tcgatcatcatc';
 >       }
 > }
 >
 > $aln->add_seq($d);
 > $aln->add_seq($e);
 > $gapless = $aln->remove_gaps();
 > foreach my $seq ($gapless->each_seq) {
 >       if ($seq->id eq 'a') {
 >               is $seq->start, 8;
 >               is $seq->end, 17;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'b') {
 >               is $seq->start, 32;
 >               is $seq->end, 40;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'c') {
 >               is $seq->start, 53;
 >               is $seq->end, 61;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'd') {
 >               is $seq->start, 81;
 >               is $seq->end, 90;
 >               is $seq->seq, 'gatcatca';
 >       }
 >       elsif ($seq->id eq 'e') {
 >               is $seq->start, 101;
 >               is $seq->end, 110;
 >               is $seq->seq, 'gatcatca';
 >       }
 > }
 >
 > my $f = Bio::LocatableSeq->new(-id => 'f', -seq => 'a-cgatcgatcgat-g', -start => 30, -end => 43);
 > $aln = Bio::SimpleAlign->new();
 > $aln->add_seq($a);
 > $aln->add_seq($f);
 >
 > $gapless = $aln->remove_gaps();
 > foreach my $seq ($gapless->each_seq) {
 >       if ($seq->id eq 'a') {
 >               is $seq->start, 5;
 >               is $seq->end, 20;
 >               is $seq->seq, 'acgatcgatcgatg';
 >       }
 >       elsif ($seq->id eq 'f') {
 >               is $seq->start, 30;
 >               is $seq->end, 43;
 >               is $seq->seq, 'acgatcgatcgatg';
 >       }
 > }


But they don't. Once you remove certain columns the start and stop of 
the sequences in the alignment are no longer correct coordinates for the 
sub-sequence in the original sequence.

I propose the following patch to resolve this issue:

diff -r1.136 SimpleAlign.pm
1116c1116,1118
<
---
 >
 >     my $gap = $self->gap_char;
 >
1129,1137c1131,1147
<             my $spliced;
<             $spliced .= $start > 0 ? substr($sequence,0,$start) : '';
<             $spliced .= substr($sequence,$end+1,$seq->length-$end+1);
<             $sequence = $spliced;
<             if ($start == 1) {
<               $new_seq->start($end);
<             }
<             else {
<               $new_seq->start( $seq->start);
---
 >             my $orig = $sequence;
 >             my $head =  $start > 0 ? substr($sequence, 0, $start) : '';
 >             my $tail = ($end + 1) >= length($sequence) ? '' : substr($sequence, $end + 1);
 >             $sequence = $head.$tail;
 >             # start
 >             unless (defined $new_seq->start) {
 >                 if ($start == 0) {
 >                     my $start_adjust = () = substr($orig, 0, $end + 1) =~ /$gap/g;
 >                     $new_seq->start($seq->start + $end + 1 - $start_adjust);
 >                 }
 >                 else {
 >                     my $start_adjust = $orig =~ /$gap+/;
 >                     if ($start_adjust) {
 >                         $start_adjust = $+[0] - 1 < $start;
 >                     }
 >                     $new_seq->start($seq->start + $start_adjust);
 >                 }
1140,1141c1150,1152
<             if($end >= $seq->end){
<              $new_seq->end( $start);
---
 >             if (($end + 1) >= length($orig)) {
 >                 my $end_adjust = () = substr($orig, $start) =~ /$gap/g;
 >                 $new_seq->end($seq->end - (length($orig) - $start) + $end_adjust);
1144c1155
<              $new_seq->end($seq->end);
---
 >                 $new_seq->end($seq->end);
1148c1159
<                 push @new, $new_seq;
---
 >               push @new, $new_seq;
1207,1209c1218,1234
<       # sort the positions to remove columns at the end 1st
<       @$positions = sort { $b->[0] <=> $a->[0] } @$positions;
<       $aln = $self->_remove_col($aln,$positions);
---
 >       # sort the positions
 >       @$positions = sort { $a->[0] <=> $b->[0] } @$positions;
 >
 >     my @remove;
 >     my $length = 0;
 >     foreach my $pos (@{$positions}) {
 >         my ($start, $end) = @{$pos};
 >
 >         #have to offset the start and end for subsequent removes
 >         $start-=$length;
 >         $end  -=$length;
 >         $length += ($end-$start+1);
 >         push @remove, [$start,$end];
 >     }
 >
 >     #remove the segments
 >     $aln = $#remove >= 0 ? $self->_remove_col($aln,\@remove) : $self;


This breaks 2 tests in SimpleAlign.t, but as far as I can tell, those 
tests expect the wrong answer. Changed to expect the correct answer, 
SimpleAlign.t and all other tests in the test suite pass.

diff -r1.56 SimpleAlign.t
214,215c214,215
<       "P84139/1-33              NEGEHQIKLDELFEKLLRARLIFKNKDVLRRC\n".
<       "P814153/1-33             NEGMHQIKLDVLFEKLLRARLIFKNKDVLRRC\n".
---
 >       "P84139/2-33              NEGEHQIKLDELFEKLLRARLIFKNKDVLRRC\n".
 >       "P814153/2-33             NEGMHQIKLDVLFEKLLRARLIFKNKDVLRRC\n".
229c229
<       "gb|443893|124775/1-32    -RFRIKVPPAVEGARPALLIFKSRPELGC\n",
---
 >       "gb|443893|124775/2-32    -RFRIKVPPAVEGARPALLIFKSRPELGC\n",


Can someone triple-check my thinking and report back please?

Cheers,
Sendu.


From basu at pharm.sunysb.edu  Tue Aug 14 15:02:06 2007
From: basu at pharm.sunysb.edu (Siddhartha Basu)
Date: Tue, 14 Aug 2007 11:02:06 -0400
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
Message-ID: <46C1C3EE.4030006@pharm.sunysb.edu>

neeti somaiya wrote:
> Hi Andrew,
> 
> I think the homologene data files have changed now on the ftp, from what you
> had used.
> It is now homologene.data and homologene.xml.
> I tried using your parser, but because it was written on the file
> hmlg.trip.ftp, it doesnt work anymore.
> 
> I came across a parser
> http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
> .
> I am looking at it to see if it works for me. NOt sure if it will.
> 
> ~Neeti.

Hi Neeti,
I recently wrote a parser for 'homologene' xml data, specific to my own
purpose. I am not sure whether it will suit your purpose, but it could
be extended for general-purpose parsing, so I am putting it forward.
Here is how it works:

* It only parses a single homologene entry <HG-Entry>.....</HG-Entry>.
* It does SAX-based parsing (currently uses XML::SAX::ExpatXS).
* It returns a graph object (using Perl's Graph module) where each node is a
homologue entry with its corresponding Entrez Gene id. Each node also
contains the following attributes:
	* RefSeq protein id.
	* Protein id (pid)
	* NCBI taxon id.
* The edge attribute contains information about the ortholog (true/false)
relationship between two nodes.
* The rest of the tags are currently not extracted. However, parsing
them should not be very difficult.

Generally I get the homologene xml stream from an 'efetch' through
Bio::DB::EUtilities, feed it to the parser, get back a 'Graph' object, and
then work on it.

So, to make it more generic and work on a local file:

* We need another class that reads the chunk between
<HG-Entry>.....</HG-Entry> and sends it to the parser.
* Add support for most of the tags.
* Massage the data into a bioperl-compatible object.

The first two I could work out; for the last one I have to figure
out which bioperl object would be suitable (like Bio::Cluster or
Bio::Network::Node/Edge).
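
The chunk-reading step for a local file could be sketched roughly as
below. This is only an illustration under assumptions, not the parser
described above: hg_entry_iterator is a hypothetical name, the tag names
are taken from this message, and it uses nothing beyond core Perl. It
sets the input record separator to the closing tag so each read returns
at most one entry.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that yields one
# <HG-Entry>...</HG-Entry> chunk per call, or undef at end of input.
sub hg_entry_iterator {
    my ($fh) = @_;
    my ($open, $close) = ('<HG-Entry>', '</HG-Entry>');
    return sub {
        local $/ = $close;                  # read up to each closing tag
        while (defined(my $record = <$fh>)) {
            my $from = index($record, $open);
            next if $from < 0;              # header/trailer text, no entry
            # skip a truncated final record that lacks the closing tag
            next unless substr($record, -length($close)) eq $close;
            return substr($record, $from);  # drop anything before the tag
        }
        return;
    };
}

# Usage with an in-memory filehandle; the inner tag is illustrative only.
my $xml = '<HG-EntrySet><HG-EntrySet_entries>'
        . '<HG-Entry><HG-Entry_hg-id>3</HG-Entry_hg-id></HG-Entry>'
        . '<HG-Entry><HG-Entry_hg-id>5</HG-Entry_hg-id></HG-Entry>'
        . '</HG-EntrySet_entries></HG-EntrySet>';
open my $fh, '<', \$xml or die "cannot open in-memory handle: $!";
my $next_entry = hg_entry_iterator($fh);
while (defined(my $chunk = $next_entry->())) {
    print length($chunk), " characters in this entry\n";
}
```

Each chunk could then be handed to the SAX parser one at a time, so the
whole homologene.xml file never has to be held in memory.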

Let me know if it sounds interesting and I will send you the code.

-siddhartha


> 
> On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
>> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>>
>>> Hi,
>>>
>>> Does anyone know of any Homologene parser, if available?
>>> Please let me know.
>>>
>>> Thanks and Regards,
>>> Neeti.
>> Hi Neeti,
>>
>> Quite a long time ago now I wrote an Homologene parser and posted it
>> to the mailing list:
>>
>> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
>>
>> I don't know if this still works but you could use it as a starting
>> point. There may also be something newer out there too, I don't know.
>> If you search the mailing list archives you'll get a few messages
>> around the topic.
>>
>> Cheers, Andrew.
>>
>>
>> Andrew Macgregor
>> Centre for Comparative Genomics, Murdoch University
>> Email: amacgregor at ccg.murdoch.edu.au
>> Tel: (08) 9360 2961
>>
>>
>>
>>
> 
> 


From cjfields at uiuc.edu  Tue Aug 14 16:33:31 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 14 Aug 2007 11:33:31 -0500
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
	columns?
In-Reply-To: <46C1C2D9.6050409@sendu.me.uk>
References: <46C1C2D9.6050409@sendu.me.uk>
Message-ID: <B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>

Could you attach the scripts and patches to a bug report for tracking  
so anyone interested can double-check?  Having them in an email is  
problematic as the text in some clients wraps.

 From what I'm seeing I think we're in general agreement, though I'll  
reason through it to see if I'm following correctly.  The data in the  
SimpleAlign example you give is this:

a/5-20            atcgatcgatcgatcg
b/30-43           -tcgatc-atcgatcg
c/50-63           atcgatcgatc-atc-
                    ****** *** ***

Removing the gaps gives:

a/5-20            tcgatcatcatc
b/30-43           tcgatcatcatc
c/50-63           tcgatcatcatc
                   ************

The start/end is wrong, as you state.  Adjusting to map simple
start/ends to the original sequence won't work as we're removing gaps
and residues in the LocatableSeqs along with it (ends and internal
residues).  I guess if we want to map back to the original sequence
accurately we would have to use split locations (not currently
implemented with LocatableSeq) or maybe a cigar-like syntax against
consensus (ugh), otherwise we wouldn't know where to map the relevant
internal gaps (now missing from the alignment) w/o running a local
alignment against the original sequence:

a/6-11;13-15;17-19  tcgatcatcatc
b/30-38;40-42       tcgatcatcatc
c/51-56;58-63       tcgatcatcatc
                    ************

That could get really hairy for long alignments.  We could also  
return multiple SimpleAligns which map correctly (ugh), but what we  
really want (and the API specifies) is a new single SimpleAlign.
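
For illustration, the split-location idea can be sketched without any
BioPerl objects. The helper below is hypothetical (split_location is not
a LocatableSeq or SimpleAlign method); it assumes 1-based residue
coordinates, 0-based alignment columns, and '-' as the default gap
character. It walks one aligned row and collects the coordinate runs
that survive once the listed columns are removed.

```perl
use strict;
use warnings;

# Hypothetical helper: surviving residue ranges of one aligned row,
# returned as a string such as "30-38;40-42".
# $drop is an array ref of 0-based column indices that are being removed.
sub split_location {
    my ($seq, $start, $drop, $gap) = @_;
    $gap = '-' unless defined $gap;
    my %dropped = map { $_ => 1 } @$drop;

    my @ranges;                      # list of [first, last] coordinate pairs
    my $coord = $start - 1;          # coordinate of the last residue seen
    my @chars = split //, $seq;
    for my $col (0 .. $#chars) {
        next if $chars[$col] eq $gap;    # gaps never consume a coordinate
        $coord++;
        next if $dropped{$col};          # residue is removed with its column
        if (@ranges && $ranges[-1][1] == $coord - 1) {
            $ranges[-1][1] = $coord;     # extends the current run
        }
        else {
            push @ranges, [$coord, $coord];
        }
    }
    return join ';', map { "$_->[0]-$_->[1]" } @ranges;
}

# Columns 0, 7, 11 and 15 hold a gap in at least one of rows a, b, c.
my @gap_cols = (0, 7, 11, 15);
print 'b/', split_location('-tcgatc-atcgatcg', 30, \@gap_cols), "\n";  # b/30-38;40-42
print 'c/', split_location('atcgatcgatc-atc-', 50, \@gap_cols), "\n";  # c/51-56;58-63
```

The number of ranges grows with the number of removed columns, which is
where this representation gets unwieldy for long alignments.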

It may come down to simply stating it 'voids the warranty' (so to
speak) when modifications are made to alignments which remove/insert
residues from LocatableSeqs via remove_gaps/remove_columns or
similar, and either leave as is with relevant warnings or readjust
start/end appropriately when LocatableSeq residues change.

gapless_a/1-12    tcgatcatcatc
gapless_b/1-12    tcgatcatcatc
gapless_c/1-12    tcgatcatcatc
                   ************

Not sure which is the best approach but anything would be better than  
giving an unexpectedly incorrect answer.

chris

On Aug 14, 2007, at 9:57 AM, Sendu Bala wrote:

> I'm looking at what looks like a pretty major bug in Bio::SimpleAlign,
> but before I commit the fix I wanted to check my sanity/understanding.
>
> My understanding is that an alignment may be built from just sub-parts
> of a number of sequences. So you give each sequence in the alignment a
> start and stop so you can later map back the aligned region to the
> original sequence. So, for example, the following should all pass:
>
> diff -r1.56 SimpleAlign.t
> 459a460,540
>>
>>
>> # is _remove_col really working correctly?
>> my $a = Bio::LocatableSeq->new(-id => 'a', -seq =>
> 'atcgatcgatcgatcg', -start => 5, -end => 20);
>> my $b = Bio::LocatableSeq->new(-id => 'b', -seq =>
> '-tcgatc-atcgatcg', -start => 30, -end => 43);
>> my $c = Bio::LocatableSeq->new(-id => 'c', -seq =>
> 'atcgatcgatc-atc-', -start => 50, -end => 63);
>> my $d = Bio::LocatableSeq->new(-id => 'd', -seq =>
> '--cgatcgatcgat--', -start => 80, -end => 91);
>> my $e = Bio::LocatableSeq->new(-id => 'e', -seq =>
> '-t-gatcgatcga-c-', -start => 100, -end => 111);
>> $aln = Bio::SimpleAlign->new();
>> $aln->add_seq($a);
>> $aln->add_seq($b);
>> $aln->add_seq($c);
>>
>> my $gapless = $aln->remove_gaps();
>> foreach my $seq ($gapless->each_seq) {
>>       if ($seq->id eq 'a') {
>>               is $seq->start, 6;
>>               is $seq->end, 19;
>>               is $seq->seq, 'tcgatcatcatc';
>>       }
>>       elsif ($seq->id eq 'b') {
>>               is $seq->start, 30;
>>               is $seq->end, 42;
>>               is $seq->seq, 'tcgatcatcatc';
>>       }
>>       elsif ($seq->id eq 'c') {
>>               is $seq->start, 51;
>>               is $seq->end, 63;
>>               is $seq->seq, 'tcgatcatcatc';
>>       }
>> }
>>
>> $aln->add_seq($d);
>> $aln->add_seq($e);
>> $gapless = $aln->remove_gaps();
>> foreach my $seq ($gapless->each_seq) {
>>       if ($seq->id eq 'a') {
>>               is $seq->start, 8;
>>               is $seq->end, 17;
>>               is $seq->seq, 'gatcatca';
>>       }
>>       elsif ($seq->id eq 'b') {
>>               is $seq->start, 32;
>>               is $seq->end, 40;
>>               is $seq->seq, 'gatcatca';
>>       }
>>       elsif ($seq->id eq 'c') {
>>               is $seq->start, 53;
>>               is $seq->end, 61;
>>               is $seq->seq, 'gatcatca';
>>       }
>>       elsif ($seq->id eq 'd') {
>>               is $seq->start, 81;
>>               is $seq->end, 90;
>>               is $seq->seq, 'gatcatca';
>>       }
>>       elsif ($seq->id eq 'e') {
>>               is $seq->start, 101;
>>               is $seq->end, 110;
>>               is $seq->seq, 'gatcatca';
>>       }
>> }
>>
>> my $f = Bio::LocatableSeq->new(-id => 'f', -seq =>
> 'a-cgatcgatcgat-g', -start => 30, -end => 43);
>> $aln = Bio::SimpleAlign->new();
>> $aln->add_seq($a);
>> $aln->add_seq($f);
>>
>> $gapless = $aln->remove_gaps();
>> foreach my $seq ($gapless->each_seq) {
>>       if ($seq->id eq 'a') {
>>               is $seq->start, 5;
>>               is $seq->end, 20;
>>               is $seq->seq, 'acgatcgatcgatg';
>>       }
>>       elsif ($seq->id eq 'f') {
>>               is $seq->start, 30;
>>               is $seq->end, 43;
>>               is $seq->seq, 'acgatcgatcgatg';
>>       }
>> }
>
>
> But they don't. Once you remove certain columns the start and stop of
> the sequences in the alignment are no longer correct coordinates  
> for the
> sub-sequence in the original sequence.
>
> I propose the following patch to resolve this issue:
>
> diff -r1.136 SimpleAlign.pm
> 1116c1116,1118
> <
> ---
>>
>>     my $gap = $self->gap_char;
>>
> 1129,1137c1131,1147
> <             my $spliced;
> <             $spliced .= $start > 0 ? substr($sequence,0,$start) :  
> '';
> <             $spliced .= substr($sequence,$end+1,$seq->length-$end 
> +1);
> <             $sequence = $spliced;
> <             if ($start == 1) {
> <               $new_seq->start($end);
> <             }
> <             else {
> <               $new_seq->start( $seq->start);
> ---
>>             my $orig = $sequence;
>>             my $head =  $start > 0 ? substr($sequence, 0,  
>> $start) : '';
>>             my $tail = ($end + 1) >= length($sequence) ? '' :
> substr($sequence, $end + 1);
>>             $sequence = $head.$tail;
>>             # start
>>             unless (defined $new_seq->start) {
>>                 if ($start == 0) {
>>                     my $start_adjust = () = substr($orig, 0, $end +
> 1) =~ /$gap/g;
>>                     $new_seq->start($seq->start + $end + 1 -
> $start_adjust);
>>                 }
>>                 else {
>>                     my $start_adjust = $orig =~ /$gap+/;
>>                     if ($start_adjust) {
>>                         $start_adjust = $+[0] - 1 < $start;
>>                     }
>>                     $new_seq->start($seq->start + $start_adjust);
>>                 }
> 1140,1141c1150,1152
> <             if($end >= $seq->end){
> <              $new_seq->end( $start);
> ---
>>             if (($end + 1) >= length($orig)) {
>>                 my $end_adjust = () = substr($orig, $start) =~ / 
>> $gap/g;
>>                 $new_seq->end($seq->end - (length($orig) - $start) +
> $end_adjust);
> 1144c1155
> <              $new_seq->end($seq->end);
> ---
>>                 $new_seq->end($seq->end);
> 1148c1159
> <                 push @new, $new_seq;
> ---
>>               push @new, $new_seq;
> 1207,1209c1218,1234
> <       # sort the positions to remove columns at the end 1st
> <       @$positions = sort { $b->[0] <=> $a->[0] } @$positions;
> <       $aln = $self->_remove_col($aln,$positions);
> ---
>>       # sort the positions
>>       @$positions = sort { $a->[0] <=> $b->[0] } @$positions;
>>
>>     my @remove;
>>     my $length = 0;
>>     foreach my $pos (@{$positions}) {
>>         my ($start, $end) = @{$pos};
>>
>>         #have to offset the start and end for subsequent removes
>>         $start-=$length;
>>         $end  -=$length;
>>         $length += ($end-$start+1);
>>         push @remove, [$start,$end];
>>     }
>>
>>     #remove the segments
>>     $aln = $#remove >= 0 ? $self->_remove_col($aln,\@remove) : $self;
>
>
> This breaks 2 tests in SimpleAlign.t, but as far as I can tell, those
> tests expect the wrong answer. Changed to expect the correct answer,
> SimpleAlign.t and all other tests in the test suite pass.
>
> diff -r1.56 SimpleAlign.t
> 214,215c214,215
> <       "P84139/1-33              NEGEHQIKLDELFEKLLRARLIFKNKDVLRRC\n".
> <       "P814153/1-33             NEGMHQIKLDVLFEKLLRARLIFKNKDVLRRC\n".
> ---
>>       "P84139/2-33              NEGEHQIKLDELFEKLLRARLIFKNKDVLRRC\n".
>>       "P814153/2-33             NEGMHQIKLDVLFEKLLRARLIFKNKDVLRRC\n".
> 229c229
> <       "gb|443893|124775/1-32    -RFRIKVPPAVEGARPALLIFKSRPELGC\n",
> ---
>>       "gb|443893|124775/2-32    -RFRIKVPPAVEGARPALLIFKSRPELGC\n",
>
>
> Can someone triple-check my thinking and report back please?
>
> Cheers,
> Sendu.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Tue Aug 14 17:13:30 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 14 Aug 2007 18:13:30 +0100
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
 columns?
In-Reply-To: <B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>
References: <46C1C2D9.6050409@sendu.me.uk>
	<B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>
Message-ID: <46C1E2BA.8060606@sendu.me.uk>

Chris Fields wrote:
> Could you attach the scripts and patches to a bug report for tracking
> so anyone interested can double-check?  Having them in an email is 
> problematic as the text in some clients wraps.

http://bugzilla.open-bio.org/show_bug.cgi?id=2344


> From what I'm seeing I think we're in general agreement, though I'll
>  reason through it to see if I'm following correctly.  The data in
> the SimpleAlign example you give is this:
> 
> a/5-20            atcgatcgatcgatcg
> b/30-43           -tcgatc-atcgatcg
> c/50-63           atcgatcgatc-atc-
>                    ****** *** ***
> 
> Removing the gaps gives:
> 
> a/5-20            tcgatcatcatc
> b/30-43           tcgatcatcatc
> c/50-63           tcgatcatcatc
>                   ************
> 
> The start/end is wrong, as you state.

Yes. For extra clarity, my thinking is that the correct answer is:

a/6-19            tcgatcatcatc
b/30-42           tcgatcatcatc
c/51-63           tcgatcatcatc
                   ************
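
A minimal standalone sketch of that start/end rule follows. It is not
the SimpleAlign patch itself: remove_gap_columns is a hypothetical
helper, plain hashes stand in for Bio::LocatableSeq objects, all rows
are assumed to have the same length, and '-' is assumed as the default
gap character. The new start and end are simply the original
coordinates of the first and last residue that survive.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Bio::SimpleAlign. Each row is a plain
# hash { id, seq, start } standing in for a Bio::LocatableSeq.
sub remove_gap_columns {
    my ($rows, $gap) = @_;
    $gap = '-' unless defined $gap;

    # A column is dropped if any row has a gap character in it.
    my $width = length $rows->[0]{seq};
    my @drop  = (0) x $width;
    for my $row (@$rows) {
        my @chars = split //, $row->{seq};
        for my $col (0 .. $width - 1) {
            $drop[$col] = 1 if $chars[$col] eq $gap;
        }
    }

    my @out;
    for my $row (@$rows) {
        my @chars = split //, $row->{seq};
        my $coord = $row->{start} - 1;      # coordinate of last residue seen
        my ($first, $last, $kept) = (undef, undef, '');
        for my $col (0 .. $width - 1) {
            next if $chars[$col] eq $gap;   # gaps never consume a coordinate
            $coord++;
            next if $drop[$col];            # residue removed with its column
            $first = $coord unless defined $first;
            $last  = $coord;
            $kept .= $chars[$col];
        }
        push @out, { id => $row->{id}, seq => $kept,
                     start => $first, end => $last };
    }
    return \@out;
}

my $gapless = remove_gap_columns([
    { id => 'a', seq => 'atcgatcgatcgatcg', start => 5  },
    { id => 'b', seq => '-tcgatc-atcgatcg', start => 30 },
    { id => 'c', seq => 'atcgatcgatc-atc-', start => 50 },
]);
printf "%s/%d-%d  %s\n", @{$_}{qw(id start end seq)} for @$gapless;
# a/6-19  tcgatcatcatc
# b/30-42  tcgatcatcatc
# c/51-63  tcgatcatcatc
```

Internal residues removed along the way are deliberately ignored here;
only the outer coordinates are tracked.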


> Adjusting to map simple start/ends to the original sequence won't
> work as we're removing gaps and residues in the LocatableSeqs along
> with it (ends and internal residues).  I guess if we want to map back
> to the original sequence accurately [snip]

What you say in the rest of your discussion is valid and deserves some 
thought/discussion, but for now just getting the start and end correct, 
ignoring any issues with internal residues, seems like a no-brainer.

For my own purposes that is all I need; having removed gaps I only need 
the start and end so I can take that region from each sequence and do a 
new alignment (for example).


BTW. Either my patch isn't quite perfect or there's another related bug 
I'm still tracking down. I'll commit when I've solved that, unless 
someone points out any mistakes in my thinking.


From basu at pharm.stonybrook.edu  Tue Aug 14 16:16:23 2007
From: basu at pharm.stonybrook.edu (Siddhartha Basu)
Date: Tue, 14 Aug 2007 12:16:23 -0400
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
Message-ID: <46C1D557.7090101@pharm.stonybrook.edu>

neeti somaiya wrote:
> Hi Andrew,
> 
> I think the homologene data files have changed now on the ftp, from what you
> had used.
> It is now homologene.data and homologene.xml.
> I tried using your parser, but because it was written on the file
> hmlg.trip.ftp, it doesnt work anymore.
> 
> I came across a parser
> http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
> .
> I am looking at it to see if it works for me. NOt sure if it will.
> 
> ~Neeti.

Hi Neeti,
I have recently written a parser for 'homologene' xml data specific for
my purpose. I am not sure whether it will suit your purpose but it could
be extended for general purpose parsing, so i am putting it forward.
Here is how it works .......

* It only parses a single homologene entry <HG-Entry>.....</HG-Entry>.
* It does SAX-based parsing (currently uses XML::SAX::ExpatXS).
* It returns a graph object (using Perl's Graph module) where each node is a
homologue entry with its corresponding Entrez Gene id. Each node also
contains the following attributes:
	* RefSeq protein id.
	* Protein id (pid).
	* NCBI taxon id.
* The edge attribute contains information about the ortholog (true/false)
relationship between two nodes.
* The rest of the tags are currently not being extracted. However, parsing
the rest of the tags should not be very difficult.

Generally I get the homologene XML stream from an 'efetch' through
Bio::DB::EUtilities, feed it to the parser, get back a 'Graph' object and
then work on it.

So, to make it more generic and work on a local file:

* We need another class that reads the chunk between
<HG-Entry>.....</HG-Entry> and sends it to the parser.
* Add support for most of the tags.
* Massage the data into a bioperl-compatible object.

The first two I could work out, and for the last one I have to figure
out which bioperl object would be suitable (like Bio::Cluster or
Bio::Network::Node/Edge).
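
A rough sketch of the first point (a reader that hands each
<HG-Entry>.....</HG-Entry> chunk to a parser), assuming the XML is well
formed and entries are not nested. The helper name each_hg_entry, the
callback interface and the sample element <HG-Entry_hg-id> are
illustrative assumptions; they are not taken from the parser described
above or from BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): read a HomoloGene XML stream
# and call $handler once for each complete <HG-Entry>...</HG-Entry> chunk.
sub each_hg_entry {
    my ($fh, $handler) = @_;
    my $count = 0;
    local $/ = '</HG-Entry>';    # read up to and including each closing tag
    while (my $chunk = <$fh>) {
        # Keep only the text from the opening tag onwards; the trailing
        # remainder of the file has no opening tag and is skipped.
        next unless $chunk =~ m{(<HG-Entry\b.*</HG-Entry>)}s;
        $handler->($1);
        $count++;
    }
    return $count;
}

# Usage with a tiny in-memory document holding two entries.
my $xml = <<'XML';
<HG-EntrySet>
<HG-Entry><HG-Entry_hg-id>3</HG-Entry_hg-id></HG-Entry>
<HG-Entry><HG-Entry_hg-id>5</HG-Entry_hg-id></HG-Entry>
</HG-EntrySet>
XML
open my $fh, '<', \$xml or die "cannot open in-memory file: $!";

my @ids;
my $n = each_hg_entry($fh, sub {
    my ($entry) = @_;
    # A real handler would pass $entry to the SAX parser instead.
    push @ids, $1 if $entry =~ m{<HG-Entry_hg-id>(\d+)</HG-Entry_hg-id>};
});
print "parsed $n entries: @ids\n";
```

Setting the input record separator to the closing tag keeps memory use
to one entry at a time, which matters for a file the size of
homologene.xml.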

Let me know if it sounds interesting and I will send you the code.

-siddhartha


> 
> On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
>> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
>>
>>> Hi,
>>>
>>> Does anyone know of any Homologene parser, if available?
>>> Please let me know.
>>>
>>> Thanks and Regards,
>>> Neeti.
>> Hi Neeti,
>>
>> Quite a long time ago now I wrote an Homologene parser and posted it
>> to the mailing list:
>>
>> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
>>
>> I don't know if this still works but you could use it as a starting
>> point. There may also be something newer out there too, I don't know.
>> If you search the mailing list archives you'll get a few messages
>> around the topic.
>>
>> Cheers, Andrew.
>>
>>
>> Andrew Macgregor
>> Centre for Comparative Genomics, Murdoch University
>> Email: amacgregor at ccg.murdoch.edu.au
>> Tel: (08) 9360 2961
>>
>>
>>
>>
> 
> 


From cjfields at uiuc.edu  Tue Aug 14 17:19:59 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 14 Aug 2007 12:19:59 -0500
Subject: [Bioperl-l] Should coords be adjusted after removing alignment
	columns?
In-Reply-To: <46C1E2BA.8060606@sendu.me.uk>
References: <46C1C2D9.6050409@sendu.me.uk>
	<B0CBCE00-3C7F-4373-BF5C-4DE573F695C8@uiuc.edu>
	<46C1E2BA.8060606@sendu.me.uk>
Message-ID: <EE515FDC-2223-4D03-B819-3EA909539A61@uiuc.edu>


On Aug 14, 2007, at 12:13 PM, Sendu Bala wrote:
...

>
> Yes. For extra clarity, my thinking is that the correct answer is:
>
> a/6-19            tcgatcatcatc
> b/30-42           tcgatcatcatc
> c/51-63           tcgatcatcatc
>  ...
> What you say in the rest of your discussion is valid and deserves  
> some thought/discussion, but for now just getting the start and end  
> correct, ignoring any issues with internal residues, seems like a  
> no-brainer.
>
> For my own purposes that is all I need; having removed gaps I only  
> need the start and end so I can take that region from each sequence  
> and do a new alignment (for example).

It might be worth addressing the split location issue in the bug  
report before it gets lost in the ether.  Or maybe start a new one as  
an enhancement request.

> BTW. Either my patch isn't quite perfect or there's another related  
> bug I'm still tracking down. I'll commit when I've solved that,  
> unless someone points out any mistakes in my thinking.

Sounds fine by me.

chris


From gyang at plantbio.uga.edu  Tue Aug 14 19:01:07 2007
From: gyang at plantbio.uga.edu (Guojun Yang)
Date: Tue, 14 Aug 2007 15:01:07 -0400
Subject: [Bioperl-l] the most weird thing  I've seen, help please
In-Reply-To: 41A08079-6EEC-4B62-8104-C41E70C03083@uiuc.edu
Message-ID: <20070814190107.4834b14b@dogwood.plantbio.uga.edu>

Hi, all,
I have two subroutines in my code. One runs a remote BLAST and the other
a local BLAST, and that works well.
When I changed the remote BLAST subroutine to a local BLAST, I always got
the following error. I downloaded the preformatted nt database from NCBI,
and both searches work fine when I run blastall -p blastn.... from the
command line. I changed the db name to 'nt' and to 'nt.00', but the same
error message was returned. The error says "program name was not given an
argument", but I apparently gave it there. Can anybody help me? The code
for the two subroutines is very similar:
   
sub search {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  
my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"query");
my $len=$query->length();
@db=('nt.nal');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new( -program =>"blastn",
                                                         -database =>"$db",
                                                         -e =>"$_[1]");
    my $rc = $factory->blastall($query);  
......  
   
   
sub ancestor {
    use Bio::Tools::Run::StandAloneBlast;
    use Bio::SearchIO::blast;  
my $query = Bio::Seq -> new ( -seq=>"$_[0]",
                              -id=>"test");
my $len=$query->length();
my $long_name=$_[1];
my $long_start=$_[2];
my $long_end=$_[3];
@db=('TNDB');
foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new(-program => "blastx",
                                                        -database => "$db",
                                                        -e => 1e-3,
                                                        );
    my $blast_report = $factory->blastall($query);

  
Thanks a lot!  
Guojun Yang  
Department of Plant Biology  
University of Georgia


From zhaodj at ioz.ac.cn  Wed Aug 15 08:05:36 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Wed, 15 Aug 2007 16:05:36 +0800 (CST)
Subject: [Bioperl-l] the most weird thing  I've seen, help please
In-Reply-To: <20070814190107.4834b14b@dogwood.plantbio.uga.edu>
References: <20070814190107.4834b14b@dogwood.plantbio.uga.edu>
Message-ID: <52820.159.226.67.49.1187165136.squirrel@mail.ioz.ac.cn>

Hi Guojun Yang,

I tested your code after modifying part of it. However, I did not
encounter the error. The modified code follows (see below and the
attachment). The code runs without any error on my Windows XP machine
and generates a file named lclblastResult.txt.

In the code I use the NCBI ecoli.nt database instead. Some
parameters are changed without affecting its function.

I think the error may come from another part of your code, so more
details are needed.

-------code starts-------
#sub search {
use Bio::Tools::Run::StandAloneBlast;
use Bio::SearchIO::blast;

#my $query = Bio::Seq -> new ( -seq=>"$_[0]",
#                              -id=>"query");
my $query=Bio::Seq->new(-seq=>"ctgtattctgggatgca");
my $len=$query->length();

#@db=('nt.nal');
#foreach my $db (@db) {
    my $factory = Bio::Tools::Run::StandAloneBlast->new( -program  =>"blastn",
                                                         -database =>'D:/blast/bin/ecoli.nt',
                                                         -e        =>1,
                                                         -o        =>'lclblastResult.txt');
my $rc = $factory->blastall($query);
-----code ends--------



-- 
De-Jian Zhao
Institute of Zoology,Chinese Academy of Sciences
+86-10-64807217
zhaodj at ioz.ac.cn


-------------- next part --------------
A non-text attachment was scrubbed...
Name: lclblast.pl
Type: application/octet-stream
Size: 644 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070815/f40b2950/attachment-0004.obj>

From tania.oh at brasenose.oxford.ac.uk  Wed Aug 15 16:05:15 2007
From: tania.oh at brasenose.oxford.ac.uk (Tania Oh)
Date: Wed, 15 Aug 2007 17:05:15 +0100
Subject: [Bioperl-l] exonerate parser in bioperl-live fails when protein2dna
	comparison is performed
Message-ID: <AA5E6FAF-A635-4F6C-99CF-82F6589C677B@bnc.ox.ac.uk>

Dear All,

I was trying to use the Bio::SearchIO::Alignment::Exonerate module to
run and parse my exonerate output. But I've noticed that the parser,
which is actually Bio::SearchIO::exonerate, works when the model used in
Exonerate is --model est2genome. I used exonerate with
--model protein2dna and the parser was unable to parse the HSPs.


Below is a simple piece of code I used for testing the output from exonerate:

use Bio::SearchIO;
use strict;

my $searchio = Bio::SearchIO->new(-file   => 'test_data/exonerate.output.dontwork',
                                  -format => 'exonerate');

while( my $r = $searchio->next_result ) {
    while( my $hit = $r->next_hit ) {
        while( my $hsp = $hit->next_hsp ) {
            print $hsp->start. "\t". $hsp->end. "\n";
        }
    }

    print $r->query_name, "\n";
}

-------------- next part --------------
A non-text attachment was scrubbed...
Name: exonerate.output.works
Type: application/octet-stream
Size: 6056 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070815/e4e43d75/attachment-0008.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: exonerate.output.dontwork
Type: application/octet-stream
Size: 3283 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070815/e4e43d75/attachment-0009.obj>


There are 2 files attached to show the examples of using either the  
est2genome or protein2dna model:
1. exonerate.output.works  - produced from the command line:
exonerate -q exonerate_cdna.fa -t exonerate_genomic.fa --model  
est2genome --bestn 1 > exonerate.output.works

2. exonerate.output.dontwork - produced from the command line:
exonerate -q test_aa.fa -t test_cds.fa --model protein2dna >  
exonerate.output.dontwork


Line 239 in Bio::SearchIO::exonerate (cut and pasted below)

elsif(  s/^vulgar:\s+(\S+)\s+         # query sequence id
                  (\d+)\s+(\d+)\s+([\-\+])\s+   # query start-end-strand
                  (\S+)\s+                      # target sequence id
                  (\d+)\s+(\d+)\s+([\-\+])\s+   # target start-end-strand
                  (\d+)\s+                      # score
                  //ox ) {

parses the vulgar line of an --model est2genome exonerate output  
well. An example of the (complex) vulgar line which I've truncated  
for readability is:
vulgar: MUSSPSYN 3 1279 + 4.143962167-143965267 28 3074 + 6137 M 8 8  
G 0 1 M 231 231 5 0 2 I 0 253 3 0

whereas the vulgar line I've obtained from a --model protein2dna  
exonerate output is much simpler and the parser fails to pick it up:
vulgar: SJCHGC00851 0 204 . SJCHGC00851 2 614 + 1059 M 204 612
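
One visible difference between the two vulgar lines is the query strand:
the protein2dna line has '.' where the est2genome line has '+', and the
character class [\-\+] does not accept '.'. The sketch below is only an
illustration of that point, not a tested patch for
Bio::SearchIO::exonerate; the sub name parse_vulgar_header is
hypothetical, and whether the rest of the parser copes with a '.' strand
is an open assumption.

```perl
use strict;
use warnings;

# Sketch only, not the shipped parser: the header pattern quoted above,
# with both strand fields widened to accept '.' as well as '+' and '-'.
my $vulgar_re = qr{^vulgar:\s+(\S+)\s+            # query sequence id
                   (\d+)\s+(\d+)\s+([-+.])\s+     # query start, end, strand
                   (\S+)\s+                       # target sequence id
                   (\d+)\s+(\d+)\s+([-+.])\s+     # target start, end, strand
                   (\d+)\s+                       # score
                  }x;

# Hypothetical helper: returns the nine captured fields, or an empty
# list when the line is not a vulgar header.
sub parse_vulgar_header {
    my ($line) = @_;
    return $line =~ $vulgar_re;
}

for my $line (
    'vulgar: MUSSPSYN 3 1279 + 4.143962167-143965267 28 3074 + 6137 M 8 8',
    'vulgar: SJCHGC00851 0 204 . SJCHGC00851 2 614 + 1059 M 204 612',
) {
    if (my @f = parse_vulgar_header($line)) {
        # query id, query strand, target id, target strand, score
        print join("\t", @f[0, 3, 4, 7, 8]), "\n";
    }
    else {
        print "no match: $line\n";
    }
}
```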

Has anyone encountered this situation before? I've not changed the
parser as exonerate is widely used for its est2genome model, and
thought I'd run it past the list to see if there is a workaround.

many thanks in advance,
tania


From johnsonmar at mail.nih.gov  Wed Aug 15 16:47:10 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 12:47:10 -0400
Subject: [Bioperl-l] Need assistance with make error
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>

I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
Enterprise Linux 4, and the other running RHEL3.  I'm getting the
following 'make Error 255' when running make test.  I'm not sure what
this error indicates, and whether I should continue with a force
install?  Could you please advise.

 
Failed Test        Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/BioFetch_DB.t                  27    1   3.70%  8
t/EMBL_DB.t                      15    3  20.00%  6 13-14
t/Ontology.t          9  2304    50  100 200.00%  1-50
t/TreeIO.t                       41    1   2.44%  42
t/Variation_IO.t                 25    3  12.00%  15 20 25
t/simpleGOparser.t    9  2304    98  196 200.00%  1-98
120 subtests skipped.
Failed 6/179 test scripts, 96.65% okay. 154/8268 subtests failed, 98.14% okay.
make: *** [test_dynamic] Error 255
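
A note on the Stat and Wstat columns above, assuming they follow the
usual Test::Harness convention in which Wstat is the raw wait status of
the test script (as in Perl's $?) and Stat is the exit code derived from
it. Under that assumption the two values for t/Ontology.t and
t/simpleGOparser.t agree with each other, as this small sketch shows
(decode_wstat is a hypothetical helper name):

```perl
use strict;
use warnings;

# Split a raw wait status (as in Perl's $?) into exit code and signal number.
sub decode_wstat {
    my ($wstat) = @_;
    return ($wstat >> 8, $wstat & 127);
}

my ($exit, $signal) = decode_wstat(2304);
print "exit=$exit signal=$signal\n";    # prints "exit=9 signal=0"
```

So those two scripts exited with a non-zero status of their own rather
than being killed by a signal.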

 
Thanks,

 
Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 
From arareko at campus.iztacala.unam.mx  Wed Aug 15 17:45:39 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 15 Aug 2007 12:45:39 -0500
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
Message-ID: <46C33BC3.9000409@campus.iztacala.unam.mx>

Which version of bioperl are you trying to install?


-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From mbasu at mail.nih.gov  Wed Aug 15 17:55:50 2007
From: mbasu at mail.nih.gov (Malay)
Date: Wed, 15 Aug 2007 13:55:50 -0400
Subject: [Bioperl-l] Developer docs
Message-ID: <46C33E26.2050004@mail.nih.gov>

Hello All:

I apologize for not searching thoroughly, but I'd appreciate it if someone
could point me to a location where I can find any bioperl coding conventions
that I need to follow for any code contribution to Bioperl.

-Malay

-- 
Malay K Basu
www.malaybasu.net


From arareko at campus.iztacala.unam.mx  Wed Aug 15 18:39:29 2007
From: arareko at campus.iztacala.unam.mx (Mauricio Herrera Cuadra)
Date: Wed, 15 Aug 2007 13:39:29 -0500
Subject: [Bioperl-l] Developer docs
In-Reply-To: <46C33E26.2050004@mail.nih.gov>
References: <46C33E26.2050004@mail.nih.gov>
Message-ID: <46C34861.8090400@campus.iztacala.unam.mx>

You may want to bookmark this one:

http://bioperl.org/wiki/Developer_Information#BioPerl_Code

Mauricio.


-- 
MAURICIO HERRERA CUADRA
arareko at campus.iztacala.unam.mx
Laboratorio de Genética
Unidad de Morfofisiología y Función
Facultad de Estudios Superiores Iztacala, UNAM


From johnsonmar at mail.nih.gov  Wed Aug 15 19:01:23 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 15:01:23 -0400
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <46C33BC3.9000409@campus.iztacala.unam.mx>
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805713@NIHCESMLBX11.nih.gov>

This is version 1.4.

Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 


From cjfields at uiuc.edu  Wed Aug 15 20:25:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 15 Aug 2007 15:25:30 -0500
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805713@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805713@NIHCESMLBX11.nih.gov>
Message-ID: <DA0EFC65-4A35-48FA-9280-447654BAFF7F@uiuc.edu>

You'll definitely want to update to the latest (v 1.5.2).  We hope to  
get a new stable release out sometime soon and possibly move to a  
more regular release cycle.

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnsonmar at mail.nih.gov  Wed Aug 15 20:32:43 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 16:32:43 -0400
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <DA0EFC65-4A35-48FA-9280-447654BAFF7F@uiuc.edu>
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>

I saw the 1.5.2 version, but it stated that this was a developer release and that 1.4 was the latest stable version, so I went with 1.4.  I'll give 1.5.2 a try.

Thanks,


Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 


From cjfields at uiuc.edu  Wed Aug 15 20:40:32 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 15 Aug 2007 15:40:32 -0500
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
Message-ID: <E16950D3-9F60-4862-9325-57CA26107649@uiuc.edu>

The term 'stable' is relative in this case; tons of bug fixes were
incorporated in the 1.5.2 release.  There are a few dev-specific
issues we'll need to resolve prior to a new release; once those are
out of the way we'll try to get a new 'stable' out.

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Kevin.M.Brown at asu.edu  Wed Aug 15 20:54:04 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Wed, 15 Aug 2007 13:54:04 -0700
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
References: <DA0EFC65-4A35-48FA-9280-447654BAFF7F@uiuc.edu>
	<EBA7AA82BA858348BAC2FA036AD3D2BF805715@NIHCESMLBX11.nih.gov>
Message-ID: <1A4207F8295607498283FE9E93B775B40386D612@EX02.asurite.ad.asu.edu>

It technically is a developer release, but given the age of the 1.4
release, 1.5.2 is the better choice: it includes fixes for things like
web BLASTs, plus other improvements. I've found the results from the
various objects I've used in my current projects to be reliable.

> I saw the 1.5.2 version, but it stated that this was a 
> developer release and that 1.4 was the latest stable version, 
> so I went with 1.4.  I'll give 1.5.2 a try.
> 
> Thanks,
> 
> 
> Mary Johnson
> 
> Sr. Network Engineer
> 
> National Cancer Institute Center for Bioinformatics 
> Contractor, TerpSys http://www.terpsys.com/
> 
>  
> 
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu]
> Sent: Wednesday, August 15, 2007 4:26 PM
> To: Johnson, Mary (NIH/NCI) [C]
> Cc: Mauricio Herrera Cuadra; bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] Need assistance with make error
> 
> You'll definitely want to update to the latest (v 1.5.2).  We 
> hope to get a new stable release out sometime soon and 
> possibly move to a more regular release cycle.
> 
> chris
> 
> On Aug 15, 2007, at 2:01 PM, Johnson, Mary (NIH/NCI) [C] wrote:
> 
> > This is version 1.4.
> >
> > Mary Johnson
> >
> > Sr. Network Engineer
> >
> > National Cancer Institute Center for Bioinformatics Contractor, 
> > TerpSys http://www.terpsys.com/
> >
> >
> >
> > -----Original Message-----
> > From: Mauricio Herrera Cuadra 
> [mailto:arareko at campus.iztacala.unam.mx]
> > Sent: Wednesday, August 15, 2007 1:46 PM
> > To: Johnson, Mary (NIH/NCI) [C]
> > Cc: bioperl-l at bioperl.org
> > Subject: Re: [Bioperl-l] Need assistance with make error
> >
> > Which version of bioperl are you trying to install?
> >
> > Johnson, Mary (NIH/NCI) [C] wrote:
> >> I'm trying to install bioperl on 2 Linux servers - 1 
> running Redhat 
> >> Enterprise Linux 4, and the other running RHEL3.  I'm getting the 
> >> following 'make Error 255' when running make test.  I'm 
> not sure what 
> >> this error indicates, and whether I should continue with a force 
> >> install?  Could you please advise.
> >>
> >>
> >>
> >>
> >>
> >> Failed Test        Stat Wstat Total Fail  Failed  List of Failed
> >> -------------------------------------------------------------------------------
> >> t/BioFetch_DB.t                  27    1   3.70%  8
> >> t/EMBL_DB.t                      15    3  20.00%  6 13-14
> >> t/Ontology.t          9  2304    50  100 200.00%  1-50
> >> t/TreeIO.t                       41    1   2.44%  42
> >> t/Variation_IO.t                 25    3  12.00%  15 20 25
> >> t/simpleGOparser.t    9  2304    98  196 200.00%  1-98
> >> 120 subtests skipped.
> >> Failed 6/179 test scripts, 96.65% okay. 154/8268 subtests failed, 98.14% okay.
> >>
> >> make: *** [test_dynamic] Error 255
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> Thanks,
> >>
> >>
> >>
> >> Mary Johnson
> >>
> >> Sr. Network Engineer
> >>
> >> National Cancer Institute Center for Bioinformatics Contractor, 
> >> TerpSys http://www.terpsys.com/ <http://www.terpsys.com/>
> >>
> >>
> >>
> >>
> >>
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >
> > --
> > MAURICIO HERRERA CUADRA
> > arareko at campus.iztacala.unam.mx
> > Laboratorio de Gen?tica
> > Unidad de Morfofisiolog?a y Funci?n
> > Facultad de Estudios Superiores Iztacala, UNAM
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From bix at sendu.me.uk  Wed Aug 15 20:50:02 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 15 Aug 2007 21:50:02 +0100
Subject: [Bioperl-l] Developer docs
In-Reply-To: <46C34861.8090400@campus.iztacala.unam.mx>
References: <46C33E26.2050004@mail.nih.gov>
	<46C34861.8090400@campus.iztacala.unam.mx>
Message-ID: <46C366FA.40609@sendu.me.uk>

Mauricio Herrera Cuadra wrote:
> You may want to bookmark this one:
> 
> http://bioperl.org/wiki/Developer_Information#BioPerl_Code

Yup. The important one is
http://bioperl.org/wiki/Bioperl_Best_Practices
which I've just updated with the latest info on writing test scripts.


From bix at sendu.me.uk  Wed Aug 15 20:54:45 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 15 Aug 2007 21:54:45 +0100
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
References: <EBA7AA82BA858348BAC2FA036AD3D2BF805711@NIHCESMLBX11.nih.gov>
Message-ID: <46C36815.5010908@sendu.me.uk>

Johnson, Mary (NIH/NCI) [C] wrote:
> I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
> Enterprise Linux 4, and the other running RHEL3.  I'm getting the
> following 'make Error 255' when running make test.  I'm not sure what
> this error indicates, and whether I should continue with a force
> install?  Could you please advise.

Unless you know you really must install Bioperl 1.4, install 1.5.2 instead.

http://www.bioperl.org/wiki/Release_1.5.2

If you use the Build.PL installation, at the very least you certainly 
won't get a make error.

http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix#PRELIMINARY_PREPARATION


From cjfields at uiuc.edu  Wed Aug 15 21:16:27 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 15 Aug 2007 16:16:27 -0500
Subject: [Bioperl-l] exonerate parser in bioperl-live fails when
	protein2dna comparison is performed
In-Reply-To: <AA5E6FAF-A635-4F6C-99CF-82F6589C677B@bnc.ox.ac.uk>
References: <AA5E6FAF-A635-4F6C-99CF-82F6589C677B@bnc.ox.ac.uk>
Message-ID: <F853DDF2-3165-4F88-A087-744D60682104@uiuc.edu>

I can confirm this with bioperl-live. The Bio::SearchIO::exonerate docs
indicate that protein2genome and est2genome model output is supported,
but they don't specifically say that it can parse any other output.
You can add an enhancement request to bugzilla describing this
deficiency or, if you are inclined, add the functionality yourself
and donate the code.
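
One likely cause, judging only from the two vulgar lines quoted below: the
protein2dna line has '.' in the query strand column, and the [\-\+] class
in the parser's regex does not accept it. A minimal standalone sketch of a
more permissive match follows. parse_vulgar_header is a hypothetical helper
written for illustration, not code from Bio::SearchIO::exonerate, and how a
'.' strand should be mapped onto HSP objects is left open.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::SearchIO::exonerate): parse the
# fixed leading fields of an exonerate "vulgar:" line. The strand class
# accepts '.' as well as '+' and '-'.
sub parse_vulgar_header {
    my ($line) = @_;
    return unless $line =~ m/^vulgar:\s+(\S+)\s+  # query sequence id
                     (\d+)\s+(\d+)\s+([-+.])\s+   # query start, end, strand
                     (\S+)\s+                     # target sequence id
                     (\d+)\s+(\d+)\s+([-+.])\s+   # target start, end, strand
                     (\d+)                        # score
                    /x;
    return {
        query_id      => $1, query_start  => $2,
        query_end     => $3, query_strand => $4,
        target_id     => $5, target_start => $6,
        target_end    => $7, target_strand => $8,
        score         => $9,
    };
}

# The protein2dna line from the original report
my $hdr = parse_vulgar_header(
    'vulgar: SJCHGC00851 0 204 . SJCHGC00851 2 614 + 1059 M 204 612');
print "query strand: $hdr->{query_strand}, score: $hdr->{score}\n" if $hdr;
```

The remaining label/length triplets after the score (M 204 612 and so on)
are not touched here; they would still need the parser's existing handling.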

chris

On Aug 15, 2007, at 11:05 AM, Tania Oh wrote:

> Dear All,
>
> I was trying to use the Bio::SearchIO::Alignment::Exonerate module  
> to run and parse my exonerate output. But I've noticed that the  
> parser which is actually Bio::SearchIO::Exonerate works if the  
> model used in Exonerate is --model est2genome. I used exonerate  
> with the model --model protein2dna and the parser was unable to  
> parse the hsps.
>
>
> Below is a simple piece of code I used for testing the output from
> exonerate:
>
> use strict;
> use warnings;
> use Bio::SearchIO;
>
> my $searchio = Bio::SearchIO->new(-file   => 'test_data/exonerate.output.dontwork',
>                                   -format => 'exonerate');
>
> while ( my $r = $searchio->next_result ) {
>     while ( my $hit = $r->next_hit ) {
>         while ( my $hsp = $hit->next_hsp ) {
>             # print the coordinates of every HSP that was parsed
>             print $hsp->start, "\t", $hsp->end, "\n";
>         }
>     }
>
>     print $r->query_name, "\n";
> }
>
>
> There are 2 files attached to show the examples of using either the  
> est2genome or protein2dna model:
> 1. exonerate.output.works  - produced from the command line:
> exonerate -q exonerate_cdna.fa -t exonerate_genomic.fa --model  
> est2genome --bestn 1 > exonerate.output.works
>
> 2. exonerate.output.dontwork - produced from the command line:
> exonerate -q test_aa.fa -t test_cds.fa --model protein2dna >  
> exonerate.output.dontwork
>
>
> Line 239 in Bio::SearchIO::exonerate (cut and pasted below)
>
> elsif(  s/^vulgar:\s+(\S+)\s+         # query sequence id
>                  (\d+)\s+(\d+)\s+([\-\+])\s+   # query start-end-strand
>                  (\S+)\s+                      # target sequence id
>                  (\d+)\s+(\d+)\s+([\-\+])\s+   # target start-end-strand
>                  (\d+)\s+                      # score
>                  //ox ) {
>
> parses the vulgar line of an --model est2genome exonerate output  
> well. An example of the (complex) vulgar line which I've truncated  
> for readability is:
> vulgar: MUSSPSYN 3 1279 + 4.143962167-143965267 28 3074 + 6137 M 8  
> 8 G 0 1 M 231 231 5 0 2 I 0 253 3 0
>
> whereas the vulgar line I've obtained from a --model protein2dna  
> exonerate output is much simpler and the parser fails to pick it up:
> vulgar: SJCHGC00851 0 204 . SJCHGC00851 2 614 + 1059 M 204 612
>
> Has anyone encountered this situation before? I've not changed the
> parser, as exonerate is widely used for its est2genome model, and
> thought I'd run it past the list to see if there is a workaround.
>
> many thanks in advance,
> tania
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From johnsonmar at mail.nih.gov  Wed Aug 15 21:45:36 2007
From: johnsonmar at mail.nih.gov (Johnson, Mary (NIH/NCI) [C])
Date: Wed, 15 Aug 2007 17:45:36 -0400
Subject: [Bioperl-l] Need assistance with make error
In-Reply-To: <E16950D3-9F60-4862-9325-57CA26107649@uiuc.edu>
Message-ID: <EBA7AA82BA858348BAC2FA036AD3D2BF805716@NIHCESMLBX11.nih.gov>

Version 1.5.2 worked fine!  Thanks to all of you for your quick response.  I wish all of our vendors were that quick in getting back to me:)


Mary Johnson

Sr. Network Engineer

National Cancer Institute Center for Bioinformatics
Contractor, TerpSys
http://www.terpsys.com/

 
-----Original Message-----
From: Chris Fields [mailto:cjfields at uiuc.edu] 
Sent: Wednesday, August 15, 2007 4:41 PM
To: Johnson, Mary (NIH/NCI) [C]
Cc: Mauricio Herrera Cuadra; bioperl-l at bioperl.org
Subject: Re: [Bioperl-l] Need assistance with make error

The term 'stable' is relative in this case; tons of bug fixes were
incorporated in the 1.5.2 release.  There are a few dev-specific
issues we'll need to resolve prior to a new release; once those are
out of the way we'll try to get a new 'stable' out.

chris

On Aug 15, 2007, at 3:32 PM, Johnson, Mary (NIH/NCI) [C] wrote:

> I saw the 1.5.2 version, but it stated that this was a developer  
> release and that 1.4 was the latest stable version, so I went with  
> 1.4.  I'll give 1.5.2 a try.
>
> Thanks,
>
>
> Mary Johnson
>
> Sr. Network Engineer
>
> National Cancer Institute Center for Bioinformatics
> Contractor, TerpSys
> http://www.terpsys.com/
>
>
>
> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu]
> Sent: Wednesday, August 15, 2007 4:26 PM
> To: Johnson, Mary (NIH/NCI) [C]
> Cc: Mauricio Herrera Cuadra; bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] Need assistance with make error
>
> You'll definitely want to update to the latest (v 1.5.2).  We hope to
> get a new stable release out sometime soon and possibly move to a
> more regular release cycle.
>
> chris
>
> On Aug 15, 2007, at 2:01 PM, Johnson, Mary (NIH/NCI) [C] wrote:
>
>> This is version 1.4.
>>
>> Mary Johnson
>>
>> Sr. Network Engineer
>>
>> National Cancer Institute Center for Bioinformatics
>> Contractor, TerpSys
>> http://www.terpsys.com/
>>
>>
>>
>> -----Original Message-----
>> From: Mauricio Herrera Cuadra  
>> [mailto:arareko at campus.iztacala.unam.mx]
>> Sent: Wednesday, August 15, 2007 1:46 PM
>> To: Johnson, Mary (NIH/NCI) [C]
>> Cc: bioperl-l at bioperl.org
>> Subject: Re: [Bioperl-l] Need assistance with make error
>>
>> Which version of bioperl are you trying to install?
>>
>> Johnson, Mary (NIH/NCI) [C] wrote:
>>> I'm trying to install bioperl on 2 Linux servers - 1 running Redhat
>>> Enterprise Linux 4, and the other running RHEL3.  I'm getting the
>>> following 'make Error 255' when running make test.  I'm not sure  
>>> what
>>> this error indicates, and whether I should continue with a force
>>> install?  Could you please advise.
>>>
>>>
>>>
>>>
>>>
>>> Failed Test        Stat Wstat Total Fail  Failed  List of Failed
>>> -------------------------------------------------------------------------------
>>> t/BioFetch_DB.t                  27    1   3.70%  8
>>> t/EMBL_DB.t                      15    3  20.00%  6 13-14
>>> t/Ontology.t          9  2304    50  100 200.00%  1-50
>>> t/TreeIO.t                       41    1   2.44%  42
>>> t/Variation_IO.t                 25    3  12.00%  15 20 25
>>> t/simpleGOparser.t    9  2304    98  196 200.00%  1-98
>>> 120 subtests skipped.
>>> Failed 6/179 test scripts, 96.65% okay. 154/8268 subtests failed, 98.14% okay.
>>>
>>> make: *** [test_dynamic] Error 255
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Mary Johnson
>>>
>>> Sr. Network Engineer
>>>
>>> National Cancer Institute Center for Bioinformatics
>>> Contractor, TerpSys
>>> http://www.terpsys.com/ <http://www.terpsys.com/>
>>>
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>> -- 
>> MAURICIO HERRERA CUADRA
>> arareko at campus.iztacala.unam.mx
>> Laboratorio de Gen?tica
>> Unidad de Morfofisiolog?a y Funci?n
>> Facultad de Estudios Superiores Iztacala, UNAM
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From neetisomaiya at gmail.com  Thu Aug 16 04:22:18 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 16 Aug 2007 09:52:18 +0530
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <46C1D557.7090101@pharm.stonybrook.edu>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<764978cf0708140624s5c198b5akee38bf98866fd7f2@mail.gmail.com>
	<46C1D557.7090101@pharm.stonybrook.edu>
Message-ID: <764978cf0708152122oba56e13qef83544cdde7e795@mail.gmail.com>

Hi Siddhartha,

Thanks a lot for your mail.
It would be great if you could send me your parser, I will see how I can
modify it for my purpose.

Thanks and Regards,
Neeti.

On 8/14/07, Siddhartha Basu <basu at pharm.stonybrook.edu> wrote:
>
> neeti somaiya wrote:
> > Hi Andrew,
> >
> > I think the homologene data files on the ftp have changed from what you
> > had used.
> > It is now homologene.data and homologene.xml.
> > I tried using your parser, but because it was written for the file
> > hmlg.trip.ftp, it doesn't work anymore.
> >
> > I came across a parser:
> > http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
> > I am looking at it to see if it works for me. Not sure if it will.
> > ~Neeti.
>
> Hi Neeti,
> I have recently written a parser for 'homologene' XML data, specific to
> my purpose. I am not sure whether it will suit yours, but it could
> be extended for general-purpose parsing, so I am putting it forward.
> Here is how it works:
>
> * It only parses a single homologene entry <HG-Entry>.....</HG-Entry>.
> * It does SAX-based parsing (currently uses XML::SAX::ExpatXS).
> * It returns a graph object (using Perl's Graph module) in which each node
> is a homologue entry with its corresponding Entrez Gene id. Each node also
> carries the following attributes:
>         * RefSeq protein id.
>         * Protein id (pid).
>         * NCBI taxon id.
> * The edge attributes record the ortholog (true/false)
> relationship between two nodes.
> * The rest of the tags are currently not extracted. However, parsing
> them should not be very difficult.
>
> Generally I get the homologene XML stream from an 'efetch' through
> Bio::DB::EUtilities, feed it to the parser, get back a 'Graph' object and
> then work on it.
>
> So, to make it more generic and work on a local file:
>
> * We need another class that reads the chunk between
> <HG-Entry>.....</HG-Entry> and sends it to the parser.
> * Add support for most of the tags.
> * Massage the data into a bioperl-compatible object.
>
> The first two I could work out; for the last one I have to figure
> out which bioperl object would be suitable (like Bio::Cluster or
> Bio::Network::Node/Edge).
>
> Let me know if it sounds interesting and I will send you the code.
>
> -siddhartha
>
>
> >
> > On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
> >> On 13/08/2007, at 6:29 PM, neeti somaiya wrote:
> >>
> >>> Hi,
> >>>
> >>> Does anyone know of any Homologene parser, if available?
> >>> Please let me know.
> >>>
> >>> Thanks and Regards,
> >>> Neeti.
> >> Hi Neeti,
> >>
> >> Quite a long time ago now I wrote an Homologene parser and posted it
> >> to the mailing list:
> >>
> >> <http://www.bioperl.org/pipermail/bioperl-l/2002-February/007288.html>
> >>
> >> I don't know if this still works but you could use it as a starting
> >> point. There may also be something newer out there too, I don't know.
> >> If you search the mailing list archives you'll get a few messages
> >> around the topic.
> >>
> >> Cheers, Andrew.
> >>
> >>
> >> Andrew Macgregor
> >> Centre for Comparative Genomics, Murdoch University
> >> Email: amacgregor at ccg.murdoch.edu.au
> >> Tel: (08) 9360 2961
> >>
> >>
> >>
> >>
> >
> >
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
-Neeti
Even my blood says, B positive


From neetisomaiya at gmail.com  Thu Aug 16 05:56:21 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 16 Aug 2007 11:26:21 +0530
Subject: [Bioperl-l] PDB Parser
Message-ID: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>

Hi,

After a lot of searching I found this link from where PDB files can be
downloaded :
ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/
Is there any other link where one can download all pdb data from?

I tried using Bio::Structure::IO::pdb with some code like this:
use Bio::Structure::IO;

    my $in = Bio::Structure::IO->new(-file   => "pdb100d.ent",
                                     -format => 'pdb');

    while ( my $struc = $in->next_structure() ) {
       print "Structure ", $struc->id,"\n";
    }

It works well. But I am not able to find documentation of other methods
which will give me various specific details available in a pdb file, right
from title, keywords, references to structure details, atoms, coordinates
etc. There must be different methods to fetch and parse each of this data
from a pdb file, right? Where can I find the details? Any example code of
the same would also be of great use.

Thanks and Regards,
Neeti Somaiya.

-- 
-Neeti
Even my blood says, B positive


From hrh at sanger.ac.uk  Thu Aug 16 08:48:16 2007
From: hrh at sanger.ac.uk (Hans Rudolf Hotz)
Date: Thu, 16 Aug 2007 09:48:16 +0100 (BST)
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
Message-ID: <Pine.LNX.4.64.0708160942310.14241@deskpro50.dynamic.sanger.ac.uk>


On Thu, 16 Aug 2007, neeti somaiya wrote:

> Hi,
>
> After a lot of search I could find this link from where PDB files can be
> downloaded :
> ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/
> Is there any other link where one can download all pdb data from?

try: ftp://pdb.protein.osaka-u.ac.jp/v3/pub/pdb/   or
      ftp://ftp.ebi.ac.uk/pub/databases/rcsb/pdb-remediated/

It is not BioPerl, but James Tisdall's O'Reilly book "Beginning Perl for
Bioinformatics" has a nice introduction to parsing PDB files.
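
As a rough illustration of the column-based approach (my own sketch in
plain Perl, not taken from the book or from BioPerl): ATOM and HETATM
records keep each field at fixed column positions, so substr is enough to
pull out the atom name, residue, chain and coordinates. parse_atom_record
is a hypothetical helper name, and the sample line is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: split one PDB ATOM/HETATM record into named fields
# using the fixed column positions of the PDB format.
sub parse_atom_record {
    my ($line) = @_;
    # Need at least 54 columns to reach the end of the z coordinate
    return unless $line =~ /^(?:ATOM  |HETATM)/ && length($line) >= 54;

    my %atom = (
        serial   => substr( $line,  6, 5 ),    # columns  7-11
        name     => substr( $line, 12, 4 ),    # columns 13-16
        res_name => substr( $line, 17, 3 ),    # columns 18-20
        chain_id => substr( $line, 21, 1 ),    # column  22
        res_seq  => substr( $line, 22, 4 ),    # columns 23-26
        x        => substr( $line, 30, 8 ),    # columns 31-38
        y        => substr( $line, 38, 8 ),    # columns 39-46
        z        => substr( $line, 46, 8 ),    # columns 47-54
    );
    s/^\s+|\s+$//g for values %atom;           # trim the column padding
    return \%atom;
}

# Made-up sample record, laid out in the standard columns
my $line = 'ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N';
if ( my $atom = parse_atom_record($line) ) {
    print join( "\t", @{$atom}{qw(name res_name chain_id x y z)} ), "\n";
}
```

Header records such as TITLE and KEYWDS follow the same fixed-column idea,
but their layouts differ and are not covered by this sketch.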


Regards, Hans


>
> I tried using Bio::Structure::IO::pdb with some code like :-
> use Bio::Structure::IO;
>
>    $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
>                                   -format => 'pdb');
>
>    while ( my $struc = $in->next_structure() ) {
>       print "Structure ", $struc->id,"\n";
>    }
>
> It works well. But I am not able to find documentation of other methods
> which will give me various specific details available in a pdb file, right
> from title, keywords, references to structure details, atoms, coordinates
> etc. There must be different methods to fetch and parse each of this data
> from a pdb file, right? Where can I find the details? Any example code of
> the same would also be of great use.
>
> Thanks and Regards,
> Neeti Somaiya.
>
> -- 
> -Neeti
> Even my blood says, B positive
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>


-- 
The Wellcome Trust Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE.


From neetisomaiya at gmail.com  Thu Aug 16 09:30:42 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 16 Aug 2007 15:00:42 +0530
Subject: [Bioperl-l] Homologene parser?
In-Reply-To: <C762C291-D3D2-4CBC-B5EC-6B6E4935A004@ccg.murdoch.edu.au>
References: <764978cf0708130329r484bf210qd0e6e9ad39274bbc@mail.gmail.com>
	<22FEA28C-1720-4519-B129-B7A5ADA452D4@ccg.murdoch.edu.au>
	<4E7F8A99-68A7-49C2-9919-E2FC5652C8D7@uiuc.edu>
	<C762C291-D3D2-4CBC-B5EC-6B6E4935A004@ccg.murdoch.edu.au>
Message-ID: <764978cf0708160230o4ade944er8c8529199f3a0262@mail.gmail.com>

Hi,

For now I am using the homologene parser available here :-
http://bioinformatics.tgen.org/brunit/software/bioparser/docs/pod_bio_parser_homologene_fileparser_pm.shtml
,
for parsing the homologene.data file. But the README at the ftp site says
homologene.xml has much more data; I have yet to see how to parse that one.
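
For a large local XML file, one possible first step is the chunk reader
Siddhartha mentioned: hand the parser one <HG-Entry>.....</HG-Entry> block
at a time instead of loading the whole file. A minimal sketch in plain
Perl, assuming the opening tag appears exactly as <HG-Entry> with no
attributes; next_hg_entry is a hypothetical helper name and the sample
input is a toy, not real homologene data.

```perl
use strict;
use warnings;

# Hypothetical helper: return the next <HG-Entry>...</HG-Entry> chunk from
# an open filehandle, or undef at end of input.
sub next_hg_entry {
    my ($fh) = @_;
    local $/ = '</HG-Entry>';    # read up to and including each closing tag
    while ( defined( my $chunk = <$fh> ) ) {
        # Keep only the text from the opening tag onwards; a trailing
        # piece with no opening tag is skipped.
        return $1 if $chunk =~ m{(<HG-Entry>.*</HG-Entry>)}s;
    }
    return;
}

# Toy input read through an in-memory filehandle
my $xml = '<set><HG-Entry><id>3</id></HG-Entry>'
        . '<HG-Entry><id>5</id></HG-Entry></set>';
open my $fh, '<', \$xml or die "cannot open in-memory file: $!";
while ( defined( my $entry = next_hg_entry($fh) ) ) {
    print "entry of ", length($entry), " characters\n";
}
close $fh;
```

Each returned chunk could then be passed to whatever XML parser is in use;
this sketch does no XML parsing of its own.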

~Neeti.


On 8/14/07, Andrew Macgregor <amacgregor at ccg.murdoch.edu.au> wrote:
>
> On 14/08/2007, at 11:21 AM, Chris Fields wrote:
>
> > It looks like Heikki responded and thought a good place for it
> > would be Bio::SeqIO, but it didn't go anywhere I suppose.  I see
> > that a few other posts suggest it could be placed in Bio::Cluster
> > as well which I'm not familiar with.  We could add it in if you
> > were still interested, just need to find a good place for it; might
> > be nice to have a Parse::RecDescent-based parser.
> >
> > chris
> >
>
> Hi Chris,
>
> I was also doing some parsing of UniGene at the time but found
> RecDescent was too slow and went back to regexes. That code found
> it's way into Bio::Cluster. Occasionally I see a message with someone
> looking for a Homologene parser but not very often, so I'm not sure
> it is worth the effort of moving the code into bioperl.
>
> Cheers, Andrew.
>


-- 
-Neeti
Even my blood says, B positive


From bix at sendu.me.uk  Thu Aug 16 09:59:08 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 16 Aug 2007 10:59:08 +0100
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
Message-ID: <46C41FEC.2000206@sendu.me.uk>

neeti somaiya wrote:
> I tried using Bio::Structure::IO::pdb with some code like :-
> use Bio::Structure::IO;
> 
>     $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
>                                    -format => 'pdb');
> 
>     while ( my $struc = $in->next_structure() ) {
>        print "Structure ", $struc->id,"\n";
>     }
> 
> It works well. But I am not able to find documentation of other methods
> which will give me various specific details available in a pdb file, right
> from title, keywords, references to structure details, atoms, coordinates
> etc. There must be different methods to fetch and parse each of this data
> from a pdb file, right? Where can I find the details?

$struc is a Bio::Structure::Entry, so look at the docs for that:
http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html

You'll probably want to look at the docs for the other Structure modules 
as well:
http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html


I agree, the documentation in this area could be improved. 
Bio::Structure::StructureI could actually contain something, and 
Bio::Structure should actually exist or not be referenced in the docs.


From ewijaya at gmail.com  Thu Aug 16 04:18:57 2007
From: ewijaya at gmail.com (Edward Wijaya)
Date: Thu, 16 Aug 2007 12:18:57 +0800
Subject: [Bioperl-l] How to create contrasting colors in every singe track -
	Bio::Graphics
Message-ID: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>

Dear experts,

I am trying to draw a figure that shows binding site hits for various
programs (see the attached example).

Now, I have a problem creating a contrasting colour for each of
the programs (MEME, AlignACE, etc).  I want to avoid "graded segments",
so that I can have more contrasting colours, e.g. red, blue, yellow, etc.

Can anybody suggest how can we achieve that?

My full source code can be found here: http://dpaste.com/16985/
The portion of the script is this:

__BEGIN__
    my %prog_color = (
        "Actual"   => 800000,
        "ALIGNACE" => 230000,
        "BP"       => 80000,
        "MDSCAN"   => 5000,
        "MITRA"    => 10000,
        "MTSAMP"   => 200000,
        "SPACE"    => 40000,
        "NONE"     => 0,
    );

    foreach my $seqid ( sort {$a <=> $b }keys %nlist ) {
        my $track = $panel->add_track(
            -glyph     => 'graded_segments',
            -key       => "SEQ " . $seqid,
            -connector => "dashed",
            -label     => 1,
            -fontcolor => 'red',
            -bgcolor   => 'blue',
            -bump      => +1,
            -height    => 8,
            -min_score => 0,
            -max_score => 500000
        );
# rest of the script
__END__

Regards,
Edward
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hits.png
Type: image/png
Size: 2509 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070816/31057225/attachment-0004.png>

From pratchusha.kamireddy at aamu.edu  Thu Aug 16 03:45:22 2007
From: pratchusha.kamireddy at aamu.edu (pratchusha kamireddy)
Date: Wed, 15 Aug 2007 22:45:22 -0500 (CDT)
Subject: [Bioperl-l] Request for Activeperl software
Message-ID: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>

Hello
  I am Pratchusha Kamireddy, a master's student at Alabama A&M University, working under Dr. Kantety in the Plant and Soil Science Department. I am a beginner learning Perl programming, and I need the ActivePerl software to run Perl programs. Can you help me with where to download this software, how to install it, and how to use it? I am eagerly waiting for your reply.
   Thanking you
   Pratchusha Kamireddy


From spiros at lokku.com  Thu Aug 16 13:32:05 2007
From: spiros at lokku.com (Spiros Denaxas)
Date: Thu, 16 Aug 2007 14:32:05 +0100
Subject: [Bioperl-l] Request for Activeperl software
In-Reply-To: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
References: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
Message-ID: <bba689ec0708160632w315b00d5na3bf55d97ac03728@mail.gmail.com>

Hi,

You can download ActivePerl from ActiveState's website at

http://www.activestate.com/Products/ActivePerl/

Get a book: http://www.oreilly.com/catalog/lperl3/

Visit:

http://perl-begin.org/
http://learn.perl.org/

Usenet:

http://www.nntp.perl.org/group/perl.beginners/

Spiros

On 8/16/07, pratchusha kamireddy <pratchusha.kamireddy at aamu.edu> wrote:
> Hello
>   I am Pratchusha Kamireddy doing masters in Alabama A&M University. I am working under Dr. Kantety in Plant and Soil Science Department. I am the beginner to learn perl programming. I need Activeperl software to run the perl programs. Can you help me in this regard like: where can I download this software, how can I install this and how can I use this. I am eagerly waiting for your reply. Please help me in this regard.
>    Thanking you
>    Pratchusha Kamireddy
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From razi.khaja at gmail.com  Thu Aug 16 13:37:09 2007
From: razi.khaja at gmail.com (Razi Khaja)
Date: Thu, 16 Aug 2007 09:37:09 -0400
Subject: [Bioperl-l] How to create contrasting colors in every singe
	track - Bio::Graphics
In-Reply-To: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
References: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
Message-ID: <62e9dabc0708160637o36380ecbv69fe479d0a26989d@mail.gmail.com>

You would probably want to consider a "Graph-Coloring" algorithm in
order to optimally pick contrasting colors for the features being
displayed.  This might be overkill for what you're trying to accomplish
and may not be possible (depending on how many features you have in
your dataset, i.e. how big your graph is).

In any case, some resources are:
http://en.wikipedia.org/wiki/Graph_coloring
http://web.cs.ualberta.ca/~joe/Coloring/

If your problem is simpler, see the modifications to your program I've
made below:

Razi Khaja

On 8/16/07, Edward Wijaya <ewijaya at gmail.com> wrote:
> Dear experts,
>
> I am trying to draw a figures that shows binding sites hits for various
> program (see attached) for example.
>
> Now, I have a problem in creating contrasting colour for each of
> the Programs (MEME, AlignACE, etc).  I want to avoid "graded segments",
> so that I can have more contrasting color, e.g: red, blue, yellow, etc.
>
> Can anybody suggest how can we achieve that?
>
> My full source code can be found here: http://dpaste.com/16985/
> The portion of the script is this:
>
> __BEGIN__
>     my %prog_color = (
>         "Actual"   => 800000,
>         "ALIGNACE" => 230000,
>         "BP"       => 80000,
>         "MDSCAN"   => 5000,
>         "MITRA"    => 10000,
>         "MTSAMP"   => 200000,
>         "SPACE"    => 40000,
>         "NONE"     => 0,
>     );
>
       my %color = ( 'MEME' => 'red', 'ALIGNACE' => 'blue' );

>     foreach my $seqid ( sort {$a <=> $b }keys %nlist ) {
           my( @field ) = split( /\s+/, $nlist{$seqid} );
           my $prog_name = $field[3];

>         my $track = $panel->add_track(
>             -glyph     => 'graded_segments',
>             -key       => "SEQ " . $seqid,
>             -connector => "dashed",
>             -label     => 1,
>             -fontcolor => 'red',
               -bgcolor   => $color{ $prog_name },
>             -bump      => +1,
>             -height    => 8,
>             -min_score => 0,
>             -max_score => 500000
>         );
> # rest of the script
> __END__
>
> Regards,
> Edward
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
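
A minimal sketch of the simpler approach, in plain Perl so it runs without
Bio::Graphics: keep a fixed palette of contrasting color names and hand
them out to program names in a stable order, falling back to gray when the
palette runs out. assign_colors and the palette are my own assumptions,
not part of Bio::Graphics, and I am assuming these color names are
recognised by your Bio::Graphics version. The resulting hash could then be
used for -bgcolor as in the modified script above.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Bio::Graphics function): map each program name
# to one color from a fixed palette of contrasting color names.
sub assign_colors {
    my @programs = @_;
    my @palette  = qw(red blue green orange purple cyan magenta brown);
    my %color;
    my $i = 0;
    for my $prog ( sort @programs ) {      # sorted for a stable mapping
        next if exists $color{$prog};      # ignore repeated names
        $color{$prog} = $i < @palette ? $palette[ $i++ ] : 'gray';
    }
    return %color;
}

my %color = assign_colors(qw(MEME ALIGNACE MDSCAN MITRA));
print "$_ => $color{$_}\n" for sort keys %color;
```

With more programs than palette entries the extra ones all share gray, so
the palette would need extending for larger sets.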


From bix at sendu.me.uk  Thu Aug 16 13:49:52 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 16 Aug 2007 14:49:52 +0100
Subject: [Bioperl-l] Request for Activeperl software
In-Reply-To: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
References: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
Message-ID: <46C45600.4040906@sendu.me.uk>

pratchusha kamireddy wrote:
> I am Pratchusha Kamireddy doing masters in Alabama A&M University. I
> am working under Dr. Kantety in Plant and Soil Science Department. I am
> the beginner to learn perl programming. I need Activeperl software to
> run the perl programs. Can you help me in this regard like: where can
> I download this software, how can I install this and how can I use
> this. I am eagerly waiting for your reply. Please help me in this
> regard.

Firstly, Google is your friend:
http://www.google.co.uk/search?q=activeperl

The first hit is the correct one:

http://www.activestate.com/Products/activeperl/


I suppose your next question will be how to install Bioperl (if not, 
you're in the wrong place):

http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows
(which also tells you where to get ActivePerl from)


From cjfields at uiuc.edu  Thu Aug 16 14:11:22 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 09:11:22 -0500
Subject: [Bioperl-l] How to create contrasting colors in every singe
	track - Bio::Graphics
In-Reply-To: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
References: <3521d3670708152118y415f512clc51046cd7ae8c11a@mail.gmail.com>
Message-ID: <F3E88224-4AA2-451B-97FE-5DED15015FA2@uiuc.edu>


On Aug 15, 2007, at 11:18 PM, Edward Wijaya wrote:

> Dear experts,
>
> I am trying to draw a figures that shows binding sites hits for  
> various
> program (see attached) for example.
>
> Now, I have a problem in creating contrasting colour for each of
> the Programs (MEME, AlignACE, etc).  I want to avoid "graded  
> segments",
> so that I can have more contrasting color, e.g: red, blue, yellow,  
> etc.
>
> Can anybody suggest how can we achieve that?
>
> My full source code can be found here: http://dpaste.com/16985/
> The portion of the script is this:
>
> __BEGIN__
>     my %prog_color = (
>         "Actual"   => 800000,
>         "ALIGNACE" => 230000,
>         "BP"       => 80000,
>         "MDSCAN"   => 5000,
>         "MITRA"    => 10000,
>         "MTSAMP"   => 200000,
>         "SPACE"    => 40000,
>         "NONE"     => 0,
>     );
>
>     foreach my $seqid ( sort {$a <=> $b }keys %nlist ) {
>         my $track = $panel->add_track(
>             -glyph     => 'graded_segments',
>             -key       => "SEQ " . $seqid,
>             -connector => "dashed",
>             -label     => 1,
>             -fontcolor => 'red',
>             -bgcolor   => 'blue',
>             -bump      => +1,
>             -height    => 8,
>             -min_score => 0,
>             -max_score => 500000
>         );
> # rest of the script
> __END__
>
> Regards,
> Edward

I think you have two options:

1) Split the seqfeatures into different tracks based on the source  
(AlignACE, MP, etc), then give each its own graded segment color.  I  
like this personally as it doesn't glob various results together onto  
one track and (at least to me) is easier to maintain.  It also allows  
one more flexibility in using varying scoring schemes.
2) Use a callback for bgcolor which changes the color explicitly  
based on the source/score.

The GenBank/EMBL section of the Bio::Graphics HOWTO reveals how to  
add different tracks, and there are several scattered examples on how  
to use callbacks.

http://www.bioperl.org/wiki/HOWTO:Graphics
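
Option 2 can be sketched roughly as follows. This is only an
illustration, not code taken from the HOWTO: it assumes each feature's
source tag holds the program name, the color choices are arbitrary, and
My::FakeFeature is a made-up stand-in so the snippet runs without
BioPerl installed. Note that the graded_segments glyph shades by score,
so a plain glyph such as 'segments' is probably a better fit when flat,
contrasting colors are wanted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One fixed, contrasting color per program. The colors are arbitrary
# placeholders; the keys are program names from the original script.
my %color_for = (
    MEME     => 'red',
    ALIGNACE => 'blue',
    MDSCAN   => 'yellow',
    MITRA    => 'green',
);

# Callback of the kind Bio::Graphics accepts for track options: it
# receives the feature being drawn as its first argument and returns a
# color name. Assumption: the program name is stored as the source tag.
sub bgcolor_for_feature {
    my $feature = shift;
    my $source  = uc $feature->source_tag;
    return exists $color_for{$source} ? $color_for{$source} : 'gray';
}

# Hypothetical stand-in for a real feature object (for example a
# Bio::SeqFeature::Generic), used only to exercise the callback here.
package My::FakeFeature;
sub new        { my ( $class, %args ) = @_; return bless {%args}, $class }
sub source_tag { return $_[0]{source} }

package main;

for my $prog (qw(MEME AlignACE Unknown)) {
    my $feature = My::FakeFeature->new( source => $prog );
    printf "%-8s => %s\n", $prog, bgcolor_for_feature($feature);
}

# With BioPerl the callback would be handed to the track, e.g.:
#   $panel->add_track( $features,
#       -glyph   => 'segments',              # plain glyph, no score grading
#       -bgcolor => \&bgcolor_for_feature,
#   );
```

Keeping the program-to-color mapping in one hash means a new program
only needs one extra line, and unknown sources fall back to gray
instead of failing.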

chris


From cjfields at uiuc.edu  Thu Aug 16 14:12:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 09:12:30 -0500
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C41FEC.2000206@sendu.me.uk>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
Message-ID: <5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>


On Aug 16, 2007, at 4:59 AM, Sendu Bala wrote:

> neeti somaiya wrote:
>> I tried using Bio::Structure::IO::pdb with some code like :-
>> use Bio::Structure::IO;
>>
>>     $in  = Bio::Structure::IO->new(-file => "pdb100d.ent",
>>                                    -format => 'pdb');
>>
>>     while ( my $struc = $in->next_structure() ) {
>>        print "Structure ", $struc->id,"\n";
>>     }
>>
>> It works well. But I am not able to find documentation of other  
>> methods
>> which will give me various specific details available in a pdb  
>> file, right
>> from title, keywords, references to structure details, atoms,  
>> coordinates
>> etc. There must be different methods to fetch and parse each of  
>> this data
>> from a pdb file, right? Where can I find the details?
>
> $struct is a Bio::Structure::Entry, so look at the docs for that:
> http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html
>
> You'll probably want to look at the docs for the other Structure  
> modules
> as well:
> http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
>
>
> I agree, the documentation in this area could be improved.
> Bio::Structure::StructureI could actually contain something, and
> Bio::Structure should actually exist or not be referenced in the docs.

There was a discussion a while back on refactoring the code within  
Bio::Structure to better deal with HETATM and other stuff.  As far as  
I'm concerned it's open for anyone who wants to tinker with it.

chris


From cjfields at uiuc.edu  Thu Aug 16 14:37:31 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 09:37:31 -0500
Subject: [Bioperl-l] Announcement: infernal/erpin/rnamotif parsers
Message-ID: <7CE60504-FA1A-4AFF-A02E-036B8E37C3F9@uiuc.edu>

To anyone using the aforementioned parsers:

I don't plan on continuing development of the Bio::Tools-related  
Infernal, RNAMotif, and ERPIN parsers at this time unless there is  
substantial interest in doing so.  Instead, I plan on focusing my  
efforts on the Bio::SearchIO-based parsers as I feel they are much  
better at representing the data present in the output.  In my opinion  
having two sets of parsers that accomplish essentially the same task  
is redundant and non-productive.  Again, if there is considerable  
interest in keeping them I suggest responding to this message;  
otherwise I will consider them deprecated and remove them completely by  
rel 1.7 (maybe sooner).

Infernal: It's very likely that a new stable version (v. 1.0) of  
Infernal will be released in the near future.  I may upgrade the  
Bio::SearchIO-based parser in the meantime to parse the latest  
Infernal output (v 0.81), but I don't plan on supporting pre-1.0  
releases once the final version is out.  Infernal has been in  
developer release for some time now and the program output has  
changed dramatically over time; however, the format is expected to  
solidify once a stable release is made, which makes supporting the  
parser much easier over time.

Questions?  Gripes?

chris


From awitney at sgul.ac.uk  Thu Aug 16 14:07:02 2007
From: awitney at sgul.ac.uk (Adam Witney)
Date: Thu, 16 Aug 2007 15:07:02 +0100
Subject: [Bioperl-l] Request for Activeperl software
In-Reply-To: <32393254.1187235922749.JavaMail.oracle@my.aamu.edu>
Message-ID: <C2EA1896.17575%awitney@sgul.ac.uk>


This would be the best place to start

http://www.activestate.com/

Or more specifically for the language:

http://www.activestate.com/store/activeperl/download/

(Which will require you to register with them)

adam


On 16/8/07 04:45, "pratchusha kamireddy" <pratchusha.kamireddy at aamu.edu>
wrote:

> Hello
>   I am Pratchusha Kamireddy, doing a masters at Alabama A&M University. I am
> working under Dr. Kantety in the Plant and Soil Science Department. I am a
> beginner learning perl programming. I need Activeperl software to run perl
> programs. Can you help me in this regard: where can I download this
> software, how can I install it and how can I use it? I am eagerly waiting
> for your reply. Please help me in this regard.
>    Thanking you
>    Pratchusha Kamireddy


From muratem at eng.uah.edu  Thu Aug 16 19:10:34 2007
From: muratem at eng.uah.edu (muratem at eng.uah.edu)
Date: Thu, 16 Aug 2007 14:10:34 -0500 (CDT)
Subject: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
Message-ID: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>

Hello

This might not be the correct list for this particular problem, but
hopefully someone can help. I am trying to install ...staden::read on a
Mac OS X 10.4. I tried installing cpan but it wouldn't work so I went to
the manual methods. Perl is on the system and appears to be installed
correctly for a Mac. Bioperl 1.5.2 was installed via fink and appears to
be OK also. I'm trying to install the Bio::SeqIO::staden::read module. I
downloaded the bioperl-ext-1.5.1 tarball from bioperl.org, did the usual
perl Makefile.PL and make and get:

newyork:/usr/local/bioperl-ext-1.5.1 root# make
Makefile:1148: *** multiple target patterns.  Stop.

A snippet from the Makefile...

   1148 pm_to_blib: $(TO_INST_PM)
   1149         $(NOECHO) $(PERLRUN) -MExtUtils::Install -e
'pm_to_blib({@ARGV}, '\''$(INST_LIB)/auto'\'', '\''$(PM_FILTER)'\'')'\
   1150           Bio/Ext/Align/libs/hscore.h
$(INST_LIB)/Bio/Ext/Align/libs/hscore.h \
   1151           Bio/Ext/Align/libs/probability.c
$(INST_LIB)/Bio/Ext/Align/libs/probability.c \
   1152           Bio/Ext/Align/libs/linesubs.h
$(INST_LIB)/Bio/Ext/Align/libs/linesubs.h \
   1153           Bio/Ext/Align/test.pl $(INST_LIB)/Bio/Ext/Align/test.pl \
   1154           Bio/Ext/Align/libs/wiseoverlay.h
$(INST_LIB)/Bio/Ext/Align/libs/wiseoverlay.h \
   1155           Bio/Ext/Align/libs/proteinsw.h
$(INST_LIB)/Bio/Ext/Align/libs/proteinsw.h \
   1156           Bio/Ext/Align/libs/wisebase.h
$(INST_LIB)/Bio/Ext/Align/libs/wisebase.h \
   1157           Bio/Ext/Align/libs/seqaligndisplay.h
$(INST_LIB)/Bio/Ext/Align/libs/seqaligndisplay.h \
   1158           Bio/Ext/Align/libs/dyna.h
$(INST_LIB)/Bio/Ext/Align/libs/dyna.h \

The README says you don't have to build the whole package, so I descended
to the staden directory and ran make there, which reported no problems.
But when I ran make test I got:

newyork:/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden root# make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
"test_harness(0, '../blib/lib', '../blib/arch')" test.pl
test....Had problems bootstrapping Inline module 'Bio::SeqIO::staden::read'

Can't load
'/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/SeqIO/staden/read/read.bundle'
for module Bio::SeqIO::staden::read:
dlopen(/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/SeqIO/staden/read/read.bundle,
2): Symbol not found: _curl_easy_init
  Referenced from:
/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/SeqIO/staden/read/read.bundle
  Expected in: dynamic lookup
 at /Library/Perl/5.8.6/Inline.pm line 500


 at test.pl line 0
INIT failed--call queue aborted, <DATA> line 1.
test....dubious
        Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 1-94
        Failed 94/94 tests, 0.00% okay
Failed Test Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
test.pl      255 65280    94  188 200.00%  1-94
Failed 1/1 test scripts, 0.00% okay. 94/94 subtests failed, 0.00% okay.
make: *** [test_dynamic] Error 2

The missing symbol is apparently from libcurl. I have both libcurl.2.dylib
and libcurl.3.dylib with copies in multiple locations including /usr/lib,
/usr/local/lib and the usual Mac directories. I used the Mac otool to look
at the externals in read.bundle and it references libz.1.dylib and
libSystem.B.dylib. Could this be a case where there should have been a
link to libcurl and wasn't?

I've searched the list and see only the Inline versioning problem (which I
had and fixed). Has anybody seen this problem before or built the module
on a Mac? How did you do it? Is this a question for the Staden list on
sourceforge?

Thanks

Mike


From cjfields at uiuc.edu  Thu Aug 16 19:55:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 16 Aug 2007 14:55:05 -0500
Subject: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
In-Reply-To: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>
References: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>
Message-ID: <9BBC30AD-9AFE-4D52-88E4-656D9EB8924E@uiuc.edu>


On Aug 16, 2007, at 2:10 PM, muratem at eng.uah.edu wrote:

> Hello
>
> This might not be the correct list for this particular problem, but
> hopefully someone can help. I am trying to install ...staden::read  
> on a
> Mac OS X 10.4. I tried installing cpan but it wouldn't work so I  
> went to
> the manual methods. Perl is on the system and appears to be installed
> correctly for a Mac. Bioperl 1.5.2 was installed via fink and  
> appears to
> be OK also. I'm trying to install the Bio::SeqIO::staden::read  
> module. I
> downloaded the bioperl-ext-1.5.1 tarball from bioperl.org, did the  
> usual
> perl Makefile.PL and make and get:
>
> newyork:/usr/local/bioperl-ext-1.5.1 root# make
> Makefile:1148: *** multiple target patterns.  Stop.
>
> A snippet from the Makefile...
>
>    1148 pm_to_blib: $(TO_INST_PM)
>    1149         $(NOECHO) $(PERLRUN) -MExtUtils::Install -e
> 'pm_to_blib({@ARGV}, '\''$(INST_LIB)/auto'\'', '\''$(PM_FILTER)'\'')'\
>    1150           Bio/Ext/Align/libs/hscore.h
> $(INST_LIB)/Bio/Ext/Align/libs/hscore.h \
>    1151           Bio/Ext/Align/libs/probability.c
> $(INST_LIB)/Bio/Ext/Align/libs/probability.c \
>    1152           Bio/Ext/Align/libs/linesubs.h
> $(INST_LIB)/Bio/Ext/Align/libs/linesubs.h \
>    1153           Bio/Ext/Align/test.pl $(INST_LIB)/Bio/Ext/Align/ 
> test.pl \
>    1154           Bio/Ext/Align/libs/wiseoverlay.h
> $(INST_LIB)/Bio/Ext/Align/libs/wiseoverlay.h \
>    1155           Bio/Ext/Align/libs/proteinsw.h
> $(INST_LIB)/Bio/Ext/Align/libs/proteinsw.h \
>    1156           Bio/Ext/Align/libs/wisebase.h
> $(INST_LIB)/Bio/Ext/Align/libs/wisebase.h \
>    1157           Bio/Ext/Align/libs/seqaligndisplay.h
> $(INST_LIB)/Bio/Ext/Align/libs/seqaligndisplay.h \
>    1158           Bio/Ext/Align/libs/dyna.h
> $(INST_LIB)/Bio/Ext/Align/libs/dyna.h \
>
> The README says you don't have to build the whole package, so I  
> descended
> to the staden directory and did a Make and didn't get any problems
> reported. But when I did a make test I get:
>
> newyork:/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden root# make test
> PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
> "test_harness(0, '../blib/lib', '../blib/arch')" test.pl
> test....Had problems bootstrapping Inline module  
> 'Bio::SeqIO::staden::read'
>
> Can't load
> '/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/ 
> Bio/SeqIO/staden/read/read.bundle'
> for module Bio::SeqIO::staden::read:
> dlopen(/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/ 
> auto/Bio/SeqIO/staden/read/read.bundle,
> 2): Symbol not found: _curl_easy_init
>   Referenced from:
> /usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/ 
> SeqIO/staden/read/read.bundle
>   Expected in: dynamic lookup
>  at /Library/Perl/5.8.6/Inline.pm line 500
>
>
>  at test.pl line 0
> INIT failed--call queue aborted, <DATA> line 1.
> test....dubious
>         Test returned status 255 (wstat 65280, 0xff00)
> DIED. FAILED tests 1-94
>         Failed 94/94 tests, 0.00% okay
> Failed Test Stat Wstat Total Fail  Failed  List of Failed
> ---------------------------------------------------------------------- 
> ---------
> test.pl      255 65280    94  188 200.00%  1-94
> Failed 1/1 test scripts, 0.00% okay. 94/94 subtests failed, 0.00%  
> okay.
> make: *** [test_dynamic] Error 2
>
> The missing symbol is apparently from libcurl. I have both libcurl. 
> 2.dylib
> and libcurl.3.dylib with copies in multiple locations including / 
> usr/lib,
> /usr/local/lib and the usual Mac directories. I used the Mac otool  
> to look
> at the externals in read.bundle and it references libz.1.dylib and
> libSystem.B.dylib. Could this be a case where there should have been a
> link to libcurl and wasn't?
>
> I've searched the list and see only the Inline versioning problem  
> (which I
> had and fixed). Has anybody seen this problem before or built the  
> module
> on a Mac? How did you do it? Is this a question for the Staden list on
> sourceforge?
>
> Thanks
>
> Mike

Haven't seen the problem you list.  I have installed it on Mac OS X  
(intel) w/o problems so I know it works; at least all tests passed  
though I remember Inline complaining for some reason.

You should try using bioperl-ext from CVS (it is really 1.5.1 but  
with updated docs and maybe a change or two).  The process is a  
little tricky but is documented in the README in the package.  You'll  
need the old io_lib (1.8.12 or earlier) from Staden if memory serves.

chris


From zhaodj at ioz.ac.cn  Fri Aug 17 02:13:16 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Fri, 17 Aug 2007 10:13:16 +0800 (CST)
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
Message-ID: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>

Dear list members,

I have a question about the methods of bioperl objects: how and
where can we find the complete list of methods of a bioperl object?

Take Bio::Tools::Run::RemoteBlast for example. The synopsis of this
module gives some sample code. The following five statements are
excerpted from it.
(1)my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
(2)while ( my @rids = $factory->each_rid ) {
(3)$factory->remove_rid($rid);
(4)my $rc = $factory->retrieve_blast($rid);
(5)my $r = $factory->submit_blast($input);

The five statements use five methods of the RemoteBlast object, i.e.
(1) new, (2) each_rid, (3) remove_rid, (4) retrieve_blast, and
(5) submit_blast. However, I find that only some of them (4 and 5) are
listed in the appendix, while the others (1, 2 and 3) are absent. Are
there more methods that are not explicitly declared? I don't know.
This leads to only a partial understanding and use of the module, so I
am asking here how to get the full list of methods of a bioperl object.

Thanks!
-- 
De-Jian Zhao
Institute of Zoology,Chinese Academy of Sciences
+86-10-64807217
zhaodj at ioz.ac.cn


From neetisomaiya at gmail.com  Fri Aug 17 06:23:08 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 17 Aug 2007 11:53:08 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
Message-ID: <764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>

Hi,

My main concern is just the pdb id and title. PDB id I am able to fetch
easily, but is there a method which can give me the title of the PDB
structure?

Like for example from the following :-

HEADER    DNA/RNA                                 05-DEC-94   100D
TITLE     CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
TITLE    2 R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO
TITLE    3 PHOSPHATE ONLY AND MINOR GROOVE TERTIARY BASE-PAIRING
COMPND    MOL_ID: 1;
COMPND   2 MOLECULE: DNA/RNA (5'-R(*CP*)-D(*CP*GP*GP*CP*GP*CP*CP*GP*)-
COMPND   3 R(*G)-3');
COMPND   4 CHAIN: A, B;
.
.
.
.

I just want "CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO PHOSPHATE ONLY AND
MINOR GROOVE TERTIARY BASE-PAIRING".
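
If only the title is needed, one fallback that does not depend on the
Bio::Structure annotation methods is to read the TITLE records
directly. The following is a minimal sketch in plain Perl, assuming the
usual PDB column layout (record name in columns 1-6, continuation
number in columns 9-10, title text from column 11 onwards). The
pdb_title function is a hypothetical helper, not part of BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect the text of all TITLE records from an open filehandle and
# join them into one string with single spaces between words.
sub pdb_title {
    my ($fh) = @_;
    my @parts;
    while ( my $line = <$fh> ) {
        next unless $line =~ /^TITLE /;
        chomp $line;
        # Skip the record name and continuation number (first 10 columns).
        push @parts, length($line) > 10 ? substr( $line, 10 ) : '';
    }
    my $title = join ' ', @parts;
    $title =~ s/\s+/ /g;          # collapse padding and line joins
    $title =~ s/^\s+|\s+$//g;     # trim both ends
    return $title;
}

# Sample records copied from the entry above, read from memory so the
# sketch runs without a PDB file on disk.
my $sample = <<'END_PDB';
HEADER    DNA/RNA                                 05-DEC-94   100D
TITLE     CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
TITLE    2 R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO
TITLE    3 PHOSPHATE ONLY AND MINOR GROOVE TERTIARY BASE-PAIRING
COMPND    MOL_ID: 1;
END_PDB

open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
print pdb_title($fh), "\n";
close $fh;
```

For a real file, the in-memory handle would be replaced by an ordinary
open on the .ent file. A title whose words are hyphenated across
continuation lines would need extra handling that this sketch leaves
out.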

Thanks,
Neeti.

On 8/16/07, Chris Fields <cjfields at uiuc.edu> wrote:
>
>
> On Aug 16, 2007, at 4:59 AM, Sendu Bala wrote:
>
> > neeti somaiya wrote:
> >> I tried using Bio::Structure::IO::pdb with some code like :-
> >> use Bio::Structure::IO;
> >>
> >>     $in  = Bio::Structure::IO->new(-file => " pdb100d.ent",
> >>                                    -format => 'pdb');
> >>
> >>     while ( my $struc = $in->next_structure() ) {
> >>        print "Structure ", $struc->id,"\n";
> >>     }
> >>
> >> It works well. But I am not able to find documentation of other
> >> methods
> >> which will give me various specific details available in a pdb
> >> file, right
> >> from title, keywords, references to structure details, atoms,
> >> coordinates
> >> etc. There must be different methods to fetch and parse each of
> >> this data
> >> from a pdb file, right? Where can I find the details?
> >
> > $struct is a Bio::Structure::Entry, so look at the docs for that:
> > http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html
> >
> > You'll probably want to look at the docs for the other Structure
> > modules
> > as well:
> > http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
> >
> >
> > I agree, the documentation in this area could be improved.
> > Bio::Structure::StructureI could actually contain something, and
> > Bio::Structure should actually exist or not be referenced in the docs.
>
> There was a discussion a while back on refactoring the code within
> Bio::Structure to better deal with HETATM and other stuff.  As far as
> I'm concerned it's open for anyone wanted to tinker with it.
>
> chris
>


-- 
-Neeti
Even my blood says, B positive


From alexl at users.sourceforge.net  Fri Aug 17 07:22:16 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Fri, 17 Aug 2007 00:22:16 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
Message-ID: <cg3ayi39sn.fsf@allele2.localdomain>

Hi all,

I'd like to clarify the license of bioperl.  Currently the LICENSE
file only includes the text of the Artistic license.  But the wiki
http://www.bioperl.org/wiki/FAQ#What_are_the_license_terms_for_BioPerl.3F
says:

 BioPerl is licensed under the same terms as Perl itself which is the
 Perl Artistic License (see
 http://www.perl.com/pub/a/language/misc/Artistic.html or
 http://www.opensource.org/licenses/artistic-license.html)

and most of the modules in the source say:

 "You may distribute this module under the same terms as perl itself"

But the current distribution of Perl is actually dually-licensed under
the GPL or Artistic licenses (so the wiki is technically out of sync
with the "same terms as Perl itself"), see:

 http://dev.perl.org/licenses/

I assume that the intent of the bioperl authors is to license with the
same terms as Perl's *current* license (which would mean bioperl is
really effectively dually-licensed under the GPL or Artistic license).
If so, it would be good if the LICENSE text and the wiki were updated
to reflect this.

Also some of the source modules say "under the same terms as perl
itself", but then only mention the Artistic license.

This has important ramifications for distribution: I maintain the
Fedora package for bioperl and I have currently listed the license of
bioperl as "GPL or Artistic".  But if bioperl were distributed under
the Artistic license only then I would have to pull the package from
the distribution, because the Artistic 1.0 (original)-only license is
deprecated (but "GPL or Artistic" is OK):

http://fedoraproject.org/wiki/Licensing#head-d8cc605dd386091c8b6be97b8a43fb6a5d624ae1

Thanks!

Alex


From alexl at users.sourceforge.net  Fri Aug 17 07:42:07 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Fri, 17 Aug 2007 00:42:07 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <cg3ayi39sn.fsf@allele2.localdomain> (Alex Lancaster's message of
	"Fri\, 17 Aug 2007 00\:22\:16 -0700")
References: <cg3ayi39sn.fsf@allele2.localdomain>
Message-ID: <nrsl6i1ub4.fsf@allele2.localdomain>

>>>>> "AL" == Alex Lancaster  writes:

[...]

AL> I assume that the intent of the bioperl authors is to license with
AL> the same terms as Perl's *current* license (which would mean
AL> bioperl is really effectively dually-licensed under the GPL or
AL> Artistic license).  If so, it would be good if the LICENSE text
AL> and the wiki were updated to reflect this.

Also note that since Perl's license is a dual-license "GPL or
Artistic" then people aren't required to submit their modifications
back to the bioperl distribution because they can choose to follow the
Artistic (rather than the GPL) license which doesn't require
modifications to be submitted back.  This means the point:

 "If you fix bugs, please let us know about them. This is not the GPL
 license so you are not required to submit the code fixes, but in the
 spirit of making a better product we hope you'll contribute back to
 the community any insight or code improvements."

listed here:

 http://www.bioperl.org/wiki/Licensing_BioPerl

would still stand, because you can choose the Artistic license, but
you could modify the clause to say:

 "If you fix bugs, please let us know about them. Because Bioperl is
 dual-licensed under the GPL or Artistic licenses, you can choose the
 Artistic license, which means that you are not required to submit the
 code fixes, but in the spirit of making a better product we hope
 you'll contribute back to the community any insight or code
 improvements."


From n.haigh at sheffield.ac.uk  Fri Aug 17 10:27:43 2007
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 17 Aug 2007 11:27:43 +0100
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
In-Reply-To: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
References: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
Message-ID: <46C5781F.60301@sheffield.ac.uk>

De-Jian,ZHAO wrote:
> Dear list members,
>
> I have a question about the methods of bioperl objects.It is how and
> where we can get the whole methods of a bioperl object.
>
> Take Bio::Tools::Run::RemoteBlast for example. In the synopsis of
> this object, some sample codes are given.The following five clauses
> are excerpted from the synopsis.
> (1)my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> (2)while ( my @rids = $factory->each_rid ) {
> (3)$factory->remove_rid($rid);
> (4)my $rc = $factory->retrieve_blast($rid);
> (5)my $r = $factory->submit_blast($input);
>
> The five clauses use five methods of the RemoteBlast object,i.e.
> (1)new, (2)each_rid, (3)remove_rid,(4)retrieve_blast,and
> (5)submit_blast. However,I only find part of them(45) are listed in
> the appendix while others(123) are absent. Are there some more
> methods not explictly declared? I don't know.This will lead to the
> partial understanding and utilization of the module.Therefore I come
> here for the way to get the full methods of a bioperl object.
>
> Thanks!
>   


You should check out the Deobfuscator at:
http://bioperl.org/cgi-bin/deob_interface.cgi

Search and choose the object of choice. e.g. Bio::Tools::Run::RemoteBlast

You will be provided a list of methods available to that object,
including all the methods up the inheritance hierarchy. Unfortunately,
some bioperl modules are documented more thoroughly than others.
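
For a quick check from the command line, roughly the same information
can be pulled out of Perl itself by walking the inheritance chain and
looking at each package's symbol table. The sketch below is plain Perl
and is not how the Deobfuscator works internally; all_methods and the
two toy classes are made up for illustration, and the mro module it
relies on ships with Perl 5.10 and later.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use mro;    # provides mro::get_linear_isa (core in Perl 5.10+)

# Return a hash mapping each method name to the class that supplies it,
# looking at the class itself and everything it inherits from.
# Caveat: functions imported into a package are listed as well.
sub all_methods {
    my $class = shift;
    my %found;
    for my $pkg ( @{ mro::get_linear_isa($class) } ) {
        no strict 'refs';
        for my $name ( keys %{"${pkg}::"} ) {
            # Keep only symbol-table entries that hold a sub, and let
            # the most-derived class win when a method is overridden.
            next unless defined &{"${pkg}::${name}"};
            $found{$name} = $pkg unless exists $found{$name};
        }
    }
    return %found;
}

# Two toy classes so the sketch runs without BioPerl installed.
package My::Base;
sub new   { return bless {}, shift }
sub fetch { return 'base fetch' }

package My::Derived;
our @ISA = ('My::Base');
sub submit { return 'submitted' }

package main;

my %methods = all_methods('My::Derived');
for my $name ( sort keys %methods ) {
    print "$name\t(from $methods{$name})\n";
}

# For a real module, load it first and pass its name, e.g.:
#   require Bio::Tools::Run::RemoteBlast;
#   my %m = all_methods('Bio::Tools::Run::RemoteBlast');
```

This only reports which methods exist and where they are defined; it
says nothing about arguments or return values, so the POD of each
listed class is still the place to look for details.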

Nath


From neetisomaiya at gmail.com  Fri Aug 17 10:42:09 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 17 Aug 2007 16:12:09 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
	<764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
Message-ID: <764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>

Hi,

I have done it currently as follows :

while ( my $struc = $in->next_structure() )
{
    my $title;

    my $pdb_id = $struc->id;
    print "Structure ", $pdb_id, "\n";

    my $ac = $struc->annotation();

    foreach my $key ( $ac->get_all_annotation_keys() )
    {
        if ($key eq "title")
        {
            my @values = $ac->get_Annotations($key);
            foreach my $value (@values)
            {
                $title = $value->as_text;
                chomp($title);
                if ($title =~ /Value\: (.*)/)
                {
                    $title = $1;
                }
                $title =~ s/\s+/ /g;

                print "Title ", $title, "\n";
                last;
            }
            last;
        }
    }
}

Is this ok?

On 8/17/07, neeti somaiya <neetisomaiya at gmail.com> wrote:
>
> Hi,
>
> My main concern is just the pdb id and title. PDB id I am able to fetch
> easily, but is there a method which can give me the title of the PDB
> structure?
>
> Like for example from the following :-
>
> HEADER    DNA/RNA                                 05-DEC-94   100D
> TITLE     CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
> TITLE    2 R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO
> TITLE    3 PHOSPHATE ONLY AND MINOR GROOVE TERTIARY BASE-PAIRING
> COMPND    MOL_ID: 1;
> COMPND   2 MOLECULE: DNA/RNA (5'-R(*CP*)-D(*CP*GP*GP*CP*GP*CP*CP*GP*)-
> COMPND   3 R(*G)-3');
> COMPND   4 CHAIN: A, B;
> .
> .
> .
> .
>
> I just want "CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER
> R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO PHOSPHATE ONLY AND
> MINOR GROOVE TERTIARY BASE-PAIRING".
>
> Thanks,
> Neeti.
>
> On 8/16/07, Chris Fields <cjfields at uiuc.edu> wrote:
> >
> >
> > On Aug 16, 2007, at 4:59 AM, Sendu Bala wrote:
> >
> > > neeti somaiya wrote:
> > >> I tried using Bio::Structure::IO::pdb with some code like :-
> > >> use Bio::Structure::IO;
> > >>
> > >>     $in  = Bio::Structure::IO->new(-file => " pdb100d.ent",
> > >>                                    -format => 'pdb');
> > >>
> > >>     while ( my $struc = $in->next_structure() ) {
> > >>        print "Structure ", $struc->id,"\n";
> > >>     }
> > >>
> > >> It works well. But I am not able to find documentation of other
> > >> methods
> > >> which will give me various specific details available in a pdb
> > >> file, right
> > >> from title, keywords, references to structure details, atoms,
> > >> coordinates
> > >> etc. There must be different methods to fetch and parse each of
> > >> this data
> > >> from a pdb file, right? Where can I find the details?
> > >
> > > $struct is a Bio::Structure::Entry, so look at the docs for that:
> > > http://doc.bioperl.org/bioperl-live/Bio/Structure/Entry.html
> > >
> > > You'll probably want to look at the docs for the other Structure
> > > modules
> > > as well:
> > > http://doc.bioperl.org/bioperl-live/Bio/Structure/modules.html
> > >
> > >
> > > I agree, the documentation in this area could be improved.
> > > Bio::Structure::StructureI could actually contain something, and
> > > Bio::Structure should actually exist or not be referenced in the docs.
> >
> >
> > There was a discussion a while back on refactoring the code within
> > Bio::Structure to better deal with HETATM and other stuff.  As far as
> > I'm concerned it's open for anyone wanted to tinker with it.
> >
> > chris
> >
>
>
>
> --
> -Neeti
> Even my blood says, B positive
>


-- 
-Neeti
Even my blood says, B positive


From n.haigh at sheffield.ac.uk  Fri Aug 17 10:27:43 2007
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Fri, 17 Aug 2007 11:27:43 +0100
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
In-Reply-To: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
References: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
Message-ID: <46C5781F.60301@sheffield.ac.uk>

De-Jian,ZHAO wrote:
> Dear list members,
>
> I have a question about the methods of bioperl objects.It is how and
> where we can get the whole methods of a bioperl object.
>
> Take Bio::Tools::Run::RemoteBlast for example. In the synopsis of
> this object, some sample codes are given.The following five clauses
> are excerpted from the synopsis.
> (1)my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> (2)while ( my @rids = $factory->each_rid ) {
> (3)$factory->remove_rid($rid);
> (4)my $rc = $factory->retrieve_blast($rid);
> (5)my $r = $factory->submit_blast($input);
>
> The five clauses use five methods of the RemoteBlast object,i.e.
> (1)new, (2)each_rid, (3)remove_rid,(4)retrieve_blast,and
> (5)submit_blast. However,I only find part of them(45) are listed in
> the appendix while others(123) are absent. Are there some more
> methods not explictly declared? I don't know.This will lead to the
> partial understanding and utilization of the module.Therefore I come
> here for the way to get the full methods of a bioperl object.
>
> Thanks!
>   


You should check out the Deobfuscator at:
http://bioperl.org/cgi-bin/deob_interface.cgi

Search for and select the class you are interested in, e.g. Bio::Tools::Run::RemoteBlast

You will be provided a list of methods available to that object,
including all the methods up the inheritance hierarchy. Unfortunately,
some bioperl modules are documented more thoroughly than others.

Nath
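
For anyone who wants the same kind of listing from a script, here is a
minimal sketch of the idea: walk the symbol table of a class and then
its @ISA parents. It is not how the Deobfuscator itself is implemented;
list_methods and the My::* toy classes are hypothetical names used only
for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return "Class::method" names that an object or
# class can respond to, walking @ISA by hand (depth-first).
sub list_methods {
    my ($class, $seen) = @_;
    $class = ref($class) || $class;
    $seen ||= {};
    return () if $seen->{$class}++;    # visit each class only once
    no strict 'refs';
    my @found;
    for my $name (sort keys %{"${class}::"}) {
        # Keep only symbol-table entries that hold a subroutine.
        push @found, "${class}::$name" if defined &{"${class}::$name"};
    }
    push @found, list_methods($_, $seen) for @{"${class}::ISA"};
    return @found;
}

# Toy classes standing in for a BioPerl object and its parent.
package My::Base;    sub new { bless {}, shift }  sub describe { 'base' }
package My::Derived; our @ISA = ('My::Base');      sub run { 1 }
package main;

print "$_\n" for list_methods(My::Derived->new);
```

Passing a real object, e.g. the $factory from the synopsis, should work
the same way, although imported helper functions will show up alongside
true methods.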


From bix at sendu.me.uk  Fri Aug 17 13:35:01 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 17 Aug 2007 14:35:01 +0100
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	
	<46C41FEC.2000206@sendu.me.uk>	
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>	
	<764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
	<764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>
Message-ID: <46C5A405.2070005@sendu.me.uk>

neeti somaiya wrote:
> Hi,
> 
> I have done it currently as follows :
[snip]
> Is this ok?

If it works, of course. There seems to be some redundant code there, 
however. I'm guessing this would be better (assuming your code worked in 
the first place):

while (my $struc = $in->next_structure()) {
     my $pdb_id = $struc->id;
     print "Structure ", $pdb_id,"\n";

     my $ac = $struc->annotation();
     my ($title) = $ac->get_Annotations('title');
     $title = $title->as_text;
     chomp($title);
     if ($title =~ /Value\: (.*)/) {
         $title = $1;
     }
     $title =~ s/\s+/ /g;

     print "Title ",$title,"\n";
}
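
For comparison, a minimal sketch that reads the title straight from the
TITLE records, without Bio::Structure. It assumes the standard PDB
column layout (record name in columns 1-6, continuation number in
columns 9-10, text from column 11); pdb_title is a hypothetical helper,
not a BioPerl method.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): collect the TITLE records
# from the lines of a PDB file and join them into one string.
sub pdb_title {
    my @lines = @_;
    my $title = '';
    for my $line (@lines) {
        next unless $line =~ /^TITLE /;
        chomp(my $rec = $line);
        # The title text starts at column 11 (offset 10); continuation
        # records begin with a space there, which separates the pieces.
        $title .= substr($rec, 10) if length($rec) > 10;
    }
    $title =~ s/\s+/ /g;         # collapse runs of whitespace
    $title =~ s/^\s+|\s+$//g;    # trim both ends
    return $title;
}

# Records taken from the 100D example earlier in this thread.
my @records = (
    "HEADER    DNA/RNA                                 05-DEC-94   100D\n",
    "TITLE     CRYSTAL STRUCTURE OF THE HIGHLY DISTORTED CHIMERIC DECAMER\n",
    "TITLE    2 R(C)D(CGGCGCCG)R(G)-SPERMINE COMPLEX-SPERMINE BINDING TO\n",
    "TITLE    3 PHOSPHATE ONLY AND MINOR GROOVE TERTIARY BASE-PAIRING\n",
    "COMPND    MOL_ID: 1;\n",
);
print pdb_title(@records), "\n";
```

With a real file you would pass the lines read from the filehandle
instead of @records.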


From muratem at eng.uah.edu  Fri Aug 17 14:03:22 2007
From: muratem at eng.uah.edu (Mike Muratet)
Date: Fri, 17 Aug 2007 09:03:22 -0500 (CDT)
Subject: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
In-Reply-To: <9BBC30AD-9AFE-4D52-88E4-656D9EB8924E@uiuc.edu>
References: <27981.69.147.139.126.1187291434.squirrel@webmail.eng.uah.edu>
	<9BBC30AD-9AFE-4D52-88E4-656D9EB8924E@uiuc.edu>
Message-ID: <Pine.GSO.4.60.0708170902570.23859@eng.uah.edu>


On Thu, 16 Aug 2007, Chris Fields wrote:

> Date: Thu, 16 Aug 2007 14:55:05 -0500
> From: Chris Fields <cjfields at uiuc.edu>
> To: muratem at eng.uah.edu
> Cc: bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] Problem with Bio::SeqIO::staden::read on Mac OS X
> 
>
> On Aug 16, 2007, at 2:10 PM, muratem at eng.uah.edu wrote:
>
>> Hello
>> 
>> This might not be the correct list for this particular problem, but
>> hopefully someone can help. I am trying to install ...staden::read on a
>> Mac OS X 10.4. I tried installing cpan but it wouldn't work so I went to
>> the manual methods. Perl is on the system and appears to be installed
>> correctly for a Mac. Bioperl 1.5.2 was installed via fink and appears to
>> be OK also. I'm trying to install the Bio::SeqIO::staden::read module. I
>> downloaded the bioperl-ext-1.5.1 tarball from bioperl.org, did the usual
>> perl Makefile.PL and make and get:
>> 
>> newyork:/usr/local/bioperl-ext-1.5.1 root# make
>> Makefile:1148: *** multiple target patterns.  Stop.
>> 
>> A snippet from the Makefile...
>> 
>>    1148 pm_to_blib: $(TO_INST_PM)
>>    1149         $(NOECHO) $(PERLRUN) -MExtUtils::Install -e
>> 'pm_to_blib({@ARGV}, '\''$(INST_LIB)/auto'\'', '\''$(PM_FILTER)'\'')'\
>>    1150           Bio/Ext/Align/libs/hscore.h
>> $(INST_LIB)/Bio/Ext/Align/libs/hscore.h \
>>    1151           Bio/Ext/Align/libs/probability.c
>> $(INST_LIB)/Bio/Ext/Align/libs/probability.c \
>>    1152           Bio/Ext/Align/libs/linesubs.h
>> $(INST_LIB)/Bio/Ext/Align/libs/linesubs.h \
>>    1153           Bio/Ext/Align/test.pl $(INST_LIB)/Bio/Ext/Align/test.pl 
>> \
>>    1154           Bio/Ext/Align/libs/wiseoverlay.h
>> $(INST_LIB)/Bio/Ext/Align/libs/wiseoverlay.h \
>>    1155           Bio/Ext/Align/libs/proteinsw.h
>> $(INST_LIB)/Bio/Ext/Align/libs/proteinsw.h \
>>    1156           Bio/Ext/Align/libs/wisebase.h
>> $(INST_LIB)/Bio/Ext/Align/libs/wisebase.h \
>>    1157           Bio/Ext/Align/libs/seqaligndisplay.h
>> $(INST_LIB)/Bio/Ext/Align/libs/seqaligndisplay.h \
>>    1158           Bio/Ext/Align/libs/dyna.h
>> $(INST_LIB)/Bio/Ext/Align/libs/dyna.h \
>> 
>> The README says you don't have to build the whole package, so I descended
>> to the staden directory and did a Make and didn't get any problems
>> reported. But when I did a make test I get:
>> 
>> newyork:/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden root# make test
>> PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
>> "test_harness(0, '../blib/lib', '../blib/arch')" test.pl
>> test....Had problems bootstrapping Inline module 
>> 'Bio::SeqIO::staden::read'
>> 
>> Can't load
>> '/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/ 
>> Bio/SeqIO/staden/read/read.bundle'
>> for module Bio::SeqIO::staden::read:
>> dlopen(/usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/ 
>> auto/Bio/SeqIO/staden/read/read.bundle,
>> 2): Symbol not found: _curl_easy_init
>>   Referenced from:
>> /usr/local/bioperl-ext-1.5.1/Bio/SeqIO/staden/../blib/arch/auto/Bio/ 
>> SeqIO/staden/read/read.bundle
>>   Expected in: dynamic lookup
>>  at /Library/Perl/5.8.6/Inline.pm line 500
>> 
>> 
>>  at test.pl line 0
>> INIT failed--call queue aborted, <DATA> line 1.
>> test....dubious
>>         Test returned status 255 (wstat 65280, 0xff00)
>> DIED. FAILED tests 1-94
>>         Failed 94/94 tests, 0.00% okay
>> Failed Test Stat Wstat Total Fail  Failed  List of Failed
>> ---------------------------------------------------------------------- 
>> ---------
>> test.pl      255 65280    94  188 200.00%  1-94
>> Failed 1/1 test scripts, 0.00% okay. 94/94 subtests failed, 0.00% okay.
>> make: *** [test_dynamic] Error 2
>> 
>> The missing symbol is apparently from libcurl. I have both libcurl.2.dylib
>> and libcurl.3.dylib with copies in multiple locations including /usr/lib,
>> /usr/local/lib and the usual Mac directories. I used the Mac otool to look
>> at the externals in read.bundle and it references libz.1.dylib and
>> libSystem.B.dylib. Could this be a case where there should have been a
>> link to libcurl and wasn't?
>> 
>> I've searched the list and see only the Inline versioning problem (which I
>> had and fixed). Has anybody seen this problem before or built the module
>> on a Mac? How did you do it? Is this a question for the Staden list on
>> sourceforge?
>> 
>> Thanks
>> 
>> Mike
>
> Haven't seen the problem you list.  I have installed it on Mac OS X (intel) 
> w/o problems so I know it works; at least all tests passed though I remember 
> Inline complaining for some reason.
>
> You should try using bioperl-ext from CVS (it is really 1.5.1 but with 
> updated docs and maybe a change or two).  The process is a little tricky but 
> is documented in the README in the package.  You'll need the old io_lib 
> (1.8.12 or earlier) from Staden if memory serves.
>
> chris
>

Thanks, I'll give that a try.

Mike


From alexl at users.sourceforge.net  Fri Aug 17 15:23:33 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Fri, 17 Aug 2007 08:23:33 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	(Kevin Brown's message of "Fri\, 17 Aug 2007 08\:11\:40 -0700")
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
Message-ID: <n9ir7e18y2.fsf@allele2.localdomain>

>>>>> "KB" == Kevin Brown  writes:

[...]

>> Also note that since Perl's license is a dual-license "GPL or
>> Artistic" then people aren't required to submit their modifications
>> back to the bioperl distribution because they can choose to follow
>> the Artistic (rather than the GPL) license which doesn't require
>> modifications to be submitted back.  This means the point:

KB> You aren't required to submit patches even under the GPL.  If I
KB> make changes and don't distribute them then I have no requirement
KB> to reveal my changes to the bioperl source code.  Also the GPL
KB> does not require that the code be made freely available to all,
KB> just that users of GPL'd software can request the source from the
KB> vendor/distributor and should not find lots of little hoops to
KB> jump through to get it.  You can even charge to get access if that
KB> charge is to cover the cost of the expense to get it (such as the
KB> cost of a cd + mail delivery charge).

Sure, I was just pointing out that you can avoid even these things if
you choose the Artistic license.  I have no problem with the GPL, but
some people do.  The other possibility (if the current Perl "GPL or
Artistic" is not a possibility) is simply upgrading to the "Artistic
2.0" license adopted by the Perl Foundation for Perl 6 and later (I
think?):

http://www.perlfoundation.org/artistic_license_2_0

it's a GPL-compatible free software license.

Alex


From Kevin.M.Brown at asu.edu  Fri Aug 17 15:11:40 2007
From: Kevin.M.Brown at asu.edu (Kevin Brown)
Date: Fri, 17 Aug 2007 08:11:40 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <nrsl6i1ub4.fsf@allele2.localdomain>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
Message-ID: <1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>

> AL> I assume that the intent of the bioperl authors is to 
> license with 
> AL> the same terms as Perl's *current* license (which would 
> mean bioperl 
> AL> is really effectively dually-licensed under the GPL or Artistic 
> AL> license).  If so, it would be good if the LICENSE text 
> and the wiki 
> AL> were updated to reflect this.
> 
> Also note that since Perl's license is a dual-license "GPL or 
> Artistic" then people aren't required to submit their 
> modifications back to the bioperl distribution because they 
> can choose to follow the Artistic (rather than the GPL) 
> license which doesn't require modifications to be submitted 
> back.  This means the point:

You aren't required to submit patches even under the GPL.  If I make
changes and don't distribute them then I have no requirement to reveal
my changes to the bioperl source code.  Also the GPL does not require
that the code be made freely available to all, just that users of GPL'd
software can request the source from the vendor/distributor and should
not find lots of little hoops to jump through to get it.  You can even
charge to get access if that charge is to cover the cost of the expense
to get it (such as the cost of a cd + mail delivery charge).


From cjfields at uiuc.edu  Fri Aug 17 16:07:47 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 17 Aug 2007 11:07:47 -0500
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <n9ir7e18y2.fsf@allele2.localdomain>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
Message-ID: <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>


On Aug 17, 2007, at 10:23 AM, Alex Lancaster wrote:

>>>>>> "KB" == Kevin Brown  writes:
>
> [...]
>
>>> Also note that since Perl's license is a dual-license "GPL or
>>> Artistic" then people aren't required to submit their modifications
>>> back to the bioperl distribution because they can choose to follow
>>> the Artistic (rather than the GPL) license which doesn't require
>>> modifications to be submitted back.  This means the point:
>
> KB> You aren't required to submit patches even under the GPL.  If I
> KB> make changes and don't distribute them then I have no requirement
> KB> to reveal my changes to the bioperl source code.  Also the GPL
> KB> does not require that the code be made freely available to all,
> KB> just that users of GPL'd software can request the source from the
> KB> vendor/distributor and should not find lots of little hoops to
> KB> jump through to get it.  You can even charge to get access if that
> KB> charge is to cover the cost of the expense to get it (such as the
> KB> cost of a cd + mail delivery charge).
>
> Sure, I was just pointing out that you can avoid even these things if
> you choose the Artistic license.  I have no problem with the GPL, but
> some people do.  The other possibility (if the current Perl "GPL or
> Artistic" is not a possibility) is simply upgrading to the "Artistic
> 2.0" license adopted by the Perl Foundation for Perl 6 and later (I
> think?):
>
> http://www.perlfoundation.org/artistic_license_2_0
>
> it's a GPL-compatible free software license.
>
> Alex

Switching to Artistic 2.0 is probably the best way to go.  We'll need  
a more involved discussion but I don't think there'll be too many  
objections.  You mention GPL-compatibility; is that for v2 and v3?

chris


From gonzaled at tcd.ie  Fri Aug 17 17:03:35 2007
From: gonzaled at tcd.ie (David Gonzalez)
Date: Fri, 17 Aug 2007 18:03:35 +0100
Subject: [Bioperl-l] Bio::SeqIO::swiss species parsing bug?
Message-ID: <46C5D4E7.6000605@tcd.ie>

	Hi,

	I had a problem with a swissprot file in which the genus and species
were being left undefined, and I believe it could be a bug in the
swiss.pm module.


	When I tried to parse the file with Bio::SeqIO, I got the following
error messages:

Use of uninitialized value in pattern match (m//) at
/sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 965, <GEN0> line 12.
Use of uninitialized value in string eq at
/sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 967, <GEN0> line 12.

	The fields I wanted from the file (gene_id, etc.) were fine, however,
so the file was being parsed.

	I checked the output with Data::Dumper and I found the following in the
species entry; the species is left undefined, and the common name is absent.

 	'species' => bless( {
                             '_ncbi_taxid' => 'Not',
                             '_classification' => [
                                                    undef,
                                                    undef,
                                                    'Aedes',
                                                    'Culicini',
                                                    'Culicinae',
                                                    'Culicidae',
                                                    'Culicoidea',
                                                    'Nematocera',
                                                    'Diptera',
                                                    'Endopterygota',
                                                    'Neoptera',
                                                    'Pterygota',
                                                    'Insecta',
                                                    'Hexapoda',
                                                    'Arthropoda',
                                                    'Metazoa',
                                                    'Eukaryota'
                                                  ]
                           }, 'Bio::Species' ),

	The species line in the file is formatted according to the swissprot
specifications and includes a common name

OS   Aedes aegypti (yellow fever mosquito)
OC   Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; Neoptera;
OC   Endopterygota; Diptera; Nematocera; Culicoidea; Culicidae; Culicinae;
OC   Culicini; Aedes.
OX   NCBI_TaxID=Not defined;

	I think the problem is in line 905 of the swiss.pm file:

902	if(/^OS\s+(\S.+)/ && (! defined($binomial))) {
903	    $osline .= " " if $osline;
904	    $osline .= $1;
905	    if($osline =~ s/(,|, and|\.)$//) {
906		($binomial, $descr) = $osline =~ /(\S[^\(]+)(.*)/;
907             ($ns_name) = $binomial;
908             $ns_name =~ s/\s+$//; #####


	The problem seems to be that the OS line has no trailing punctuation,
so the substitution on line 905 returns false. I don't think the
swissprot format requires the line to end in '.', although it normally
does. After removing the requirement for the substitution to succeed,
the output of Data::Dumper looked normal:

	....
	'_common_name' => 'yellow fever mosquito',
        '_ncbi_taxid' => 'Not',
        '_classification' => [
                              'aegypti',
                              'Aedes',
                              'Culicini',
	....
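
	The behaviour can be reproduced outside swiss.pm. This is only a
sketch: it reuses the substitution and the capture from lines 905-906
quoted above, but parse_os and its $require_punct flag are hypothetical
and the surrounding parser logic is left out.

```perl
use strict;
use warnings;

# Simplified stand-in for lines 905-906 quoted above: returns the
# binomial and the remaining description, or an empty list when the
# guard substitution finds no trailing punctuation.
sub parse_os {
    my ($osline, $require_punct) = @_;
    my $stripped = ($osline =~ s/(,|, and|\.)$//);
    return () if $require_punct && !$stripped;
    my ($binomial, $descr) = $osline =~ /(\S[^\(]+)(.*)/;
    $binomial =~ s/\s+$//;    # drop the space left before the '('
    return ($binomial, $descr);
}

my $os = 'Aedes aegypti (yellow fever mosquito)';
my @strict  = parse_os($os, 1);    # guard as in the quoted code
my @relaxed = parse_os($os, 0);    # guard removed
print scalar(@strict), " fields with the guard\n";
print "binomial='$relaxed[0]' descr='$relaxed[1]'\n";
```

With the guard in place nothing is captured for this OS line; without
it the binomial and the parenthesised common name come through.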

	I am using the fink installed bioperl:
	bioperl-pm586   1.4-5   Perl module for biology

	I don't know if this has  been reported/solved in the newer versions of
bioperl.

	David

-- 
David Gonzalez Knowles
Smurfit Institute of Genetics
Trinity College
Dublin


From cjfields at uiuc.edu  Fri Aug 17 17:20:21 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 17 Aug 2007 12:20:21 -0500
Subject: [Bioperl-l] Bio::SeqIO::swiss species parsing bug?
In-Reply-To: <46C5D4E7.6000605@tcd.ie>
References: <46C5D4E7.6000605@tcd.ie>
Message-ID: <04912FDE-2AA4-414C-9CE4-A0BA5E9C89C9@uiuc.edu>


On Aug 17, 2007, at 12:03 PM, David Gonzalez wrote:

> 	Hi,
>
> 	I had a problem with a swissprot file in which the genus and species
> were being left undefined, and I believe it could be a bug in the
> swiss.pm module.
>
>
> 	When I tried to parse the file with Bio::SeqIO, I got the following
> error messages:
>
> Use of uninitialized value in pattern match (m//) at
> /sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 965, <GEN0> line 12.
> Use of uninitialized value in string eq at
> /sw/lib/perl5/5.8.6/Bio/SeqIO/swiss.pm line 967, <GEN0> line 12.
> ...
> 	I am using the fink installed bioperl:
> 	bioperl-pm586   1.4-5   Perl module for biology
>
> 	I don't know if this has  been reported/solved in the newer  
> versions of
> bioperl.
>
> 	David
>
> -- 
> David Gonzalez Knowles
> Smurfit Institute of Genetics
> Trinity College
> Dublin

That looks like bioperl 1.4, which is several years old.  You should  
update to the latest official release (1.5.2), then see if the  
problem persists.

chris


From alexl at users.sourceforge.net  Sat Aug 18 11:33:34 2007
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Sat, 18 Aug 2007 04:33:34 -0700
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu> (Chris Fields's
	message of "Fri\, 17 Aug 2007 11\:07\:47 -0500")
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
	<3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>
Message-ID: <8td4xlyt4h.fsf@allele2.localdomain>

>>>>> "CF" == Chris Fields  writes:

[...]

>> Sure, I was just pointing out that you can avoid even these things
>> if you choose the Artistic license.  I have no problem with the
>> GPL, but some people do.  The other possibility (if the current
>> Perl "GPL or Artistic" is not a possibility) is simply upgrading to
>> the "Artistic 2.0" license adopted by the Perl Foundation for Perl
>> 6 and later (I think?):

>> http://www.perlfoundation.org/artistic_license_2_0

>> it's a GPL-compatible free software license.

CF> Switching to Artistic 2.0 is probably the best way to go.  We'll
CF> need a more involved discussion but I don't think there'll be too
CF> many objections.  You mention GPL-compatibility; is that for v2
CF> and v3?

IANAL, but looking at:

http://www.perlfoundation.org/artistic_2_0_notes

http://www.gnu.org/licenses/license-list.html (scroll down to
"Artistic 2.0")

it looks like you can choose any GPL license (i.e. v1 to v3).

I was really more concerned with clarifying what the bioperl license
was *right now*, because "the same license as Perl" implies the
so-called "disjunctive" "GPL or Artistic license":

http://www.gnu.org/licenses/license-list.html#PerlLicense

which is what I've marked the Fedora package as (since it listed "the
same license as Perl" in most of the source files), which is fine for
Fedora.

Fedora may possibly (still under discussion I believe) require removal
of any package that is licensed under the original (1.0) Artistic
alone and it would be a real shame if that required bioperl being
pulled from the repo.  I imagine the intent of the bioperl
contributors is that it should be under the same terms as Perl,
whatever that happens to be (which just happens to be GPL or Artistic,
which is fine).  A clarification to that effect would be useful.

Cheers,
Alex


From zhaodj at ioz.ac.cn  Sat Aug 18 15:06:41 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Sat, 18 Aug 2007 23:06:41 +0800 (CST)
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
In-Reply-To: <46C5781F.60301@sheffield.ac.uk>
References: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
	<46C5781F.60301@sheffield.ac.uk>
Message-ID: <52869.159.226.67.49.1187449601.squirrel@mail.ioz.ac.cn>

Thank you, Nathan.
The Deobfuscator is very helpful.

On Fri, Aug 17, 2007 18:27, Nathan Haigh wrote:
> De-Jian,ZHAO wrote:
>> Dear list members,
>>
>> I have a question about the methods of bioperl objects.It is how
>> and
>> where we can get the whole methods of a bioperl object.
>>
>> Take Bio::Tools::Run::RemoteBlast for example. In the synopsis of
>> this object, some sample codes are given.The following five
>> clauses
>> are excerpted from the synopsis.
>> (1)my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
>> (2)while ( my @rids = $factory->each_rid ) {
>> (3)$factory->remove_rid($rid);
>> (4)my $rc = $factory->retrieve_blast($rid);
>> (5)my $r = $factory->submit_blast($input);
>>
>> The five clauses use five methods of the RemoteBlast object,i.e.
>> (1)new, (2)each_rid, (3)remove_rid,(4)retrieve_blast,and
>> (5)submit_blast. However,I only find part of them(45) are listed
>> in
>> the appendix while others(123) are absent. Are there some more
>> methods not explictly declared? I don't know.This will lead to the
>> partial understanding and utilization of the module.Therefore I
>> come
>> here for the way to get the full methods of a bioperl object.
>>
>> Thanks!
>>
>
>
> You should check out the Deobfuscator at:
> http://bioperl.org/cgi-bin/deob_interface.cgi
>
> Search and choose the object of choice. e.g.
> Bio::Tools::Run::RemoteBlast
>
> You will be provided a list of methods available to that object,
> including all the methods up the inheritance hierarchy.
> Unfortunately,
> some bioperl modules are documented more thoroughly than others.
>
> Nath
>


From hlapp at gmx.net  Sat Aug 18 16:13:28 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 18 Aug 2007 12:13:28 -0400
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <8td4xlyt4h.fsf@allele2.localdomain>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
	<3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>
	<8td4xlyt4h.fsf@allele2.localdomain>
Message-ID: <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net>


On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote:

> I imagine the intent of the bioperl
> contributors is that it should be under the same terms as Perl,
> whatever that happens to be (which just happens to be GPL or Artistic,
> which is fine).

I fully agree.

>   A clarification to that effect would be useful.

Agreed, too. Would you mind changing that language on the wiki, since  
you seem to have a fairly good grasp on the issue?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Aug 18 16:42:04 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 18 Aug 2007 11:42:04 -0500
Subject: [Bioperl-l] Clarifying license of bioperl
In-Reply-To: <8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net>
References: <cg3ayi39sn.fsf@allele2.localdomain>
	<nrsl6i1ub4.fsf@allele2.localdomain>
	<1A4207F8295607498283FE9E93B775B40390172E@EX02.asurite.ad.asu.edu>
	<n9ir7e18y2.fsf@allele2.localdomain>
	<3515AB25-9919-407B-93E9-352BC426AFA1@uiuc.edu>
	<8td4xlyt4h.fsf@allele2.localdomain>
	<8D3FBCDF-47E7-4A6E-8001-C034CA27BF3F@gmx.net>
Message-ID: <D3B67BC2-CB56-420F-B4E3-E0A57FEA7E80@uiuc.edu>


On Aug 18, 2007, at 11:13 AM, Hilmar Lapp wrote:

>
> On Aug 18, 2007, at 7:33 AM, Alex Lancaster wrote:
>
>> I imagine the intent of the bioperl
>> contributors is that it should be under the same terms as Perl,
>> whatever that happens to be (which just happens to be GPL or  
>> Artistic,
>> which is fine).
>
> I fully agree.
>
>>   A clarification to that effect would be useful.
>
> Agreed, too. Would you mind changing that language on the wiki, since
> you seem to have a fairly good grasp on the issue?
>
> 	-hilmar

Looks like the modules mostly state 'You may distribute this module  
under the same terms as perl itself', but there are likely a few  
which need to be changed.  Might be worth running a quick code audit  
to see what's present.

chris


From avilella at gmail.com  Sat Aug 18 20:38:10 2007
From: avilella at gmail.com (Albert Vilella)
Date: Sat, 18 Aug 2007 21:38:10 +0100
Subject: [Bioperl-l] How to get the full methods of a bioperl object?
In-Reply-To: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
References: <55928.159.226.67.49.1187316796.squirrel@mail.ioz.ac.cn>
Message-ID: <358f4d650708181338s5a5caadbscfa85786327f4304@mail.gmail.com>

I particularly like to code and debug at the same time. In the Perl
debugger you can type:

<DB> m $object

and it will list the methods that can be called on that object.

Cheers,

    Albert.

On 8/17/07, De-Jian,ZHAO <zhaodj at ioz.ac.cn> wrote:
>
> Dear list members,
>
> I have a question about the methods of bioperl objects.It is how and
> where we can get the whole methods of a bioperl object.
>
> Take Bio::Tools::Run::RemoteBlast for example. In the synopsis of
> this object, some sample codes are given.The following five clauses
> are excerpted from the synopsis.
> (1)my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
> (2)while ( my @rids = $factory->each_rid ) {
> (3)$factory->remove_rid($rid);
> (4)my $rc = $factory->retrieve_blast($rid);
> (5)my $r = $factory->submit_blast($input);
>
> The five clauses use five methods of the RemoteBlast object,i.e.
> (1)new, (2)each_rid, (3)remove_rid,(4)retrieve_blast,and
> (5)submit_blast. However,I only find part of them(45) are listed in
> the appendix while others(123) are absent. Are there some more
> methods not explictly declared? I don't know.This will lead to the
> partial understanding and utilization of the module.Therefore I come
> here for the way to get the full methods of a bioperl object.
>
> Thanks!
> --
> De-Jian Zhao
> Institute of Zoology,Chinese Academy of Sciences
> +86-10-64807217
> zhaodj at ioz.ac.cn


From neetisomaiya at gmail.com  Mon Aug 20 04:33:17 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 20 Aug 2007 10:03:17 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C5A405.2070005@sendu.me.uk>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C41FEC.2000206@sendu.me.uk>
	<5D32F747-60FC-4EEE-BD38-3A522A67EA27@uiuc.edu>
	<764978cf0708162323r17c4fc59w5adfb61ccfc5ac6@mail.gmail.com>
	<764978cf0708170342q45acbea1vebaf1a8defb93896@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
Message-ID: <764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>

Hi,

Thanks for the responses.
Another question: I am interested only in the PDB ID and title. For this
I am currently downloading and unzipping each full PDB structure file and
parsing it to get just the ID and title. Is there any other data source
that can give me just the ID and title of PDB structures, without my
having to download the full file for each structure?

Thanks,
Neeti.

On 8/17/07, Sendu Bala <bix at sendu.me.uk> wrote:
>
> neeti somaiya wrote:
> > Hi,
> >
> > I have done it currently as follows :
> [snip]
> > Is this ok?
>
> If it works, of course. There seems to be some redundant code there,
> however. I'm guessing this would be better (assuming your code worked in
> the first place):
>
> while (my $struc = $in->next_structure()) {
>      my $pdb_id = $struc->id;
>      print "Structure ", $pdb_id,"\n";
>
>      my $ac = $struc->annotation();
>      my ($title) = $ac->get_Annotations('title');
>      $title = $title->as_text;
>      chomp($title);
>      if ($title =~ /Value\: (.*)/) {
>          $title = $1;
>      }
>      $title =~ s/\s+/ /g;
>
>      print "Title ",$title,"\n";
> }
>
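
For files that are already on disk, a lighter alternative to building full
structure objects as in the quoted loop is to read only the HEADER and TITLE
records and stop there. This does not remove the need to download the files;
it only avoids parsing the coordinates. The sketch below relies on the fixed
column layout of the PDB flat-file format (ID code in columns 63-66 of HEADER,
title text from column 11 of TITLE). The function name pdb_id_and_title and
the sample entry are made up for illustration.

```perl
use strict;
use warnings;

# Read only the HEADER and TITLE records from an open PDB-format filehandle
# and stop as soon as the TITLE block ends, so coordinates are never read.
sub pdb_id_and_title {
    my ($fh) = @_;
    my ($id, @parts);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^HEADER/) {
            # 4-character ID code sits in columns 63-66 (offset 62).
            $id = substr($line, 62, 4) if length($line) >= 66;
        }
        elsif ($line =~ /^TITLE /) {
            # Title text starts at column 11; columns 9-10 hold the
            # continuation number on second and later lines.
            push @parts, substr($line, 10) if length($line) > 10;
        }
        elsif (@parts) {
            last;    # first record after the TITLE block
        }
    }
    my $title = join ' ', @parts;
    $title =~ s/\s+/ /g;       # collapse runs of whitespace
    $title =~ s/^ | $//g;      # trim the ends
    return ($id, $title);
}

# Made-up in-memory entry, for illustration only.
my $example = sprintf("%-10s%-40s%-9s   %-4s\n",
                      'HEADER', 'HYPOTHETICAL CLASS', '01-JAN-00', '0XYZ')
            . "TITLE     AN EXAMPLE TITLE THAT CONTINUES\n"
            . "TITLE    2 ON A SECOND LINE\n"
            . "COMPND    MOL_ID: 1;\n";
open my $fh, '<', \$example or die "cannot open in-memory file: $!";
my ($id, $title) = pdb_id_and_title($fh);
close $fh;
print "$id\t$title\n";
```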


-- 
-Neeti
Even my blood says, B positive


From jaudall at gmail.com  Mon Aug 20 04:39:18 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Sun, 19 Aug 2007 21:39:18 -0700
Subject: [Bioperl-l] concatenating aln splices
Message-ID: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>

Based on several criteria, I've extracted several splices from a
single alignment and I'm trying to concatenate my selected sequences
together.  Unfortunately, one of the sequences in the original
alignment only has gap characters for one or more of the splices.  I'd
like to keep the gap splices because other downstream aligned bases
depend on them.  I get these two warning messages splicing my
alignments together:

-------------------- WARNING ---------------------
MSG: Got a sequence with no letters in it cannot guess alphabet []
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: Slice [232-233] of sequence [X2A/1-202] contains no residues.
Sequence excluded from the new alignment.
---------------------------------------------------

and now because of missing gaps, I get this error when trying to
concatenate them:

-------------------- WARNING ---------------------
MSG: expecting 236 not 203 from X2A
---------------------------------------------------

------------- EXCEPTION  -------------
MSG: All sequences in the alignment must be the same length
STACK Bio::AlignIO::phylip::write_aln
/sw/lib/perl5/5.8.6/Bio/AlignIO/phylip.pm:292

I don't mind the warnings, in fact I like them, but is there a way to
stop the splice function from removing the 'gap' sequence from the
alignment?  Perhaps catching the warning and inserting the gaps
afterwards might work, but I'm wondering if there is a simpler
modification of SimpleAlign.pm that might work.  Any thoughts?

Josh


From bix at sendu.me.uk  Mon Aug 20 07:43:45 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 20 Aug 2007 08:43:45 +0100
Subject: [Bioperl-l] concatenating aln splices
In-Reply-To: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>
References: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>
Message-ID: <46C94631.2060704@sendu.me.uk>

Joshua Udall wrote:
> Based on several criteria, I've extracted several splices from a
> single alignment and I'm trying to concatenate my selected sequences
> together.  Unfortunately, one of the sequences in the original
> alignment only has gap characters for one or more of the splices.  I'd
> like to keep the gap splices because other downstream aligned bases
> depend on them.
[snip]
> I don't mind the warnings, in fact I like them, but is there a way to
> stop the splice function from removing the 'gap' sequence from the
> alignment?  Perhaps catching the warning and inserting the gaps
> afterwards might work, but I'm wondering if there is a simpler
> modification of SimpleAlign.pm that might work.  Any thoughts?

Let us see some code, so we can get a better idea of what you're doing 
and what you've tried.

You can avoid losing sequences during a slice by not doing a slice. 
Instead, remove_columns(). This way you don't have to splice alignments 
together; you go from original alignment to 'spliced' version in one step.
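
To use remove_columns() in place of several slices, the columns to keep have
to be turned into the columns to drop. A minimal sketch of that bookkeeping in
plain Perl follows. The helper name columns_to_remove is made up, and the
sketch assumes 0-based, inclusive [start, end] pairs; check the
remove_columns POD of your BioPerl version for the convention it expects.

```perl
use strict;
use warnings;

# Given the alignment length and a list of [start, end] ranges to keep
# (0-based, inclusive), return the complementary ranges to remove.
sub columns_to_remove {
    my ($length, @keep) = @_;
    my @remove;
    my $next = 0;    # first column not yet accounted for
    for my $range (sort { $a->[0] <=> $b->[0] } @keep) {
        my ($start, $end) = @$range;
        push @remove, [$next, $start - 1] if $start > $next;
        $next = $end + 1 if $end + 1 > $next;
    }
    push @remove, [$next, $length - 1] if $next < $length;
    return @remove;
}

# Keep columns 2-5 and 10-14 of a 20-column alignment.
my @remove = columns_to_remove(20, [2, 5], [10, 14]);
print join(', ', map { "[$_->[0],$_->[1]]" } @remove), "\n";

# With BioPerl loaded and $aln a Bio::SimpleAlign object, the result would
# be passed on roughly like this (not run here, an assumption about usage):
#   my $spliced = $aln->remove_columns(@remove);
```

Because every sequence keeps the same columns, rows that are all gaps in the
kept region stay in the alignment and the row lengths remain equal.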


From Oliver.Wafzig at sygnis.de  Mon Aug 20 08:42:55 2007
From: Oliver.Wafzig at sygnis.de (Oliver Wafzig)
Date: Mon, 20 Aug 2007 10:42:55 +0200
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
Message-ID: <200708201042.55292.Oliver.Wafzig@sygnis.de>

On Monday 20 August 2007 06:33, neeti somaiya wrote:
> Another question I had was, I am interested only in pdb id and title, and
> for this I am downloading and unzipping each of the full pdb structure
> files, parsing to get just id and title. Is there any other data source

Hi Neeti,
this is a non bioperl way to download the data.
Use the SRS server on the EBI page to download only id and title lines from 
pdb.

1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
2) Search for 'PDB' on the 'library page' and select it.
3) Use the standard query form. Select 'id' in the dropdown list and 
insert '*' (wildcard).
4) Create a view by selecting 'ID' and 'Title', then click the search button.
5) Click the save results button.
6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of entries 
to download' field. Press 'save'.

If the download is slow, read the 'download tips' on the download page and 
split the results in chunks. 

-- 
Oliver


From neetisomaiya at gmail.com  Mon Aug 20 13:05:01 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 20 Aug 2007 18:35:01 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <200708201042.55292.Oliver.Wafzig@sygnis.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
Message-ID: <764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>

Thanks for your response.
Actually I am looking for something standalone and not on the web, as in
something which I can download onto my machine and parse later to get id and
title.

On 8/20/07, Oliver Wafzig <Oliver.Wafzig at sygnis.de> wrote:
>
> On Monday 20 August 2007 06:33, neeti somaiya wrote:
> > Another question I had was, I am interested only in pdb id and title,
> and
> > for this I am downloading and unzipping each of the full pdb structure
> > files, parsing to get just id and title. Is there any other data source
>
> Hi Neeti,
> this is a non bioperl way to download the data.
> Use the SRS server on the EBI page to download only id and title lines
> from
> pdb.
>
> 1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
> 2) Search for 'PDB' on the 'library page' and select it.
> 3) Use the standard query form. Select 'id' in the dropdown list and
> insert '*' (wildcard).
> 4) Create a view by selecting 'ID' and 'Title', then click the search
> button.
> 5) Click the save results button.
> 6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of
> entries
> to download' field. Press 'save'.
>
> If the download is slow, read the 'download tips' on the download page and
> split the results in chunks.
>
> --
> Oliver
>


-- 
-Neeti
Even my blood says, B positive


From bernd at kirx.de  Mon Aug 20 16:57:28 2007
From: bernd at kirx.de (Bernd Mueller)
Date: Mon, 20 Aug 2007 18:57:28 +0200
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
Message-ID: <46C9C7F8.3020608@kirx.de>

Hello,

Maybe you want to try the EUtilities module (Bio::DB::EUtilities) from
bioperl. It is described on http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

I tried them for a similar search on pubmed but without any reasonable 
results because my target was too focused.

 From EUtilities documentation on 
http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.section.EntrezHelp.The_Databases

"Protein Database

The Protein database contains sequence data from the translated coding 
regions from DNA sequences in GenBank, EMBL, and DDBJ as well as protein 
sequences submitted to Protein Information Resource (PIR), SWISS-PROT, 
Protein Research Foundation (PRF), and Protein Data Bank (PDB) 
(sequences from solved structures). "

So PDB is included in eutilities from NCBI.

Regards,
Bernd

neeti somaiya wrote:
> Thanks for your response.
> Actually I am looking for something standalone and not on the web, as in
> something which I can download onto my machine and parse later to get id and
> title.
> 
> On 8/20/07, Oliver Wafzig <Oliver.Wafzig at sygnis.de> wrote:
>> On Monday 20 August 2007 06:33, neeti somaiya wrote:
>>> Another question I had was, I am interested only in pdb id and title,
>> and
>>> for this I am downloading and unzipping each of the full pdb structure
>>> files, parsing to get just id and title. Is there any other data source
>> Hi Neeti,
>> this is a non bioperl way to download the data.
>> Use the SRS server on the EBI page to download only id and title lines
>> from
>> pdb.
>>
>> 1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
>> 2) Search for 'PDB' on the 'library page' and select it.
>> 3) Use the standard query form. Select 'id' in the dropdown list and
>> insert '*' (wildcard).
>> 4) Create a view by selecting 'ID' and 'Title', then click the search
>> button.
>> 5) Click the save results button.
>> 6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of
>> entries
>> to download' field. Press 'save'.
>>
>> If the download is slow, read the 'download tips' on the download page and
>> split the results in chunks.
>>
>> --
>> Oliver
>>
> 
> 
> 

-- 
Dipl.-Inform.(FH)
Bernd Mueller
phone: +49 179 2336692
email: bernd at kirx.de


From neetisomaiya at gmail.com  Mon Aug 20 17:39:01 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Mon, 20 Aug 2007 23:09:01 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C9C7F8.3020608@kirx.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
Message-ID: <764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>

Hi,

Thanks for all the responses.
I got the solution from the RCSB people:

Dear Dr. Somaiya,

Thank you for your email message.

Please try the following:
1) Go to http://www.pdb.org/pdb/statistics/holdings.do and select the
number in the bottom right corner of the table (currently 45213).
2) From the menu on the left select 'Tabulate'>>'Custom Report' and
under 'Primary Citation' select 'Title'
3) At the bottom, select 'Create Report' and then one of the 'Download'
options.

Please let us know if we can be of additional assistance.

Sincerely,
Rachel Green

On 8/20/07, Bernd Mueller <bernd at kirx.de> wrote:
>
> Hello,
>
> Maybe you wanna try the Database-EUtilities module from bioperl. They
> are described on http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>
> I tried them for a similar search on pubmed but without any reasonable
> results because my target was too focused.
>
> From EUtilities documentation on
>
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.section.EntrezHelp.The_Databases
>
> "Protein Database
>
> The Protein database contains sequence data from the translated coding
> regions from DNA sequences in GenBank, EMBL, and DDBJ as well as protein
> sequences submitted to Protein Information Resource (PIR), SWISS-PROT,
> Protein Research Foundation (PRF), and Protein Data Bank (PDB)
> (sequences from solved structures). "
>
> So PDB is included in eutilities from NCBI.
>
> Regards,
> Bernd
>
> neeti somaiya wrote:
> > Thanks for your response.
> > Actually I am looking for something standalone and not on the web, as in
> > something which I can download onto my machine and parse later to get id
> and
> > title.
> >
> > On 8/20/07, Oliver Wafzig <Oliver.Wafzig at sygnis.de> wrote:
> >> On Monday 20 August 2007 06:33, neeti somaiya wrote:
> >>> Another question I had was, I am interested only in pdb id and title,
> >> and
> >>> for this I am downloading and unzipping each of the full pdb structure
> >>> files, parsing to get just id and title. Is there any other data
> source
> >> Hi Neeti,
> >> this is a non bioperl way to download the data.
> >> Use the SRS server on the EBI page to download only id and title lines
> >> from
> >> pdb.
> >>
> >> 1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
> >> 2) Search for 'PDB' on the 'library page' and select it.
> >> 3) Use the standard query form. Select 'id' in the dropdown list and
> >> insert '*' (wildcard).
> >> 4) Create a view by selecting 'ID' and 'Title', then click the search
> >> button.
> >> 5) Click the save results button.
> >> 6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of
> >> entries
> >> to download' field. Press 'save'.
> >>
> >> If the download is slow, read the 'download tips' on the download page
> and
> >> split the results in chunks.
> >>
> >> --
> >> Oliver
> >>
> >
> >
> >
>
> --
> Dipl.-Inform.(FH)
> Bernd Mueller
> phone: +49 179 2336692
> email: bernd at kirx.de
>
>


-- 
-Neeti
Even my blood says, B positive


From jaudall at gmail.com  Mon Aug 20 18:30:26 2007
From: jaudall at gmail.com (Joshua Udall)
Date: Mon, 20 Aug 2007 12:30:26 -0600
Subject: [Bioperl-l] concatenating aln splices
In-Reply-To: <46C94631.2060704@sendu.me.uk>
References: <52cea20c0708192139r3886fe71j58f69a0aaa8c8a4f@mail.gmail.com>
	<46C94631.2060704@sendu.me.uk>
Message-ID: <52cea20c0708201130u29af2e10w78a852d7f88c23d1@mail.gmail.com>

Thanks, Sendu!  That suggestion was exactly what I needed.  I have it worked
out now with the remove_columns function.  Much easier that way :)

Josh

On 8/20/07, Sendu Bala <bix at sendu.me.uk> wrote:
>
> Joshua Udall wrote:
> > Based on several criteria, I've extracted several splices from a
> > single alignment and I'm trying to concatenate my selected sequences
> > together.  Unfortunately, one of the sequences in the original
> > alignment only has gap characters for one or more of the splices.  I'd
> > like to keep the gap splices because other downstream aligned bases
> > depend on them.
> [snip]
> > I don't mind the warnings, in fact I like them, but is there a way to
> > stop the splice function from removing the 'gap' sequence from the
> > alignment?  Perhaps catching the warning and inserting the gaps
> > afterwards might work, but I'm wondering if there is a simpler
> > modification of SimpleAlign.pm that might work.  Any thoughts?
>
> Let us see some code, so we can get a better idea of what you're doing
> and what you've tried.
>
> You can avoid losing sequences during a slice by not doing a slice.
> Instead, remove_columns(). This way you don't have to splice alignments
> together; you go from original alignment to 'spliced' version in one step.
>


From cjfields at uiuc.edu  Mon Aug 20 18:51:14 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 20 Aug 2007 13:51:14 -0500
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46C9C7F8.3020608@kirx.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
Message-ID: <4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>

Just curious, but what kind of query were you trying?  It might be  
worth trying to work through it to add as an example to the cookbook  
page.

chris

On Aug 20, 2007, at 11:57 AM, Bernd Mueller wrote:

> Hello,
>
> Maybe you wanna try the Database-EUtilities module from bioperl. They
> are described on http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>
> I tried them for a similar search on pubmed but without any reasonable
> results because my target was too focused.
>
>  From EUtilities documentation on
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.section.EntrezHelp.The_Databases
>
> "Protein Database
>
> The Protein database contains sequence data from the translated coding
> regions from DNA sequences in GenBank, EMBL, and DDBJ as well as  
> protein
> sequences submitted to Protein Information Resource (PIR), SWISS-PROT,
> Protein Research Foundation (PRF), and Protein Data Bank (PDB)
> (sequences from solved structures). "
>
> So PDB is included in eutilities from NCBI.
>
> Regards,
> Bernd
>
> neeti somaiya wrote:
>> Thanks for your response.
>> Actually I am looking for something standalone and not on the web,  
>> as in
>> something which I can download onto my machine and parse later to  
>> get id and
>> title.
>>
>> On 8/20/07, Oliver Wafzig <Oliver.Wafzig at sygnis.de> wrote:
>>> On Monday 20 August 2007 06:33, neeti somaiya wrote:
>>>> Another question I had was, I am interested only in pdb id and  
>>>> title,
>>> and
>>>> for this I am downloading and unzipping each of the full pdb  
>>>> structure
>>>> files, parsing to get just id and title. Is there any other data  
>>>> source
>>> Hi Neeti,
>>> this is a non bioperl way to download the data.
>>> Use the SRS server on the EBI page to download only id and title  
>>> lines
>>> from
>>> pdb.
>>>
>>> 1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
>>> 2) Search for 'PDB' on the 'library page' and select it.
>>> 3) Use the standard query form. Select 'id' in the dropdown list and
>>> insert '*' (wildcard).
>>> 4) Create a view by selecting 'ID' and 'Title', then click the  
>>> search
>>> button.
>>> 5) Click the save results button.
>>> 6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of
>>> entries
>>> to download' field. Press 'save'.
>>>
>>> If the download is slow, read the 'download tips' on the download  
>>> page and
>>> split the results in chunks.
>>>
>>> --
>>> Oliver
>>>
>>
>>
>>
>
> -- 
> Dipl.-Inform.(FH)
> Bernd Mueller
> phone: +49 179 2336692
> email: bernd at kirx.de
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bernd at kirx.de  Mon Aug 20 19:03:29 2007
From: bernd at kirx.de (Bernd Mueller)
Date: Mon, 20 Aug 2007 21:03:29 +0200
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>
Message-ID: <46C9E581.1010907@kirx.de>

I attached my script.

Actually I tried to download all articles for a certain search term with
that script. The problem was that not all of the retrieved documents were
freely available, contrary to what I understood from the EUtilities
documentation on the NCBI page. So many of the downloaded XML documents
were just stubs containing only the abstract, not the full-text article.

Bernd

Chris Fields wrote:
> Just curious, but what kind of query were you trying?  It might be worth 
> trying to work through it to add as an example to the cookbook page.
> 
> chris
> 
> On Aug 20, 2007, at 11:57 AM, Bernd Mueller wrote:
> 
>> Hello,
>>
>> Maybe you wanna try the Database-EUtilities module from bioperl. They
>> are described on http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
>>
>> I tried them for a similar search on pubmed but without any reasonable
>> results because my target was too focused.
>>
>>  From EUtilities documentation on
>> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.section.EntrezHelp.The_Databases 
>>
>>
>> "Protein Database
>>
>> The Protein database contains sequence data from the translated coding
>> regions from DNA sequences in GenBank, EMBL, and DDBJ as well as protein
>> sequences submitted to Protein Information Resource (PIR), SWISS-PROT,
>> Protein Research Foundation (PRF), and Protein Data Bank (PDB)
>> (sequences from solved structures). "
>>
>> So PDB is included in eutilities from NCBI.
>>
>> Regards,
>> Bernd
>>
>> neeti somaiya wrote:
>>> Thanks for your response.
>>> Actually I am looking for something standalone and not on the web, as in
>>> something which I can download onto my machine and parse later to get 
>>> id and
>>> title.
>>>
>>> On 8/20/07, Oliver Wafzig <Oliver.Wafzig at sygnis.de> wrote:
>>>> On Monday 20 August 2007 06:33, neeti somaiya wrote:
>>>>> Another question I had was, I am interested only in pdb id and title,
>>>> and
>>>>> for this I am downloading and unzipping each of the full pdb structure
>>>>> files, parsing to get just id and title. Is there any other data 
>>>>> source
>>>> Hi Neeti,
>>>> this is a non bioperl way to download the data.
>>>> Use the SRS server on the EBI page to download only id and title lines
>>>> from
>>>> pdb.
>>>>
>>>> 1) Point your browser to the SRS page (http://srs.ebi.ac.uk).
>>>> 2) Search for 'PDB' on the 'library page' and select it.
>>>> 3) Use the standard query form. Select 'id' in the dropdown list and
>>>> insert '*' (wildcard).
>>>> 4) Create a view by selecting 'ID' and 'Title', then click the search
>>>> button.
>>>> 5) Click the save results button.
>>>> 6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of
>>>> entries
>>>> to download' field. Press 'save'.
>>>>
>>>> If the download is slow, read the 'download tips' on the download 
>>>> page and
>>>> split the results in chunks.
>>>>
>>>> -- 
>>>> Oliver
>>>>
>>>
>>>
>>>
>>
>> -- 
>> Dipl.-Inform.(FH)
>> Bernd Mueller
>> phone: +49 179 2336692
>> email: bernd at kirx.de
>>
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 

-- 
Dipl.-Inform.(FH)
Bernd Mueller
phone: +49 179 2336692
email: bernd at kirx.de


-------------- next part --------------
A non-text attachment was scrubbed...
Name: myBioPerl.pl
Type: application/x-perl
Size: 1983 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070820/af579f0a/attachment.pl>

From jayoung at fhcrc.org  Mon Aug 20 22:09:04 2007
From: jayoung at fhcrc.org (Janet Young)
Date: Mon, 20 Aug 2007 15:09:04 -0700
Subject: [Bioperl-l] Assembly::IO write_assembly and remove_seq
Message-ID: <EE800ED8-52E7-4D80-A18F-EDBABB90056C@fhcrc.org>

Hi all,

I realized last week that write_assembly isn't implemented in
Assembly::IO
(see http://bioperl.org/pipermail/bioperl-l/2006-May/021619.html )
I know this has been asked before, but I wondered if anything has  
changed - does anyone have any plans to write a write_assembly  
method? Alternatively, any suggestions for an alternative solution to  
what I'm trying to do?

I'm trying to write a script to make improvements to the assembly  
that phredPhrap comes out with - it seems to quite frequently throw  
an unrelated sequence into a contig with either no matching sequence  
at all, or very little matching sequence. Mysterious. Anyway, my  
script can recognize the bad sequences easily enough, and thought I'd  
be able to remove them and then write the modified assembly. No joy.  
One very inelegant solution I've played with is that I can add some  
"markedHighQuality" tags to the discrepant sequences in the ace file,  
meaning that next time phredPhrap is run, it sometimes manages not to  
assemble the sequences that shouldn't be there. I'm not sure this  
will work in all cases, and it seems like quite an unsatisfactory way  
to do it.

For the same reason, I'm hoping someone can tell me what remove_seq  
does to a contig object? I'm using it and I don't get any error  
messages (returns 1), but when I check the contig object afterwards  
with get_seq_ids, the sequence I wanted to remove didn't seem to go  
away. Also, when I check out the primary_tags for that contig in the  
objects returned by get_features_collection, nothing seems to have  
changed. So I'm not sure whether the sequence really was removed from  
anything at all, and if it was, which object did it get removed  
from?  (a snippet of my code is below)
           my @seqids  = $contig->get_seq_ids();
           print OUT "seqids @seqids\n";
           my $seqobj = $contig->get_seq_by_name($seq);
           $contig->remove_seq($seqobj) || die "failed to remove seq\n";
           @seqids  = $contig->get_seq_ids();
           print OUT "seqids @seqids\n";

thanks for any advice,

Janet Young


-------------------------------------------------------------------

Dr. Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung at fhcrc.org

http://www.fhcrc.org/labs/trask/

-------------------------------------------------------------------


From cjfields at uiuc.edu  Tue Aug 21 04:06:26 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 20 Aug 2007 23:06:26 -0500
Subject: [Bioperl-l] EUtilities, was Re:  PDB Parser
In-Reply-To: <46C9E581.1010907@kirx.de>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<4EAE752E-CACB-41AF-BF55-7A83071CE590@uiuc.edu>
	<46C9E581.1010907@kirx.de>
Message-ID: <7BE17595-9BC0-498B-AFA9-03ED0C853BFC@uiuc.edu>

Bernd,

Just in case you weren't aware, I have changed several aspects of  
EUtilities since the 1.5.2 release, so any code in the HOWTO cookbook  
applies ONLY to the version found in CVS (there is a big note at the  
top stating such).  This should be the finalized API which I intend  
on supporting from this point on.  The reason I indicate that is  
there are several giveaways which indicate you are using the older  
API from 1.5.2 (using next_cookie, for instance).

The following modification of your script (using the API in
bioperl-live) works for me.  You should be able to do something similar
with the older API as well but I haven't tried.  Note that PMC full-text
retrieval only works if the article is declared 'open-access'; not  
all journals allow that.  Also, any full-text is only available as  
XML which (I'm guessing here) is transformed to HTML for PMC.

....
my $agent = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                     -db         => $db,
                                     -term       => $query,
                                     -usehistory => 'y');

my $ct = $agent->get_count;

print "Count = $ct\n";

my $history = $agent->next_History;

if ($fetch eq 'yes') {
    my ($retmax, $retstart) = (1, 0);
    while ($retstart < $ct) {
        $agent->set_parameters(
            -eutil    => 'efetch',
            -history  => $history,
            -rettype  => 'xml',
            -retmax   => $retmax,
            -retstart => $retstart,
        );
        $agent->get_Response(-file => ">./papers/paper_$retstart.xml");
        $retstart += $retmax;
    }
}

------------------------------
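
The loop above fetches one record per request (retmax = 1), which gives one
file per paper. If several records per request are acceptable, the same
retstart/retmax arithmetic can be computed up front. The sketch below is
plain Perl with no BioPerl calls; page_offsets is a made-up helper name and
the batch size of 3 is arbitrary.

```perl
use strict;
use warnings;

# Return a list of [retstart, number_of_records] pairs that together cover
# $count records in batches of at most $retmax.
sub page_offsets {
    my ($count, $retmax) = @_;
    die "retmax must be positive\n" unless $retmax > 0;
    my @pages;
    for (my $retstart = 0; $retstart < $count; $retstart += $retmax) {
        my $left = $count - $retstart;
        push @pages, [$retstart, $left < $retmax ? $left : $retmax];
    }
    return @pages;
}

# Seven hits fetched three at a time.
for my $page (page_offsets(7, 3)) {
    my ($retstart, $n) = @$page;
    print "retstart=$retstart records=$n\n";
}
```

Each pair would then supply the -retstart and -retmax values for one efetch
request, with the output file named after retstart as in the loop above.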

It may also be possible to grab the LinkOut for these and try to nab  
the PDF or use the DOI, but I haven't tried anything like that.

chris

On Aug 20, 2007, at 2:03 PM, Bernd Mueller wrote:

> I attached my script.
>
> Actually I tried to download all articles to a certain search term  
> with
> that script. The problem was that the retrieved documents were not  
> free
> as mentioned in the documentation of EUtilities on the NCBI page. So
> many of the downloaded documents in xml-format were just dummies
> containing only the abstract but not the fulltext article.
>
> Bernd
>
> Chris Fields wrote:
>> Just curious, but what kind of query were you trying?  It might be  
>> worth trying to work through it to add as an example to the  
>> cookbook page.
>> chris


From n.haigh at sheffield.ac.uk  Tue Aug 21 08:19:59 2007
From: n.haigh at sheffield.ac.uk (Nathan Haigh)
Date: Tue, 21 Aug 2007 09:19:59 +0100
Subject: [Bioperl-l] subversion progress
Message-ID: <46CAA02F.60803@sheffield.ac.uk>

Hi,

I was just wondering if there was any further progress towards the svn
migration recently? What is still needing to be done?

Cheers
Nath


From neetisomaiya at gmail.com  Tue Aug 21 09:41:22 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Tue, 21 Aug 2007 15:11:22 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
Message-ID: <764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>

Hi,

I wanted to automate my PDB script, right from the downloading of the data.
Following the solution given by RCSB (a custom report with PDB IDs and titles
only), I tried something like the code below, but it doesn't seem to work:

use LWP::Simple;

# Build the URL in pieces so that it contains no embedded newlines.
my $url = 'http://www.pdb.org/pdb/results/tabularReport.do'
        . '?reportTitle=CustomReport'
        . '&customReportColumns=VStructureSummary.structureId~VCitation.title'
        . '&format=csv';
my $content = get($url);
die "Couldn't get $url" unless defined $content;

Can anyone tell me how I can do this, whether there is another way to do it,
where I am going wrong, or whether it isn't possible in this case at all?

Please help.

On 8/20/07, neeti somaiya <neetisomaiya at gmail.com> wrote:
>
> Hi,
>
> Thanks for all the responses.
> I got the solution from the RCSB people:
>
> Dear Dr. Somaiya,
>
> Thank you for your email message.
>
> Please try the following:
> 1) Go to http://www.pdb.org/pdb/statistics/holdings.do and select the
> number in the bottom right corner of the table (currently 45213).
> 2) From the menu on the left select 'Tabulate'>>'Custom Report' and
> under 'Primary Citation' select 'Title'
> 3) At the bottom, select 'Create Report' and then one of the 'Download'
> options.
>
> Please let us know if we can be of additional assistance.
>
> Sincerely,
> Rachel Green
>
> On 8/20/07, Bernd Mueller <bernd at kirx.de> wrote:
> >
> > Hello,
> >
> > Maybe you wanna try the Database-EUtilities module from bioperl. They
> > are described on http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
> >
> > I tried them for a similar search on pubmed but without any reasonable
> > results because my target was too focused.
> >
> > From EUtilities documentation on
> >
> > http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpentrez.section.EntrezHelp.The_Databases
> >
> > "Protein Database
> >
> > The Protein database contains sequence data from the translated coding
> > regions from DNA sequences in GenBank, EMBL, and DDBJ as well as protein
> > sequences submitted to Protein Information Resource (PIR), SWISS-PROT,
> > Protein Research Foundation (PRF), and Protein Data Bank (PDB)
> > (sequences from solved structures). "
> >
> > So PDB is included in eutilities from NCBI.
> >
> > Regards,
> > Bernd
> >
> > neeti somaiya wrote:
> > > Thanks for your response.
> > > Actually I am looking for something standalone and not on the web, as in
> > > something which I can download onto my machine and parse later to get id
> > > and title.
> > >
> > > On 8/20/07, Oliver Wafzig <Oliver.Wafzig at sygnis.de> wrote:
> > >> On Monday 20 August 2007 06:33, neeti somaiya wrote:
> > > >>> Another question I had was, I am interested only in pdb id and title,
> > > >>> and for this I am downloading and unzipping each of the full pdb
> > > >>> structure files, parsing to get just id and title. Is there any other
> > > >>> data source
> > > >> Hi Neeti,
> > > >> this is a non bioperl way to download the data.
> > > >> Use the SRS server on the EBI page to download only id and title lines
> > > >> from pdb.
> > > >>
> > > >> 1) Point your browser to the SRS page (http://srs.ebi.ac.uk ).
> > > >> 2) Search for 'PDB' on the 'library page' and select it.
> > > >> 3) Use the standard query form. Select 'id' in the dropdown list and
> > > >> insert '*' (wildcard).
> > > >> 4) Create a view by selecting 'ID' and 'Title', then click the search
> > > >> button.
> > > >> 5) Click the save results button.
> > > >> 6) Select 'file' in the 'output to' area and 'ALL' in the 'Number of
> > > >> entries to download' field. Press 'save'.
> > > >>
> > > >> If the download is slow, read the 'download tips' on the download page
> > > >> and split the results in chunks.
> > >>
> > >> --
> > >> Oliver
> > >> _______________________________________________
> > >> Bioperl-l mailing list
> > >> Bioperl-l at lists.open-bio.org
> > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> > >>
> > >
> > >
> > >
> >
> > --
> > Dipl.-Inform.(FH)
> > Bernd Mueller
> > phone: +49 179 2336692
> > email: bernd at kirx.de
> >
> >
>
>
>
> --
> -Neeti
> Even my blood says, B positive
>


-- 
-Neeti
Even my blood says, B positive


From cjfields at uiuc.edu  Tue Aug 21 14:40:03 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 21 Aug 2007 09:40:03 -0500
Subject: [Bioperl-l] subversion progress
In-Reply-To: <46CAA02F.60803@sheffield.ac.uk>
References: <46CAA02F.60803@sheffield.ac.uk>
Message-ID: <5C65BAED-61CF-4028-977E-0CD451FA2EC3@uiuc.edu>

Not sure myself, to tell the truth.  Pretty much everything was ready  
to go (i.e. svn commits work, commits post to bioperl-guts, etc.);  
the only possible exception was svn->cvs syncing.  I believe the  
decision for svn access is to stick with ssh only for now for  
simplicity's sake.  I may have to go back into the archives to  
refresh my memory on all the details...

I think a time for the switchover just has to be set so that  
everybody is adequately forewarned, and the docs for getting started  
on SVN need to be updated accordingly.

chris

On Aug 21, 2007, at 3:19 AM, Nathan Haigh wrote:

> Hi,
>
> I was just wondering if there was any further progress towards the svn
> migration recently? What still needs to be done?
>
> Cheers
> Nath

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jwalker at watson.wustl.edu  Tue Aug 21 15:20:46 2007
From: jwalker at watson.wustl.edu (Jason Walker)
Date: Tue, 21 Aug 2007 10:20:46 -0500
Subject: [Bioperl-l] RemoteBlast not handling NCBI Error message
Message-ID: <46CB02CE.1080803@watson.wustl.edu>

I've noticed RemoteBlast does not handle a specific error message from 
NCBI correctly.  retrieve_blast() should return 0 if waiting, -1 on 
error, or the results when completed.  It looks like the method relies 
on a specific tag in the NCBI return,  'QBlastInfoBegin'.  The error 
message I'm getting does not have this tag or a value of 
'Status=ERROR'.  After contacting NCBI 'Blast-help', they stated that 
QBlastInfoBegin should not be expected from all GET requests.  The error 
can be reproduced by using RID CM2YJJW501R, until it expires tomorrow.

use Bio::Tools::Run::RemoteBlast;

my $rid = 'CM2YJJW501R';
my $factory = Bio::Tools::Run::RemoteBlast->new( -verbose => 1 );
my $rc = $factory->retrieve_blast($rid);
print $rc ."\n";

The content returned from NCBI looks like:
<hr><font color="red">ERROR: An error has occurred on the server, Too 
many HSPs to save all
 Contact Blast-help at ncbi.nlm.nih.gov and include your RID: 
CM2YJJW501R</font><hr>

I added a conditional statement as seen below to correct my local copy.  
I'm not sure this is the best fix, but it works.
sub retrieve_blast {
    ...
    if( /QBlastInfoBegin/i ) {
        $s = 1;
    } elsif( $s ) {
        if( /Status=(WAITING|ERROR|READY)/i ) {
            ...
         }
    } elsif( /^(?:#\s)?[\w-]*?BLAST\w+/ ) {
        $waiting = 0;
        last;
    } elsif ( /ERROR/i ) {
        close($TMP);
        open(my $ERR, "<$tempfile") or $self->throw("cannot open file $tempfile");
        $self->warn(join("", <$ERR>));
        close $ERR;
        return -1;
    }
    ...
}
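
For callers, here is a minimal sketch of a polling loop built on that return
convention (0 while waiting, -1 on error, a result object when done). The
helper name poll_blast and its parameters are made up for illustration and are
not part of RemoteBlast; it takes a code ref so it can be tried without a
network connection:

```perl
use strict;
use warnings;

# $fetch is any code ref that follows the retrieve_blast() convention:
# 0 = still waiting, -1 = error, a reference (result object) = finished.
# poll_blast, $max_tries and $delay are hypothetical names for this sketch.
sub poll_blast {
    my ($fetch, $max_tries, $delay) = @_;
    for my $try (1 .. $max_tries) {
        my $rc = $fetch->();
        return $rc if ref $rc;                        # finished: hand back the result
        die "BLAST server reported an error\n" if $rc < 0;
        sleep $delay if $delay && $try < $max_tries;  # wait before asking again
    }
    die "gave up after $max_tries attempts\n";
}
```

Called as poll_blast(sub { $factory->retrieve_blast($rid) }, 60, 5), this would
ask up to 60 times, five seconds apart, and die as soon as retrieve_blast()
returns -1.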

Thanks,
Jason Walker


From cjfields at uiuc.edu  Tue Aug 21 16:15:36 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 21 Aug 2007 11:15:36 -0500
Subject: [Bioperl-l] RemoteBlast not handling NCBI Error message
In-Reply-To: <46CB02CE.1080803@watson.wustl.edu>
References: <46CB02CE.1080803@watson.wustl.edu>
Message-ID: <348D8645-5DC2-4606-9650-EB08D8053F3D@uiuc.edu>


On Aug 21, 2007, at 10:20 AM, Jason Walker wrote:

> I've noticed RemoteBlast does not handle a specific error message from
> NCBI correctly.  retrieve_blast() should return 0 if waiting, -1 on
> error, or the results when completed.  It looks like the method relies
> on a specific tag in the NCBI return,  'QBlastInfoBegin'.  The error
> message I'm getting does not have this tag or a value of
> 'Status=ERROR'.  After contacting NCBI 'Blast-help', they stated that
> QBlastInfoBegin should not be expected from all GET requests.  The error
> can be reproduced by using RID CM2YJJW501R, until it expires tomorrow.
> ...
> I added a conditional statement as seen below to correct my local copy.
> I'm not sure this is the best fix, but it works.
> sub retrieve_blast {
>     ...
>     if( /QBlastInfoBegin/i ) {
>         $s = 1;
>     } elsif( $s ) {
>         if( /Status=(WAITING|ERROR|READY)/i ) {
>             ...
>          }
>     } elsif( /^(?:#\s)?[\w-]*?BLAST\w+/ ) {
>         $waiting = 0;
>         last;
>     } elsif ( /ERROR/i ) {
>         close($TMP);
>         open(my $ERR, "<$tempfile") or $self->throw("cannot open file $tempfile");
>         $self->warn(join("", <$ERR>));
>         close $ERR;
>         return -1;
>     }
>     ...
> }
>
> Thanks,
> Jason Walker

I have added this to RemoteBlast in bioperl cvs.  Thanks for the notice!

chris


From bernd.web at gmail.com  Tue Aug 21 16:32:09 2007
From: bernd.web at gmail.com (Bernd Web)
Date: Tue, 21 Aug 2007 18:32:09 +0200
Subject: [Bioperl-l] SearchIO-BLAST
Message-ID: <716af09c0708210932m34bfb2a7o2094124a8832d705@mail.gmail.com>

Dear all,

Recently, I stumbled on something with parsing BLAST reports.  To a
plain text blast report from NCBI a ">aaa" got prepended. This
(fasta-like header) changes the $result->hits array.
The number of hits is now 2*num_hits + 1. Clearly, this is related to
faulty input, but still the effect of this line is great. Does someone
see what is causing this, and should the BLAST parser maybe be
slightly more relaxed wrt pre/appended text? I have not seen yet why
this extra fastaheader line has such a "large" effect.

A short example BLASTN output is attached.
Example code is:

use Bio::SearchIO;
my $in = Bio::SearchIO->new(-format => 'blast',
                            -file   => 'apoe_plain.bls');
while( my $result = $in->next_result ) {
  print "Num of hits: ", $result->num_hits, "\n";
  my @hits = $result->hits;
  foreach my $el (@hits) {
    print $el->name, "\n";
  }
}


Kind regards,
Bernd
-------------- next part --------------
A non-text attachment was scrubbed...
Name: apoe_plain.bls
Type: application/octet-stream
Size: 7890 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070821/a367809e/attachment-0004.obj>

From cjfields at uiuc.edu  Tue Aug 21 21:53:44 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 21 Aug 2007 16:53:44 -0500
Subject: [Bioperl-l] SearchIO-BLAST
In-Reply-To: <716af09c0708210932m34bfb2a7o2094124a8832d705@mail.gmail.com>
References: <716af09c0708210932m34bfb2a7o2094124a8832d705@mail.gmail.com>
Message-ID: <59FF775C-8CAC-4947-A5BA-835ADD45CD32@uiuc.edu>

I can confirm this (I'm using bioperl-live).  The output I get is:

Num of hits: 9
ref|NM_000039.1|
ref|NT_113960.1|Hs22_111679
ref|NT_033899.7|Hs11_34054
ref|NW_925173.1|HsCraAADB02_444
ref|NM_000039.1|
ref|NT_113960.1|Hs22_111679
ref|NT_033899.7|Hs11_34054
ref|NW_925173.1|HsCraAADB02_444
ref|NW_925173.1|HsCraAADB02_444

The extra '>' is definitely throwing the event calls for a loop; the  
2x increase is b/c an extra iteration is started when '>' is  
encountered (changing the event handler reduces the number to 5).   
The extra hit is from the '>' at the beginning.

I hate to say it, but this is an instance where we can't be more  
flexible, primarily b/c '>' is a legit token the parser looks for (it  
is the beginning of the hit block in reports).  Finding it as the  
initial token in the report is also legitimate for some older BLAST  
output, so we also can't simply bypass it.  You'll unfortunately have  
to preparse the reports to get rid of those lines prior to feeding  
them to the BLAST text report parser.
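
A minimal sketch of such a preparsing step, assuming the report contains a
program banner line such as "BLASTN 2.2.x" and that any '>' line before that
banner is junk. The name strip_prepended_headers is made up for illustration;
it is not part of BioPerl:

```perl
use strict;
use warnings;

# Drop '>' lines that occur before the program banner (e.g. "BLASTN 2.2.16").
# Assumption: the banner starts a line and looks like BLASTN/BLASTP/BLASTX,
# optionally with a leading T. If no banner is found, the text is returned
# unchanged so that nothing is removed on a guess.
sub strip_prepended_headers {
    my ($text) = @_;
    my @lines = split /^/m, $text;    # split into lines, keeping line endings
    my $banner_idx;
    for my $i (0 .. $#lines) {
        if ($lines[$i] =~ /^T?BLAST[NPX]\b/) {
            $banner_idx = $i;
            last;
        }
    }
    return $text unless defined $banner_idx;
    # Keep everything before the banner except lines starting with '>'.
    my @head = grep { !/^>/ } @lines[0 .. $banner_idx - 1];
    return join '', @head, @lines[$banner_idx .. $#lines];
}
```

The cleaned text can then be written to a new file, or read through an
in-memory filehandle passed to Bio::SearchIO with -fh. Reports without a
recognizable banner come back unchanged.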

chris

On Aug 21, 2007, at 11:32 AM, Bernd Web wrote:

> Dear all,
>
> Recently, I stumbled on something with parsing BLAST reports.  To a
> plain text blast report from NCBI a ">aaa" got prepended. This
> (fasta-like header) changes the $result->hits array.
> The number of hits is now 2*num_hits + 1. Clearly, this is related to
> faulty input, but still the effect of this line is great. Does someone
> see what is causing this, and should the BLAST parser maybe be
> slightly more relaxed wrt pre/appended text? I have not seen yet why
> this extra fastaheader line has such a "large" effect.
>
> A short example BLASTN output is attached.
> Example code is:
>
> use Bio::SearchIO;
> my $in = Bio::SearchIO->new(-format => 'blast',
>                             -file   => 'apoe_plain.bls');
> while( my $result = $in->next_result ) {
>   print "Num of hits: ", $result->num_hits, "\n";
>   my @hits = $result->hits;
>   foreach my $el (@hits) {
>     print $el->name, "\n";
>   }
> }
>
>
> Kind regards,
> Bernd
> <apoe_plain.bls>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Wed Aug 22 03:03:55 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 21 Aug 2007 23:03:55 -0400
Subject: [Bioperl-l] subversion progress
In-Reply-To: <5C65BAED-61CF-4028-977E-0CD451FA2EC3@uiuc.edu>
References: <46CAA02F.60803@sheffield.ac.uk>
	<5C65BAED-61CF-4028-977E-0CD451FA2EC3@uiuc.edu>
Message-ID: <51A5996D-A976-47FD-8807-20F6EBAF9E42@gmx.net>


On Aug 21, 2007, at 10:40 AM, Chris Fields wrote:

> I think a time for the switchover just has to be set so that
> everybody is adequately forewarned, and the docs for getting started
> on SVN need to be updated accordingly.

That was my recollection too. -hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From bix at sendu.me.uk  Wed Aug 22 07:51:42 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 22 Aug 2007 08:51:42 +0100
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>	<46C5A405.2070005@sendu.me.uk>	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>	<200708201042.55292.Oliver.Wafzig@sygnis.de>	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>	<46C9C7F8.3020608@kirx.de>	<764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
	<764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>
Message-ID: <46CBEB0E.8030200@sendu.me.uk>

neeti somaiya wrote:
> Hi,
> 
> I wanted to automate my pdb script, right from downloading of data. As per
> the solution given by RCSB about custom report for pdb ids and titles only,
> I was trying something like the code below, but it doesn't seem to work :-
> 
> my $url = 'http://www.pdb.org/pdb/results/tabularReport.do?reportTitle=CustomReport&customReportColumns=VStructureSummary.structureId~VCitation.title&format=csv';
> use LWP::Simple;
> my $content = get $url;
> die "Couldn't get $url" unless defined $content;
> 
> Can anyone tell me how I can do it, if there is any other way to do it, or if I
> am going wrong somewhere, or if it isn't possible for this case at all.

Use LWP::UserAgent so you can see what's going on.

use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->timeout(10);
my $response = $ua->get($url);
if ($response->is_success) {
   print $response->content;
}
else {
   die $response->status_line;
}


Gives:
500 Internal Server Error

Most likely the server is expecting some kind of cookie and falls over 
when you try to visit that url without it. So start where they told you 
to and grab pages successively, keeping any cookies.


From neetisomaiya at gmail.com  Wed Aug 22 10:06:38 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Wed, 22 Aug 2007 15:36:38 +0530
Subject: [Bioperl-l] PDB Parser
In-Reply-To: <46CBEB0E.8030200@sendu.me.uk>
References: <764978cf0708152256w38b2bfa7ic4d196ae08a46bd0@mail.gmail.com>
	<46C5A405.2070005@sendu.me.uk>
	<764978cf0708192133i3d4ef031ic34d11f1f91e59b7@mail.gmail.com>
	<200708201042.55292.Oliver.Wafzig@sygnis.de>
	<764978cf0708200605j2dd63973p9cd88787b3acdbc8@mail.gmail.com>
	<46C9C7F8.3020608@kirx.de>
	<764978cf0708201039g53b29f29i36eed1a7acd5a892@mail.gmail.com>
	<764978cf0708210241h4c4b802en8ec2f6e9b0c01a74@mail.gmail.com>
	<46CBEB0E.8030200@sendu.me.uk>
Message-ID: <764978cf0708220306u77cedf22xdd132b324e306f33@mail.gmail.com>

Thanks a lot. It worked for me.

use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

# Share one cookie jar across all requests; the report URL appears to need
# the cookies set by the earlier pages.
my $ua = LWP::UserAgent->new;
$ua->cookie_jar(HTTP::Cookies->new(file => "lwpcookies.txt",
                                   autosave => 1));

my $request = HTTP::Request->new('GET', 'http://www.pdb.org/pdb/search/smartSubquery.do?smartSearchSubtype=HoldingsQuery&moleculeType=ignore&experimentalMethod=ignore');

my $response = $ua->request($request);

if ($response->is_success)
{
        print "\nSuccessfully connected to url ", $request->uri, "\n";

        $request = HTTP::Request->new('GET', 'http://www.pdb.org/pdb/results/tabularForm.do');

        $response = $ua->request($request);

        if ($response->is_success)
        {
                print "\nSuccessfully connected to url ", $request->uri, "\n";

                $request = HTTP::Request->new('GET', 'http://www.pdb.org/pdb/results/tabularReport.do?reportTitle=CustomReport&customReportColumns=VStructureSummary.structureId~VCitation.title&format=csv');

                $response = $ua->request($request);

                if ($response->is_success)
                {
                        print "\nSuccessfully connected to url ", $request->uri, "\n";
                        # Save the CSV report (PDB id and citation title per row).
                        open(my $fh, '>', 'tabularResults.csv')
                            or die "Cannot write tabularResults.csv: $!";
                        print $fh $response->content;
                        close($fh);
                }
                else
                {
                        die $response->status_line;
                }
        }
        else
        {
                die $response->status_line;
        }
}
else
{
  die $response->status_line;
}
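
The same sequence of requests can also be written as a loop instead of nested
ifs. This is only a sketch: fetch_in_order is a made-up helper name, and $ua is
assumed to behave like LWP::UserAgent (get() returning an object that has
is_success() and status_line()):

```perl
use strict;
use warnings;

# Request each URL in turn with the same user agent, so that cookies set by
# earlier pages are sent along with later requests. Dies on the first
# failure and returns the last response otherwise.
# fetch_in_order is a hypothetical helper, not part of LWP.
sub fetch_in_order {
    my ($ua, @urls) = @_;
    my $response;
    for my $url (@urls) {
        $response = $ua->get($url);
        die "GET $url failed: " . $response->status_line . "\n"
            unless $response->is_success;
    }
    return $response;
}
```

With my $ua = LWP::UserAgent->new(cookie_jar => {}) and the three URLs above in
order, fetch_in_order($ua, @urls)->content should hold the CSV report.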


On 8/22/07, Sendu Bala <bix at sendu.me.uk> wrote:
>
> neeti somaiya wrote:
> > Hi,
> >
> > I wanted to automate my pdb script, right from downloading of data. As per
> > the solution given by RCSB about custom report for pdb ids and titles only,
> > I was trying something like the code below, but it doesn't seem to work :-
> >
> > my $url = 'http://www.pdb.org/pdb/results/tabularReport.do?reportTitle=CustomReport&customReportColumns=VStructureSummary.structureId~VCitation.title&format=csv';
> > use LWP::Simple;
> > my $content = get $url;
> > die "Couldn't get $url" unless defined $content;
> >
> > Can anyone tell me how I can do it, if there is any other way to do it, or if I
> > am going wrong somewhere, or if it isn't possible for this case at all.
>
> Use LWP::UserAgent so you can see what's going on.
>
> my $ua = LWP::UserAgent->new;
> $ua->timeout(10);
> my $response = $ua->get($url);
> if ($response->is_success) {
>    print $response->content;
> }
> else {
>    die $response->status_line;
> }
>
>
> Gives:
> 500 Internal Server Error
>
> Most likely the server is expecting some kind of cookie and falls over
> when you try to visit that url without it. So start where they told you
> to and grab pages successively, keeping any cookies.
>


-- 
-Neeti
Even my blood says, B positive


From jay at jays.net  Wed Aug 22 12:54:29 2007
From: jay at jays.net (Jay Hannah)
Date: Wed, 22 Aug 2007 07:54:29 -0500
Subject: [Bioperl-l] wiki: Current Events
Message-ID: <24715480-EC15-493F-85C9-C367348E28F1@jays.net>

http://www.bioperl.org/wiki/Main_Page

Please change:

< BOSC 2007 will be held July 19-20, 2007
 > BOSC 2007 was held July 19-20, 2007

I'd change it but the page is locked. Even when I'm logged in.   :)

Thanks,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From cjfields at uiuc.edu  Wed Aug 22 13:58:32 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 22 Aug 2007 08:58:32 -0500
Subject: [Bioperl-l] wiki: Current Events
In-Reply-To: <24715480-EC15-493F-85C9-C367348E28F1@jays.net>
References: <24715480-EC15-493F-85C9-C367348E28F1@jays.net>
Message-ID: <A7C5314E-662C-4160-85B1-0225B95C0BD2@uiuc.edu>

Done.

chris

On Aug 22, 2007, at 7:54 AM, Jay Hannah wrote:

> http://www.bioperl.org/wiki/Main_Page
>
> Please change:
>
> < BOSC 2007 will be held July 19-20, 2007
>> BOSC 2007 was held July 19-20, 2007
>
> I'd change it but the page is locked. Even when I'm logged in.   :)
>
> Thanks,
>
> Jay Hannah
> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From shameer at ncbs.res.in  Wed Aug 22 19:45:42 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Thu, 23 Aug 2007 01:15:42 +0530 (IST)
Subject: [Bioperl-l] How to 'force' Bio::Graphics to draw image according to
 input file ?
In-Reply-To: <A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
	<A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
Message-ID: <44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>

Dear All,

Is there any option in Bio::Graphics to draw the image based on the hits as
listed in the hits file?

For example I am using an input file:
# hit   score   start   end
Query   0       1       101
Sequence_Segment_1      0       1       101
PD:LRR_1|CS:AAC34139        0.16        1        23
PD:LRR_1|CS:AAC34139        3.6        1        22
PD:LRR_1|CS:AAC34139        1.8        1        22
PD:LRR_1|CS:AAC34139        1.3        1        22
PD:LRR_1|CS:XP_640228        2.5        2        23
..... Cropped
PD:LRR_1|CS:NP_611007        55        3        23
PD:LRR_1|CS:NP_611007        3.7        3        24
PD:LRR_1|CS:NP_611007        4.5        3        24
PD:LRR_1|CS:NP_611007        0.71        3        24
If you look at the image, you can see that it is all jumbled up and doesn't
make sense at first glance. I am looking for an option to draw each glyph on
its own line (say, separated by \n), rather than letting Bio::Graphics pack
them together internally.

PS. Image is attached with this mail.
I am using  Dr. L. Stein's example :

use strict;
use Bio::Graphics;
use Bio::SeqFeature::Generic;
my $panel = Bio::Graphics::Panel->new(-length => 700,
                                      -width  => 800,
                                      -pad_left => 10,
                                      -pad_right => 10,
                                     );

my $full_length = Bio::SeqFeature::Generic->new(-start=>1,-end=>700);
$panel->add_track($full_length,
                  -glyph   => 'arrow',
                  -tick    => 2,
                  -fgcolor => 'black',
                  -double  => 1,
                 );

my $track = $panel->add_track(
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.png
Type: image/png
Size: 27974 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070823/be285f43/attachment-0004.png>

From cjfields at uiuc.edu  Thu Aug 23 04:53:55 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 22 Aug 2007 23:53:55 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
Message-ID: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>

As many of the devs know, there are a number of Feature/Annotation  
issues that need to be resolved prior to a 1.6 release:

http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.2FAnnotation_changes:_Keep_or_roll_back.3F

There has been little work done over the last 2 1/2 years to undo or  
rectify problems associated with those additions; I feel like those  
of us still routinely contributing have been left holding the bag.   
There has also been very little attempt to document any of this  
adequately enough; as an example see POD for  
Bio::SeqFeature::Annotated (what little there is).

I would like to suggest the radical idea of rolling back AnnotatableI/ 
SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags  
are simple scalars) and possibly work in implementing Ewan's  
SeqFeature::TypedSeqFeatureI for those who want strong data types  
(i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various  
AnnotatableI changes, odd inheritance, and operator overloading have  
really obfuscated the code to the point where no one wants to touch  
it in case it breaks something important.  However, I believe it is  
the one serious impediment to a new stable release.

My thought is we simplify all the relevant interfaces, essentially  
reverting back to rel 1.4.  For instance, we move the various  
Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.   
Bio::SeqFeature::Annotated would implement Bio::AnnotatableI  
directly, and (if needed) also implement  
Bio::SeqFeature::TypedSeqFeatureI, so the onus is on  
Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI  
methods correctly, just as any other class would when implementing an  
abstract interface.  I have played around with this a bit and managed  
to get most tests working again for Bio::SeqFeature::Generic and  
FeatureIO but a number of others break.

If needed I can try this out on a branch (a bit ironic, since the  
changes instigating this mess should have been tested on a branch!).   
Maybe this will get the ball rolling towards a 1.6 release.  Any  
thoughts?

chris


From shameer at ncbs.res.in  Thu Aug 23 07:06:34 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Thu, 23 Aug 2007 12:36:34 +0530 (IST)
Subject: [Bioperl-l] How to 'force' Bio::Graphics to draw image
 according to input file ?
In-Reply-To: <44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
	<A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
	<44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>
Message-ID: <34980.192.168.1.1.1187852794.squirrel@mail.ncbs.res.in>

Dear All,

I will make my question simple :
Is there any way to force the 'Bio::Graphics' module to print only one
glyph in a track ?

PS. A more detailed explanation is in my earlier mail (I don't want to spam the
community with the same mail again)

Eagerly waiting for a reply.
Thanks,
-- 
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25) The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in


From cain.cshl at gmail.com  Thu Aug 23 08:54:40 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 23 Aug 2007 04:54:40 -0400
Subject: [Bioperl-l] How to 'force' Bio::Graphics to draw
	image	according to input file ?
In-Reply-To: <34980.192.168.1.1.1187852794.squirrel@mail.ncbs.res.in>
References: <10259461.post@talk.nabble.com>
	<a79f6a4b0704301722s6b20c216if262ea9747f7d03f@mail.gmail.com>
	<41667.192.168.1.1.1178019391.squirrel@mail.ncbs.res.in>
	<1178028249.2644.13.camel@localhost.localdomain>
	<42391.192.168.1.1.1178035451.squirrel@mail.ncbs.res.in>
	<6dce9a0b0705030901w203344b4te03ad271a5482faf@mail.gmail.com>
	<51133.192.168.1.1.1187003265.squirrel@mail.ncbs.res.in>
	<46C05896.1010002@sendu.me.uk>
	<59564.192.168.1.1.1187016455.squirrel@mail.ncbs.res.in>
	<46C07257.1000308@sendu.me.uk>
	<A74F50A3-FA32-45E7-BC5A-5EBC1F5C8E7F@uiuc.edu>
	<44632.192.168.1.1.1187811942.squirrel@mail.ncbs.res.in>
	<34980.192.168.1.1.1187852794.squirrel@mail.ncbs.res.in>
Message-ID: <1187859296.2546.6.camel@103.48.216.10.in-addr.arpa>

Shameer,

I don't think that's really what you want.  It seems to me that sorting
them in some useful way (say, by score) would make more sense.  There is
an example using the -sort_order option in Lincoln's howto.

Scott


On Thu, 2007-08-23 at 12:36 +0530, Shameer Khadar wrote:
> Dear All,
> 
> I will make my question simple :
> Is there any way to force the 'Bio::Graphics' module to print only one
> glyph in a track ?
> 
> PS. A more detailed explanation is in my earlier mail (I don't want to spam the
> community with the same mail again)
> 
> Eagerly waiting for a reply.
> Thanks,
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20070823/6066f0ec/attachment.sig>

From cjfields at uiuc.edu  Thu Aug 23 14:14:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 23 Aug 2007 09:14:51 -0500
Subject: [Bioperl-l] extra rel. 1.6 suggestion
Message-ID: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>

Some interesting points by Sendu:

http://www.bioperl.org/wiki/Release_Schedule#Need_tests

which I agree with completely.

Maybe the best way out of this is a variation on something that was  
suggested before, which was 'splitting' the code into groups.  What  
if we set up a way to automatically gauge test coverage,  
documentation, etc.?  If I remember correctly Nathan had something  
running at one point which did this.

If so, we could determine which code is potentially 'non-compliant'  
and needs to be fixed (tests added, docs brought up to spec, so on),  
and thus prioritize at the minimum what needs to be done for a 1.6  
release.  If it's deemed not worth worrying about (no active  
development, author is out of contact, we have more important  
priorities) we split that code off into a separate 'dev' package.   
That would save some of the headache of trying to split maintenance  
of ~1000 modules up on only a few devs.

Thoughts?

chris


From bix at sendu.me.uk  Thu Aug 23 14:57:21 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Thu, 23 Aug 2007 15:57:21 +0100
Subject: [Bioperl-l] extra rel. 1.6 suggestion
In-Reply-To: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>
References: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>
Message-ID: <46CDA051.40408@sendu.me.uk>

Chris Fields wrote:
> Maybe the best way out of this is a variation on something that was  
> suggested before, which was 'splitting' the code into groups.  What  
> if we set up a way to automatically gauge test coverage,  
> documentation, etc.?  If I remember correctly Nathan had something  
> running at one point which did this.

You can generate this yourself by doing
./Build testcover

Mauricio was going to sort out having this run daily with the results 
displayed on the website... Mauricio?

The major 'annoyance' is that the coverage results don't get generated 
if any test fails. But they shouldn't be failing anyway ;)


From cain.cshl at gmail.com  Thu Aug 23 19:53:37 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 23 Aug 2007 15:53:37 -0400
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
Message-ID: <1187898817.2562.19.camel@localhost.localdomain>

Hi Chris,

GBrowse would be unaffected by this as it doesn't use
Bio::SeqFeature::Annotated.  The GMOD GFF3 Chado loader on the other
hand will almost certainly break horribly, as it depends on the strong
typing of Bio::FeatureIO/Bio::SeqFeature::Annotated.  If you could try
your ideas out in a branch that I could checkout and test on, that would
be good.

Thanks,
Scott


On Wed, 2007-08-22 at 23:53 -0500, Chris Fields wrote:
> As many of the devs know, there are a number of Feature/Annotation  
> issues that need to be resolved prior to a 1.6 release:
> 
> http://www.bioperl.org/wiki/Release_Schedule#SeqFeature.2FAnnotation_changes:_Keep_or_roll_back.3F
> 
> There has been little work done over the last 2 1/2 years to undo or  
> rectify problems associated with those additions; I feel like those  
> of us still routinely contributing have been left holding the bag.   
> There has also been very little attempt to document any of this  
> adequately enough; as an example see POD for  
> Bio::SeqFeature::Annotated (what little there is).
> 
> I would like to suggest the radical idea of rolling back AnnotatableI/ 
> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags  
> are simple scalars) and possibly work in implementing Ewan's  
> SeqFeature::TypedSeqFeatureI for those who want strong data types  
> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).  The various  
> AnnotatableI changes, odd inheritance, and operator overloading have  
> really obfuscated the code to the point where no one wants to touch  
> it in case it breaks something important.  However, I believe it is  
> the one serious impediment to a new stable release.
> 
> My thought is we simplify all the relevant interfaces, essentially  
> reverting back to rel 1.4.  For instance, we move the various  
> Bio::AnnotatableI tag methods back into Bio::SeqFeatureI.   
> Bio::SeqFeature::Annotated would implement Bio::AnnotatableI  
> directly, and (if needed) also implement  
> Bio::SeqFeature::TypedSeqFeatureI, so the impetus is on  
> Bio::SeqFeature::Annotated to overload the relevant SeqFeatureI  
> methods correctly, just as any other class would when implementing an  
> abstract interface.  I have played around with this a bit and managed  
> to get most tests working again for Bio::SeqFeature::Generic and  
> FeatureIO but a number of others break.
> 
> If needed I can try this out on a branch (a bit ironic, since the  
> changes instigating this mess should have been tested on a branch!).   
> Maybe this will get the ball rolling towards a 1.6 release.  Any  
> thoughts?
> 
> chris
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From N.Haigh at sheffield.ac.uk  Thu Aug 23 20:32:12 2007
From: N.Haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Thu, 23 Aug 2007 21:32:12 +0100
Subject: [Bioperl-l] extra rel. 1.6 suggestion
In-Reply-To: <46CDA051.40408@sendu.me.uk>
References: <3A2C3BFD-2FA1-402B-9597-6E51A72E7096@uiuc.edu>
	<46CDA051.40408@sendu.me.uk>
Message-ID: <1187901132.46cdeeccce68d@webmail.shef.ac.uk>

Quoting Sendu Bala <bix at sendu.me.uk>:

> Chris Fields wrote:
> > Maybe the best way out of this is a variation on something that was  
> > suggested before, which was 'splitting' the code into groups.  What  
> > if we set up a way to automatically gauge test coverage,  
> > documentation, etc.?  If I remember correctly Nathan had something  
> > running at one point which did this.
> 
> You can generate this yourself by doing
> ./Build testcover

What I did was patch Devel::Cover to include JavaScript that allows
sorting the results by clicking a header in the table. This way, it was
easier to find the modules with poor POD coverage, or poor coverage on
any other metric. The developer(s) of Devel::Cover are introducing this
into their next release, but who knows when that release will be. I
could provide a diff, but we may be able to check out Devel::Cover from
cvs/svn until 0.62 is released.
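
The sorting idea itself is simple. A minimal sketch follows, in Python rather than the JavaScript of the actual patch; the row layout, the metric names and the numbers are assumptions for illustration, not Devel::Cover's real report format.

```python
# Sketch: order coverage rows by one metric so the worst modules come first.
# Rows and values are invented placeholders.

def sort_by_metric(rows, metric, worst_first=True):
    """Return a new list of rows ordered by the given metric column."""
    return sorted(rows, key=lambda row: row[metric], reverse=not worst_first)


rows = [
    {"module": "Bio::Seq", "stmt": 92.0, "pod": 100.0},
    {"module": "Bio::SeqFeature::Annotated", "stmt": 40.0, "pod": 10.0},
    {"module": "Bio::SeqFeature::Generic", "stmt": 85.5, "pod": 90.0},
]

for row in sort_by_metric(rows, "pod"):
    print(f'{row["module"]:30s} pod={row["pod"]:5.1f}')
```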



From cjfields at uiuc.edu  Thu Aug 23 21:33:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 23 Aug 2007 16:33:25 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <1187898817.2562.19.camel@localhost.localdomain>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
	<1187898817.2562.19.camel@localhost.localdomain>
Message-ID: <38B989E4-34CA-42CD-A608-9D2A095E7ADF@uiuc.edu>

Scott,

So far most of FeatureIO.t passes, with only a few exceptions dealing  
with the from_feature method (I know what the problem is there).  A  
large number of other tests crash horribly (not so surprising), so  
I'll have to go through those.  Ergo any changes and testing will  
definitely be conducted on a branch then merged back to main trunk  
once everything is okay.  I'll probably start a branch in the next  
few days or so.

Here's what I have been working on so far, which I think is reasonable:

1) Move all *_tag_* related methods out of Bio::AnnotatableI and into
Bio::SeqFeature::Annotated.

2) Reinstate the same tag methods in Bio::SeqFeatureI and remove
Bio::AnnotatableI from its inheritance tree.

3) Make Bio::SeqFeature::Annotated implement Bio::AnnotatableI directly
(which it already did, strangely enough).  Now it simply implements the
proper methods from the interface classes SeqFeatureI and AnnotatableI.

4) Revert Bio::SeqFeature::Generic tags back to simple untyped  
strings (reimplement the 1.4 rel methods).

I'm interested in seeing whether this results in a significant  
performance increase in SeqIO since the Annotation instantiation is  
removed.
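
To make the difference concrete, here is a language-neutral sketch in Python, not BioPerl code. The method names are modelled on the SeqFeatureI tag methods, but the classes SimpleFeature, TypedFeature and TagValue are hypothetical stand-ins invented for this example. The point is only that the 'simple' variant stores plain strings, while the 'typed' variant instantiates one object per tag value.

```python
# Sketch: plain-string tags versus one wrapper object per tag value.
# Class names are hypothetical; this is not how BioPerl implements either.

class TagValue:
    """Stand-in for a typed annotation object wrapping a single value."""

    def __init__(self, value):
        self.value = value

    def __str__(self):
        return str(self.value)


class SimpleFeature:
    """Feature whose tags are simple untyped strings (rel. 1.4 style)."""

    def __init__(self):
        self._tags = {}

    def add_tag_value(self, tag, value):
        self._tags.setdefault(tag, []).append(str(value))

    def has_tag(self, tag):
        return tag in self._tags

    def get_tag_values(self, tag):
        return list(self._tags.get(tag, []))

    def get_all_tags(self):
        return sorted(self._tags)

    def remove_tag(self, tag):
        # Returns the removed values, or an empty list if the tag was absent.
        return self._tags.pop(tag, [])


class TypedFeature(SimpleFeature):
    """Same interface, but every value is wrapped in its own object."""

    def add_tag_value(self, tag, value):
        self._tags.setdefault(tag, []).append(TagValue(value))


simple = SimpleFeature()
simple.add_tag_value("gene", "lacZ")
simple.add_tag_value("note", "example only")
print(simple.get_all_tags())
print(simple.get_tag_values("gene"))

typed = TypedFeature()
typed.add_tag_value("gene", "lacZ")
print(type(typed.get_tag_values("gene")[0]).__name__)
```

Whether the extra object per value accounts for a measurable slowdown is exactly what would have to be benchmarked on the branch; the sketch makes no claim either way.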

ToDo: I plan on removing the operator overloading in Bio::Annotation,  
which was a serious sticking point with most of the devs.  This won't  
be done until after tests pass for everything else.

What we will need at some point, which I can't provide:
Bio::SeqFeature::Annotated has no docs (no synopsis, no
description).  The reason I bring this up is that Sendu and I are
seriously considering running automated code audits in order to
gauge which modules lack docs, test coverage, etc.  We're likely
splitting those without adequate test/doc coverage off into a
separate 'dev' release.

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From smarkel at accelrys.com  Thu Aug 23 21:59:37 2007
From: smarkel at accelrys.com (Scott Markel)
Date: Thu, 23 Aug 2007 14:59:37 -0700
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <38B989E4-34CA-42CD-A608-9D2A095E7ADF@uiuc.edu>
Message-ID: <OF1E1ED913.3FB67C57-ON88257340.00785855-88257340.0078D192@accelrys.com>

Chris,

Pipeline Pilot's Sequence Analysis Collection wraps BioPerl.
Once you think the branch changes have converged a bit we'd
be happy to try running our regression suite and report what
we find.

Scott

Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  smarkel at accelrys.com
Accelrys, Inc.                      mobile: +1 858 205 3653
10188 Telesis Court, Suite 100      voice:  +1 858 799 5603
San Diego, CA 92121                 fax:    +1 858 799 5222
USA                                 web:    http://www.accelrys.com




From cjfields at uiuc.edu  Fri Aug 24 00:39:30 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 23 Aug 2007 19:39:30 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <OF1E1ED913.3FB67C57-ON88257340.00785855-88257340.0078D192@accelrys.com>
References: <OF1E1ED913.3FB67C57-ON88257340.00785855-88257340.0078D192@accelrys.com>
Message-ID: <241563BB-F96A-4631-B504-F73699FDE84B@uiuc.edu>

Having an independent test would be great!  The reason I suggest  
there may be a speedup: one complaint popping up after 1.5 was the  
slowdown in sequence parsing, which could be related to the 'heavier'  
objectified tags.

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Fri Aug 24 03:34:12 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 23 Aug 2007 23:34:12 -0400
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
Message-ID: <CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>


On Aug 23, 2007, at 12:53 AM, Chris Fields wrote:

> There has been little work done over the last 2 1/2 years to undo or
> rectify problems associated with those additions; I feel like those
> of us still routinely contributing have been left holding the bag.

Not by intention, but unfortunately that's probably a fair  
assessment. (And I'm one of those guilty of inaction.)

> [...]
> I would like to suggest the radical idea of rolling back AnnotatableI/
> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
> are simple scalars) and possibly work in implementing Ewan's
> SeqFeature::TypedSeqFeatureI for those who want strong data types
> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).

I fully support this; to me that sounds exactly like the way to go.

> The various AnnotatableI changes, odd inheritance, and operator  
> overloading have
> really obfuscated the code to the point where no one wants to touch
> it in case it breaks something important.  However, I believe it is
> the one serious impediment to a new stable release.

Yes, I think you're hitting the nail on the head.

Chris, if you take the lead on this and carry it through we will all  
owe you hugely. I'm not sure how many beers that would compare to,  
but I'll throw in some. (Who else do I owe beer? I'm losing track.  
Strangely nobody tried to redeem beer from me in Vienna. Maybe in  
Toronto?)

Seriously, rectifying this problem would lift a huge weight.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From florent.angly at gmail.com  Fri Aug 24 04:43:23 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Thu, 23 Aug 2007 21:43:23 -0700
Subject: [Bioperl-l] Is it possible to do contig alignments?
Message-ID: <46CE61EB.5000300@gmail.com>

Dear list members,

I would like to "produce" an alignment of a contig, or more exactly
visualize it in such a fashion based on the aligned sequences provided
to me by a sequence assembler:

Consensus: ACGTACGTTG
Sequence1: ACG-AC
Sequence2:  CGTACGT
Sequence3:     AC-TTG

It sounds like a very trivial task but after searching for a long time, 
it seems impossible using the methods BioPerl provides.

Using the Bio::Align classes, it seems like the only way is if the 
sequences have the same aligned length, i.e. like this:

Consensus: ACGTACGTTG
Sequence1: ACG-AC----
Sequence2: -CGTACGT--
Sequence3: ----AC-TTG

It's not very satisfactory if I have to pad the sequences with gaps 
manually. In the context of a phylogenetic alignment, it might make 
sense, but not for contigs.

For assemblies whole sequences are mapped on contigs. Bio::LocatableSeq 
does not help here because it defines locations _within_ the sequence 
(the name LocatableSeq was pretty misleading to me).

I think it's all very strange that contigs have the coordinates of the 
aligned sequences composing them but there is no straightforward way to 
exploit this information.

So what's the bottom line? Am I missing something obvious, an 
out-of-the-box solution? Is it a "missing feature" of BioPerl that is 
planned to be implemented in the future or that should be requested? 
Should I pad my sequences with dashes or spaces after assembly? Or is it 
expected that my aligned reads coming from my assembly be padded with 
lots of gaps at their beginning and end? What's the BioPerl philosophy here?

Thanks for giving me pointers,

Florent


From bix at sendu.me.uk  Fri Aug 24 08:35:23 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 24 Aug 2007 09:35:23 +0100
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CE61EB.5000300@gmail.com>
References: <46CE61EB.5000300@gmail.com>
Message-ID: <46CE984B.3060701@sendu.me.uk>

Florent Angly wrote:
> Dear list members,
> 
> I would like to "produce" an alignment of a contig, or more exactly
> visualize it in such a fashion based on the aligned sequences provided
> to me by a sequence assembler:
> 
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC
> Sequence2:  CGTACGT
> Sequence3:     AC-TTG
> 
> It sounds like a very trivial task but after searching for a long time, 
> it seems impossible using the methods BioPerl provides.

Isn't Bio::Assembly::Contig what you need?

http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Assembly/Contig.html


From zhaodj at ioz.ac.cn  Fri Aug 24 09:34:07 2007
From: zhaodj at ioz.ac.cn (De-Jian,ZHAO)
Date: Fri, 24 Aug 2007 17:34:07 +0800 (CST)
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CE61EB.5000300@gmail.com>
References: <46CE61EB.5000300@gmail.com>
Message-ID: <51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>

On Fri, Aug 24, 2007 12:43, Florent Angly wrote:
> Dear list members,
>
> I would like to "produce" an alignment of a contig, or more exactly
> visualize it in such a fashion based on the aligned sequences
> provided to me by a sequence assembler:
>
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC
> Sequence2:  CGTACGT
> Sequence3:     AC-TTG
>
> It sounds like a very trivial task but after searching for a long
> time, it seems impossible using the methods BioPerl provides.
>
> Using the Bio::Align classes, it seems like the only way is if the
> sequences have the same aligned length, i.e. like this:
>
> Consensus: ACGTACGTTG
> Sequence1: ACG-AC----
> Sequence2: -CGTACGT--
> Sequence3: ----AC-TTG
>
> It's not very satisfactory if I have to pad the sequences with
> gaps manually. In the context of a phylogenetic alignment, it might
> make sense, but not for contigs.

How do you pad the sequences with gaps manually? Just replace the
hyphens with blanks? If yes, you can program in perl to automate
this process.
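
Such automation could look roughly like the sketch below. It is written in Python rather than Perl and does not use any BioPerl class; the function name pad_reads is made up, and the 0-based start offsets (0, 1 and 4) are read off the layout shown in the quoted example rather than taken from real assembler output.

```python
# Sketch: pad each aligned read to the contig length using its start offset.
# Offsets are 0-based positions on the (gapped) consensus.

def pad_reads(consensus_length, reads, pad="-"):
    """Return {name: padded sequence}, each exactly consensus_length long.

    reads maps a read name to an (offset, aligned sequence) pair.
    """
    padded = {}
    for name, (offset, seq) in reads.items():
        trailing = consensus_length - offset - len(seq)
        if offset < 0 or trailing < 0:
            raise ValueError(f"{name} does not fit inside the contig")
        padded[name] = pad * offset + seq + pad * trailing
    return padded


consensus = "ACGTACGTTG"
reads = {
    "Sequence1": (0, "ACG-AC"),
    "Sequence2": (1, "CGTACGT"),
    "Sequence3": (4, "AC-TTG"),
}

print(f"Consensus: {consensus}")
for name, row in pad_reads(len(consensus), reads).items():
    print(f"{name}: {row}")
```

Passing pad=" " instead of the default gives the blank-padded layout, which keeps internal gaps visually distinct from the unaligned ends.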


-- 
De-Jian Zhao
Institute of Zoology,Chinese Academy of Sciences
+86-10-64807217
zhaodj at ioz.ac.cn


From marian.thieme at arcor.de  Fri Aug 24 10:05:55 2007
From: marian.thieme at arcor.de (Marian Thieme)
Date: Fri, 24 Aug 2007 12:05:55 +0200
Subject: [Bioperl-l] ReseqChip, module/package name
Message-ID: <46CEAD83.2050904@arcor.de>

Hi,

2 questions about the naming of the module I did submit
(see http://bugzilla.open-bio.org/show_bug.cgi?id=2332)

1.) The package:
Because an expression package already exists, I suggest creating a
new package called resequencing.

2.) I would suggest that the module is called RedundantFragments or
AdditionalFragments

so we would have something like:

Bio::Resequencing::AdditionalFragments

Any other ideas ?

Marian

By the way, can anybody change my email address to marian.thieme at arcor.de
in Bugzilla as well as on the bioperl list, please? I did not manage to
do that on my own...


From mcons004 at fiu.edu  Fri Aug 24 03:30:44 2007
From: mcons004 at fiu.edu (mcons004 at fiu.edu)
Date: Thu, 23 Aug 2007 23:30:44 -0400 (EDT)
Subject: [Bioperl-l] please some help
Message-ID: <20070823233044.BJQ45014@mailstore2.fiu.edu>

  Hello,
     I am new to this software and I am having some trouble getting started. The version of Perl I am working with is v5.8.6. My OS is Unix (Mac OS X). I am trying to use Bioperl with a script called blastParser.pl to process a file which is the output of a "blastall" operation.

 The command that gives me an error is:
> perl blastParser.pl junk.out 1 1 1.0
 and the error message says:
Can't locate Bio/SearchIO.pm in @INC (@INC contains: /System/Library/Perl/5.8.6/darwin-thread-multi-2level

 Your online info says this probably means that the module Bio::SearchIO is not installed, and that I can either install Bundle::Bioperl or install that specific module by hand. Could you give me some tips on this? I am new to working with Unix and Bioperl, so I am a little confused. Any information will be helpful. Thanks


From bix at sendu.me.uk  Fri Aug 24 14:38:39 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 24 Aug 2007 15:38:39 +0100
Subject: [Bioperl-l] please some help
In-Reply-To: <20070823233044.BJQ45014@mailstore2.fiu.edu>
References: <20070823233044.BJQ45014@mailstore2.fiu.edu>
Message-ID: <46CEED6F.1080101@sendu.me.uk>

mcons004 at fiu.edu wrote:
> Hello, I am new to this software and I am having some trouble
> starting. The version of Bioperl I am working on is v5.8.6. My OS is
> Unix (Mac OS X). I am trying to use Bioperl with a file called
> blastParser to process a file which is the output of a "blastall"
> operation.
> 
> The code that gives me error is:
>> perl blastParser.pl junk.out 1 1 1.0
> and the error message says: Can't locate Bio/SearchIO.pm in @INC
> (@INC contains: /System/Library/Perl/5.8.6/darwin-thread-multi-2level
> 
> 
> You online info says I probably means that the module
> Bio::SearchIO.pm is not instaled and I can either install
> Bundle::Bioperl or install that specific module by hand. Could you
> give me some tips in this? I am new working with Unix, and Bioperl so
> I am a little confused.

You need to install Bioperl first. You can find instructions here:
http://www.bioperl.org/wiki/Installing_Bioperl_for_Unix

If this is your own Mac (you have the root/admin password), when it 
tells you to run cpan (">perl -MCPAN -e shell" or ">cpan"), start the 
command with 'sudo'. So:

 >sudo cpan
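
Before and after installing, it can help to check whether perl can see the
module at all. The following is only a minimal sketch using core Perl; the
helper name module_available is made up for this illustration and is not
part of BioPerl:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded
# from the directories listed in @INC, 0 otherwise.
sub module_available {
    my ($module) = @_;
    # Turn Bio::SearchIO into Bio/SearchIO.pm, the form require() looks up.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

if (module_available('Bio::SearchIO')) {
    print "Bio::SearchIO found at $INC{'Bio/SearchIO.pm'}\n";
}
else {
    # This is the same list that appears in the "Can't locate" error.
    print "Bio::SearchIO not found; perl searched:\n";
    print "  $_\n" for @INC;
}
```

If the module is still reported as missing after installation, the printed
directory list shows where this particular perl is looking, which is useful
when more than one perl is installed on the machine.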


From florent.angly at gmail.com  Fri Aug 24 16:07:04 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Fri, 24 Aug 2007 09:07:04 -0700
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
References: <46CE61EB.5000300@gmail.com>
	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
Message-ID: <46CF0228.2000404@gmail.com>

Thanks for all the replies.

Sendu Bala wrote:

> Isn't Bio::Assembly::Contig what you need?
>
> http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/Assembly/Contig.html
>
I'm using this module already to manipulate the contigs, but there's no
option that I know of to _display_ the contigs in the way I described.
(Sorry, the title of my email was misleading.)


De-Jian,ZHAO wrote:
> How do you pad the sequences with gaps manually? Just replace the
> hyphens with blanks? If yes, you can program in perl to automate
> this process.
>   
How do I pad the sequences manually?? I calculate how many gaps have to
go left and right of the aligned sequence based on its length, its
position in the aligned consensus and the consensus length.
my $newseq = '-' x $leftnum . $seq . '-' x $rightnum;
By the way, the sequences cannot be stored with blanks in them...
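
A minimal sketch of that padding calculation in plain Perl, using the three
reads from the example earlier in the thread. The helper name pad_read and
the 1-based start positions (1, 2 and 5, read off the example alignment) are
assumptions made for illustration; in real code the positions would come
from the contig object:

```perl
use strict;
use warnings;

# Hypothetical helper: pad an aligned read with '-' so that it spans the
# whole consensus. $start is the 1-based column of the read's first base
# in the aligned consensus.
sub pad_read {
    my ($seq, $start, $cons_len) = @_;
    my $leftnum  = $start - 1;
    my $rightnum = $cons_len - $leftnum - length($seq);
    die "read does not fit inside the consensus\n"
        if $leftnum < 0 || $rightnum < 0;
    return ('-' x $leftnum) . $seq . ('-' x $rightnum);
}

my $consensus = 'ACGTACGTTG';
my @reads = ( [ 'ACG-AC', 1 ], [ 'CGTACGT', 2 ], [ 'AC-TTG', 5 ] );

print "$consensus\n";
print pad_read(@$_, length($consensus)), "\n" for @reads;
```

Every padded read comes out with the same length as the consensus, which is
what the Bio::Align classes expect.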

I think the best way to provide an out-of-the-box solution for
displaying contigs the described way would be to _not_ use Bio::Align at
all, but rather to create a new assembly IO module like
Bio::Assembly::IO::simpleout for example. That would be useful.

The reason I wanted to visualize these contigs is because I made a
Bio::Assembly::IO module for TIGR Assembler files that I intend on
submitting to BioPerl. I wanted to make sure first that I did not have
any obvious bug in my contig coordinates. I've read the documentation on
the Wiki, so if a BioPerl developer would like to step up and
contact me directly to check my code, that would be nice =)

Florent


From cjfields at uiuc.edu  Fri Aug 24 16:07:36 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 11:07:36 -0500
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip, module/package name
In-Reply-To: <46CEAD83.2050904@arcor.de>
References: <46CEAD83.2050904@arcor.de>
Message-ID: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>

Marian,

First, apologies about not getting on this sooner.  It's shaping up  
to be a busy year!

The new package: How about Bio::Expression::Tools::MitoChip?  My  
reasoning: I don't think it's necessary to define a new  
Bio::Resequencing namespace for just one module; my inclination is  
towards using Bio::Expression namespace as Bio::Tools have been  
traditionally reserved for output parsers.  I am unsure what the  
Bio::Expression status is (very little is documented, no tests are  
written, nothing on the mail list archives); maybe Allen can answer  
that?  I don't see anything that precludes you from using that  
namespace as long as your tools are fairly well-defined (they are)  
and have tests (they do).

Also, your module deals with doing one specific thing (extraction and  
incorporation of information about redundant fragments) for the Affy  
MitoChip.  It might be worth genericizing the class a bit so that you  
can add new parser or analysis methods w/o having to define new  
classes to deal with the same Mitochip data.

Mail list: The mail list subscription page
(http://bioperl.org/mailman/listinfo/bioperl-l) allows you to
subscribe or change subscription options (at the bottom of the page).

Bugzilla: if you are logged into Bugzilla under your old email, there  
is an option at the bottom of the page (Edit : Prefs) where you can  
change your email address and other preferences.

chris

On Aug 24, 2007, at 5:05 AM, Marian Thieme wrote:

> Hi,
>
> 2 questions about the naming of the module I did submit
> (see http://bugzilla.open-bio.org/show_bug.cgi?id=2332)
>
> 1.) The package:
> because there exists already an expression package I suggest to  
> create a
> new package called resequencing
>
> 2.) I would suggest that the module is called RedundantFragments or
> AdditionalFragments
>
> so we would have something like:
>
> Bio::Resequencing::AdditionalFragments
>
> Any other ideas ?
>
> Marian
>
> By the way can anybody change my email adress to  
> marian.thieme at arcor.de
> in bugzilla as well as in the bioperl list, please ?!! didnt achieve
> that by my own...
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Aug 24 16:23:12 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 11:23:12 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
	<CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>
Message-ID: <4F5FD173-FC80-4F70-B294-83DA58FDCE64@uiuc.edu>

On Aug 23, 2007, at 10:34 PM, Hilmar Lapp wrote:

> On Aug 23, 2007, at 12:53 AM, Chris Fields wrote:
>
>> There has been little work done over the last 2 1/2 years to undo or
>> rectify problems associated with those additions; I feel like those
>> of us still routinely contributing have been left holding the bag.
>
> Not by intention, but unfortunately that's probably a fair  
> assessment. (And I'm one of those guilty of inaction.)

Not completely.  You, Jason, Chris M., and several others expressed  
yourselves quite clearly (move the code to a branch and test).  I  
think that everyone was trying to be diplomatic about it and so never  
attempted to do anything except get it working correctly.

>> [...]
>> I would like to suggest the radical idea of rolling back  
>> AnnotatableI/
>> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
>> are simple scalars) and possibly work in implementing Ewan's
>> SeqFeature::TypedSeqFeatureI for those who want strong data types
>> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).
>
> I fully support this; to me that sounds exactly like the way to go.

Okay, I'll probably go ahead and get a branch started today.  I'll  
have to look at Ewan's interface in more detail; it's possible a new  
SeqFeature implementation will need to be written up to incorporate it.

>> The various AnnotatableI changes, odd inheritance, and operator  
>> overloading have
>> really obfuscated the code to the point where no one wants to touch
>> it in case it breaks something important.  However, I believe it is
>> the one serious impediment to a new stable release.
>
> Yes, I think you're hitting the nail on the head.
>
> Chris, if you take the lead on this and carry it through we will  
> all owe you hugely. I'm not sure how many beers that would compare  
> to, but I'll throw in some. (Who else do I owe beer? I'm losing  
> track. Strangely nobody tried to redeem beer from me in Vienna.  
> Maybe in Toronto?)
>
> Seriously, rectifying this problem would lift a huge weight.
>
> 	-hilmar

It would be nice to get regular releases started again.  I think  
this'll help.

chris


From marian.thieme at arcor.de  Fri Aug 24 17:01:07 2007
From: marian.thieme at arcor.de (Marian Thieme)
Date: Fri, 24 Aug 2007 19:01:07 +0200
Subject: [Bioperl-l] Bio::Expression & Re: ReseqChip, module/package name
Message-ID: <46CF0ED3.8000708@arcor.de>

> The new package: How about Bio::Expression::Tools::MitoChip?  My  
> reasoning: I don't think it's necessary to define a new  
> Bio::Resequencing namespace for just one module; my inclination is  
> towards using Bio::Expression namespace as Bio::Tools have been  
> traditionally reserved for output parsers.  I am unsure what the  
> Bio::Expression status is (very little is documented, no tests are  
> written, nothing on the mail list archives); maybe Allen can answer  
> that?  I don't see anything that precludes you from using that  
> namespace as long as your tools are fairly well-defined (they are)  
> and have tests (they do).

The problem I see with Bio::Expression is that resequencing chips do
not belong to expression chips.
(Expression chips are designed to hybridize RNA strands and hence
measure RNA expression levels; a resequencing chip, on the other hand,
is based on DNA, and the design and the probe length are quite
different.) So, from my point of view, it makes sense to distinguish
between DNA and RNA chips, at least.

>
> Also, your module deals with doing one specific thing (extraction and  
> incorporation of information about redundant fragments) for the Affy  
> MitoChip.  It might be worth genericizing the class a bit so that you  
> can add new parser or analysis methods w/o having to define new  
> classes to deal with the same Mitochip data.

OK, need to think about that.

>
> Mail list: The mail list subscription page (http://bioperl.org/
> mailman/listinfo/bioperl-l) allows you to subscribe or change  
> subscription options (at the bottom of the page).
>
cleared

> Bugzilla: if you are logged into Bugzilla under your old email, there  
> is an option at the bottom of the page (Edit : Prefs) where you can  
> change your email address and other preferences.
>
Unfortunately I do not receive a mail to confirm the change. I tried
that twice.


Marian


From bix at sendu.me.uk  Fri Aug 24 16:43:22 2007
From: bix at sendu.me.uk (Sendu Bala)
Date: Fri, 24 Aug 2007 17:43:22 +0100
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CF0228.2000404@gmail.com>
References: <46CE61EB.5000300@gmail.com>	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
	<46CF0228.2000404@gmail.com>
Message-ID: <46CF0AAA.4090301@sendu.me.uk>

Florent Angly wrote:
> Thanks for all the replies.
> 
> Sendu Bala wrote:
> 
>> Isn't Bio::Assembly::Contig what you need?
>
> I'm using this module already to manipulate the contigs, but there's 
> no option that I know of to _display_ the contigs in the way I 
> described.
[snip]
> I think the best way to provide an out-of-the-box solution for 
> displaying contigs the described way would be to _not_ use Bio::Align
> at all, but rather to create a new assembly IO module like 
> Bio::Assembly::IO::simpleout for example. That would be useful.

Yes...


> The reason I wanted to visualize these contigs is because I made a 
> Bio::Assembly::IO module for TIGR Assembler files that I intend on 
> submitting to BioPerl.

That's wonderful... might I cheekily suggest that the solution to your
problem is to extend your IO module so that it does the 'O' as well? I.e.,
unlike in the other IO modules, write_assembly() would actually be implemented.
Then you can round-trip to ensure your next_assembly() method has no bugs.
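
As an illustration of the round-trip idea only: the sketch below parses a
small text, writes it back out, and compares the result with the input. The
functions parse_contig and write_contig and the tab-separated
one-read-per-line format are invented stand-ins for next_assembly() and
write_assembly(); they are not the TIGR Assembler format or any BioPerl API.

```perl
use strict;
use warnings;

# Hypothetical reader: one read per line as "id<TAB>start<TAB>sequence".
sub parse_contig {
    my ($text) = @_;
    my @reads;
    for my $line (split /\n/, $text) {
        my ($id, $start, $seq) = split /\t/, $line;
        push @reads, { id => $id, start => $start, seq => $seq };
    }
    return \@reads;
}

# Hypothetical writer: the exact inverse of parse_contig().
sub write_contig {
    my ($reads) = @_;
    return join '', map { "$_->{id}\t$_->{start}\t$_->{seq}\n" } @$reads;
}

my $input = "Sequence1\t1\tACG-AC\n"
          . "Sequence2\t2\tCGTACGT\n"
          . "Sequence3\t5\tAC-TTG\n";

# Round trip: parse, write, and compare with what we started from.
my $output = write_contig(parse_contig($input));
print $output eq $input ? "round trip ok\n" : "round trip differs\n";
```

A mismatch between input and output points at a bug in either the reader or
the writer, for example an off-by-one in the read coordinates.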


> I've read the documentation on the Wiki so if a BioPerl developer
> would please like lo step up and contact me directly for checking my
> code, that would be nice =)

If no one does, post it as an enhancement request to bugzilla. A test
script is a must.

http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests


From cjfields at uiuc.edu  Fri Aug 24 17:16:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 12:16:10 -0500
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <46CF0228.2000404@gmail.com>
References: <46CE61EB.5000300@gmail.com>
	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
	<46CF0228.2000404@gmail.com>
Message-ID: <32D5D3FF-D0A5-4EEB-BA5E-B0087CC64B19@uiuc.edu>


On Aug 24, 2007, at 11:07 AM, Florent Angly wrote:
...

> De-Jian,ZHAO wrote:
>> How do you pad the sequences with gaps manually? Just replace the
>> hyphens with blanks? If yes, you can program in perl to automate
>> this process.
>>
> How do I pad the sequences manually?? I calculate how many gaps  
> have to
> go left and right of the aligned sequence based on its length, its
> position in the aligned consensus and the consensus length.
> my $newseq = '-' x $leftnum . $seq . '-'x$rightnum
> By the way, the sequences cannot be stored with blanks in them...
>
> I think the best way to provide an out-of-the-box solution for
> displaying contigs the described way would be to _not_ use  
> Bio::Align at
> all, but rather to create a new assembly IO module like
> Bio::Assembly::IO::simpleout for example. That would be useful.
>
> The reason I wanted to visualize these contigs is because I made a
> Bio::Assembly::IO module for TIGR Assembler files that I intend on
> submitting to BioPerl. I wanted to make sure first that I did not have
> any obvious bug in my contig coordinates. I've read the  
> documentation on
> the Wiki so if a BioPerl developer would please like lo step up and
> contact me directly for checking my code, that would be nice =)
>
> Florent

A similar question has been previously asked on the same subject:

http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2827/focus=2869

Jason's suggestion was to have a Bio::Assembly::Contig method
get_aln() which produces a Bio::SimpleAlign object containing
appropriately padded seqs compatible for AlignIO output.  However,
the method was never implemented.

Personally, the way I would try going about this would be to
implement the Contig::get_aln() method, padding with
bioperl-compliant alignment gap symbols (currently -.*?=~), so if
anyone wanted they could write to any AlignIO-implemented format (MSF,
Clustal, etc).  In your Bio::Assembly::IO::simpleout module implement
write_assembly() and use the Contig::get_aln() method where needed to
grab the SimpleAlign, then simply substitute gap symbols with spaces
when writing contig output.
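
The last step (swapping padding gaps for spaces) might look roughly like the
sketch below. It is plain Perl working on already padded strings and does
not call get_aln(), which does not exist yet. The helper name
blank_terminal_gaps is made up, and for simplicity only '-' is treated as a
gap symbol. Only the leading and trailing runs are replaced, so gaps inside
a read stay visible:

```perl
use strict;
use warnings;

# Hypothetical helper: replace the leading and trailing runs of '-' with
# the same number of spaces, leaving internal gaps untouched.
sub blank_terminal_gaps {
    my ($padded) = @_;
    $padded =~ s/^(-+)/' ' x length($1)/e;
    $padded =~ s/(-+)$/' ' x length($1)/e;
    return $padded;
}

print "Consensus: ACGTACGTTG\n";
my @rows = ( 'ACG-AC----', '-CGTACGT--', '----AC-TTG' );
for my $i (0 .. $#rows) {
    printf "Sequence%d: %s\n", $i + 1, blank_terminal_gaps($rows[$i]);
}
```

Because each padding run is replaced by an equally long run of spaces, the
columns stay aligned with the consensus line.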

In general, any new code is attached to a bugzilla report as an  
enhancement request:

http://bugzilla.open-bio.org/

One of the devs will work on getting the code incorporated into
bioperl.  Make sure the code is documented
(http://www.bioperl.org/wiki/Advanced_BioPerl), and attach appropriate
tests (http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests) and
test data.

chris


From cjfields at uiuc.edu  Fri Aug 24 17:20:16 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 12:20:16 -0500
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
	module/package name
In-Reply-To: <9824900.1187973171940.JavaMail.ngmail@webmail17>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
	<46CEAD83.2050904@arcor.de>
	<9824900.1187973171940.JavaMail.ngmail@webmail17>
Message-ID: <A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>


On Aug 24, 2007, at 11:32 AM, marian.thieme at arcor.de wrote:

>> ...
> The problem I see, with Bio::Expression, is that Resequencing chips  
> are not belongs to Expression chips.
> (Expression chips are designed to hybridisize RNA strands and hence  
> measure RNA expression levels, on the other hand a resequencing  
> chip is based on DNA, also the design and the probe length is quite  
> different). So, from my point of view it make sence to differ  
> between dna and rna chips, at least.

Then maybe the more generic Bio::Microarray namespace is the way to
go, with the module name Bio::Microarray::Tools::MitoChip.  Other
tools can be added as needed.

>> Also, your module deals with doing one specific thing (extraction and
>> incorporation of information about redundant fragments) for the Affy
>> MitoChip.  It might be worth genericizing the class a bit so that you
>> can add new parser or analysis methods w/o having to define new
>> classes to deal with the same Mitochip data.
>
> OK, need to think about that.

It all depends on how much you intend to contribute; if you plan on  
adding to it over time we can talk about starting up a developer  
account.

>> Mail list: The mail list subscription page (http://bioperl.org/
>> mailman/listinfo/bioperl-l) allows you to subscribe or change
>> subscription options (at the bottom of the page).
>>
> cleared
>
>> Bugzilla: if you are logged into Bugzilla under your old email, there
>> is an option at the bottom of the page (Edit : Prefs) where you can
>> change your email address and other preferences.
>>
> unfortunatly I dont recieve a mail to confirm the change. did try  
> that twice..
>
>
> Marian

I tested it out and received the email at both addresses (as it
states).  If you respond to either email it should implement the
change in three days' time.  If it doesn't, you can email support at
open-bio.org to see if there is a problem.

chris


From florent.angly at gmail.com  Fri Aug 24 17:58:13 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Fri, 24 Aug 2007 10:58:13 -0700
Subject: [Bioperl-l] Is it possible to do contig alignments?
In-Reply-To: <32D5D3FF-D0A5-4EEB-BA5E-B0087CC64B19@uiuc.edu>
References: <46CE61EB.5000300@gmail.com>
	<51693.159.226.67.49.1187948047.squirrel@mail.ioz.ac.cn>
	<46CF0228.2000404@gmail.com>
	<32D5D3FF-D0A5-4EEB-BA5E-B0087CC64B19@uiuc.edu>
Message-ID: <46CF1C35.3050100@gmail.com>

Chris Fields wrote:
>
> A similar question has been previously asked on the same subject:
>
> http://thread.gmane.org/gmane.comp.lang.perl.bio.general/2827/focus=2869
>
> Jason's suggestion was to have a Bio::Assembly::Contig method 
> get_aln() which produces a Bio::SimpleAlign object containing 
> appropriately padded seqs compatible for AlignIO output.  However, the 
> method was never implemented.
>
> Personally, the way I would try going about this would be to implement 
> the Contig::get_aln() method, padding with bioperl-compliant alignment 
> gap symbols (currently -.*?=~), so if anyone wanted they could write 
> to any AlignIO-implemented format (MSF, Clustal, etc).  In your 
> Bio::Assembly::IO::simpleout module implement write_assembly() and use 
> the Contig::get_aln() method where needed to grab the SimpleAlign, 
> then simply substitute gap symbols with spaces when writing contig 
> output.
>
> In general, any new code is attached to a bugzilla report as an 
> enhancement request:
>
> http://bugzilla.open-bio.org/
>
> One of the devs will work on getting the code incorporated into 
> bioperl.  Make sure the code is documented 
> (http://www.bioperl.org/wiki/Advanced_BioPerl), and attach appropriate 
> tests (http://www.bioperl.org/wiki/HOWTO:Writing_BioPerl_Tests) and 
> test data.
>
> chris
>
>
Thanks Chris for the pointers, I will be looking into these things.
Florent


From hlapp at gmx.net  Fri Aug 24 18:25:57 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 24 Aug 2007 14:25:57 -0400
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
	module/package name
In-Reply-To: <A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
	<46CEAD83.2050904@arcor.de>
	<9824900.1187973171940.JavaMail.ngmail@webmail17>
	<A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
Message-ID: <BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>


On Aug 24, 2007, at 1:20 PM, Chris Fields wrote:

>>> ...
>> The problem I see, with Bio::Expression, is that Resequencing chips
>> are not belongs to Expression chips.
>> (Expression chips are designed to hybridisize RNA strands and hence
>> measure RNA expression levels, on the other hand a resequencing
>> chip is based on DNA, also the design and the probe length is quite
>> different). So, from my point of view it make sence to differ
>> between dna and rna chips, at least.
>
> Then maybe the more generic Bio::Microarray namespace is the way to
> go, with the module name Bio::Microarray::Tools:: MitoChip.  If
> needed other tools can be added as needed.
>

Makes sense to me too. Presumably, regardless of DNA or RNA being  
hybridized or length of probes, the data that comes out of them is  
quite similar in a general nature (namely hybridization signals)?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From marian.thieme at arcor.de  Fri Aug 24 16:32:51 2007
From: marian.thieme at arcor.de (marian.thieme at arcor.de)
Date: Fri, 24 Aug 2007 18:32:51 +0200 (CEST)
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
 module/package name
In-Reply-To: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>
	<46CEAD83.2050904@arcor.de>
Message-ID: <9824900.1187973171940.JavaMail.ngmail@webmail17>

> The new package: How about Bio::Expression::Tools::MitoChip?  My  
> reasoning: I don't think it's necessary to define a new  
> Bio::Resequencing namespace for just one module; my inclination is  
> towards using Bio::Expression namespace as Bio::Tools have been  
> traditionally reserved for output parsers.  I am unsure what the  
> Bio::Expression status is (very little is documented, no tests are  
> written, nothing on the mail list archives); maybe Allen can answer  
> that?  I don't see anything that precludes you from using that  
> namespace as long as your tools are fairly well-defined (they are)  
> and have tests (they do).

The problem I see with Bio::Expression is that resequencing chips do not belong to expression chips.
(Expression chips are designed to hybridize RNA strands and hence measure RNA expression levels; a resequencing chip, on the other hand, is based on DNA, and the design and the probe length are quite different.) So, from my point of view, it makes sense to distinguish between DNA and RNA chips, at least.

> 
> Also, your module deals with doing one specific thing (extraction and  
> incorporation of information about redundant fragments) for the Affy  
> MitoChip.  It might be worth genericizing the class a bit so that you  
> can add new parser or analysis methods w/o having to define new  
> classes to deal with the same Mitochip data.

OK, need to think about that.

> 
> Mail list: The mail list subscription page (http://bioperl.org/ 
> mailman/listinfo/bioperl-l) allows you to subscribe or change  
> subscription options (at the bottom of the page).
> 
cleared

> Bugzilla: if you are logged into Bugzilla under your old email, there  
> is an option at the bottom of the page (Edit : Prefs) where you can  
> change your email address and other preferences.
> 
Unfortunately I do not receive a mail to confirm the change. I tried that twice.


Marian

> On Aug 24, 2007, at 5:05 AM, Marian Thieme wrote:
> 
> > Hi,
> >
> > 2 questions about the naming of the module I did submit
> > (see http://bugzilla.open-bio.org/show_bug.cgi?id=2332)
> >
> > 1.) The package:
> > because there exists already an expression package I suggest to  
> > create a
> > new package called resequencing
> >
> > 2.) I would suggest that the module is called RedundantFragments or
> > AdditionalFragments
> >
> > so we would have something like:
> >
> > Bio::Resequencing::AdditionalFragments
> >
> > Any other ideas ?
> >
> > Marian
> >
> > By the way can anybody change my email adress to  
> > marian.thieme at arcor.de
> > in bugzilla as well as in the bioperl list, please ?!! didnt achieve
> > that by my own...
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 



From cjfields at uiuc.edu  Fri Aug 24 21:12:25 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 16:12:25 -0500
Subject: [Bioperl-l] SeqFeature/AnnotatableI and rel. 1.6
In-Reply-To: <4F5FD173-FC80-4F70-B294-83DA58FDCE64@uiuc.edu>
References: <D5DFB58D-EF9D-4D30-9B76-F242BD481EE7@uiuc.edu>
	<CFB61E08-641A-4302-93E0-E90DF435A4E4@gmx.net>
	<4F5FD173-FC80-4F70-B294-83DA58FDCE64@uiuc.edu>
Message-ID: <ABED5057-CFB5-4AAA-9D23-B6A069575BF6@uiuc.edu>

Okay, I have started a new branch in cvs (tagged featann_rollback).   
I'll start looking through everything within the next few days to get  
a general idea of what needs to be done.  All I know is the changes  
were extensive and included modifications to tests.

If anyone has comments I have added a wiki page here:

http://www.bioperl.org/wiki/Feature_Annotation_rollback

chris

On Aug 24, 2007, at 11:23 AM, Chris Fields wrote:

> On Aug 23, 2007, at 10:34 PM, Hilmar Lapp wrote:
>
>> On Aug 23, 2007, at 12:53 AM, Chris Fields wrote:
>>
>>> There has been little work done over the last 2 1/2 years to undo or
>>> rectify problems associated with those additions; I feel like those
>>> of us still routinely contributing have been left holding the bag.
>>
>> Not by intention, but unfortunately that's probably a fair
>> assessment. (And I'm one of those guilty of inaction.)
>
> Not completely.  You, Jason, Chris M., and several others expressed
> yourselves quite clearly (move the code to a branch and test).  I
> think that everyone was trying to be diplomatic about it and so never
> attempted to do anything except get it working correctly.
>
>>> [...]
>>> I would like to suggest the radical idea of rolling back
>>> AnnotatableI/
>>> SeqFeatureI changes to a much simpler rel. 1.4-like behavior (tags
>>> are simple scalars) and possibly work in implementing Ewan's
>>> SeqFeature::TypedSeqFeatureI for those who want strong data types
>>> (i.e. Bio::FeatureIO/Bio::SeqFeature::Annotated).
>>
>> I fully support this; to me that sounds exactly like the way to go.
>
> Okay, I'll probably go ahead and get a branch started today.  I'll
> have to look at Ewan's interface in more detail; it's possible a new
> SeqFeature implementation will need to be written up to incorporate  
> it.
>
>>> The various AnnotatableI changes, odd inheritance, and operator
>>> overloading have
>>> really obfuscated the code to the point where no one wants to touch
>>> it in case it breaks something important.  However, I believe it is
>>> the one serious impediment to a new stable release.
>>
>> Yes, I think you're hitting the nail on the head.
>>
>> Chris, if you take the lead on this and carry it through we will
>> all owe you hugely. I'm not sure how many beers that would compare
>> to, but I'll throw in some. (Who else do I owe beer? I'm losing
>> track. Strangely nobody tried to redeem beer from me in Vienna.
>> Maybe in Toronto?)
>>
>> Seriously, rectifying this problem would lift a huge weight.
>>
>> 	-hilmar
>
> It would be nice to get regular releases started again.  I think
> this'll help.
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From marian at arcor.de  Fri Aug 24 18:48:20 2007
From: marian at arcor.de (marian)
Date: Fri, 24 Aug 2007 20:48:20 +0200
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
 module/package name
In-Reply-To: <BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>	<46CEAD83.2050904@arcor.de>	<9824900.1187973171940.JavaMail.ngmail@webmail17>	<A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
	<BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>
Message-ID: <46CF27F4.8030608@arcor.de>

Hilmar Lapp wrote:
> On Aug 24, 2007, at 1:20 PM, Chris Fields wrote:
>
>   
>>>> ...
>>>>         
>>> The problem I see with Bio::Expression is that resequencing chips
>>> are not expression chips.
>>> (Expression chips are designed to hybridize RNA strands and hence
>>> measure RNA expression levels; a resequencing chip, on the other
>>> hand, is based on DNA, and its design and probe length are quite
>>> different.) So, from my point of view, it makes sense to distinguish
>>> between DNA and RNA chips, at least.
>>>       
>> Then maybe the more generic Bio::Microarray namespace is the way to
>> go, with the module name Bio::Microarray::Tools::MitoChip.  Other
>> tools can be added as needed.
>>
>>     
>
> Makes sense to me too. Presumably, regardless of DNA or RNA being  
> hybridized or length of probes, the data that comes out of them is  
> quite similar in a general nature (namely hybridization signals)?
>
> 	-hilmar
>   

Bio::Microarray::Tools::MitoChip would be OK with me. I merely meant that it
isn't an expression chip, and that you won't/can't analyze expression data with
the tool I am talking about.

Marian


From cjfields at uiuc.edu  Fri Aug 24 22:36:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 17:36:46 -0500
Subject: [Bioperl-l] undef SeqFeature tag values
Message-ID: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>

One thing I am noticing with the rollback to tags as strings is that
tags with an undefined value are not set; I'm assuming when tags were  
Bio::AnnotationI they were instantiated regardless with an undef  
value.  When attempting to call an undef tag with get_tag_values() I  
get:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: asking for tag value that does not exist signalPeptideLength
STACK: Error::throw
STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/bioperl-live/blib/lib/Bio/Root/Root.pm:357
STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
STACK: t/targetp.t:189
-----------------------------------------------------------

I personally think of this as a feature (why set a tag at all if it  
is undef?).  However, are there any circumstances where we might want  
this behavior?  Do we want to simply return w/o a value if a tag name  
isn't found (i.e. remove the exception)?

chris


From hlapp at gmx.net  Fri Aug 24 23:02:43 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 24 Aug 2007 19:02:43 -0400
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
Message-ID: <7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>

You're supposed to call has_tag() first before you can assume that  
you can call get_tag_values() w/o an exception. That was the original  
API.
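
The guard described here can be sketched roughly as follows. This is only
an illustration: Local::Feature is a hypothetical stand-in that mimics the
two calls under discussion (it is not the Bio::SeqFeature::Generic code),
and tag_values_or_empty() is a made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a feature object. It only mimics has_tag() and
# get_tag_values() as discussed in this thread; it is NOT the BioPerl code.
{
    package Local::Feature;

    sub new {
        my ($class, %tags) = @_;
        return bless { tags => { %tags } }, $class;
    }

    sub has_tag {
        my ($self, $tag) = @_;
        return exists $self->{tags}{$tag};
    }

    sub get_tag_values {
        my ($self, $tag) = @_;
        # Mirror the behaviour reported above: asking for a missing tag throws.
        die "asking for tag value that does not exist $tag\n"
            unless $self->has_tag($tag);
        return @{ $self->{tags}{$tag} };
    }
}

# Hypothetical helper: values for a tag, or an empty list if the tag is absent.
sub tag_values_or_empty {
    my ($feat, $tag) = @_;
    return $feat->has_tag($tag) ? $feat->get_tag_values($tag) : ();
}

my $feat    = Local::Feature->new(signalPeptide => ['yes']);
my @present = tag_values_or_empty($feat, 'signalPeptide');
my @absent  = tag_values_or_empty($feat, 'signalPeptideLength');

print "present: @present\n";
print "absent count: ", scalar(@absent), "\n";
```

The helper works with any object that offers both methods, so callers that
want "no values" rather than an exception can opt in without the exception
being removed from get_tag_values() itself.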

	-hilmar

On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:

> One thing I am noticing with the rollback to tag as strings is that
> tags with an undefined value are not set; I'm assuming when tags were
> Bio::AnnotationI they were instantiated regardless with an undef
> value.  When attempting to call an undef tag with get_tag_values() I
> get:
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: asking for tag value that does not exist signalPeptideLength
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
> bioperl-live/blib/lib/Bio/Root/Root.pm:357
> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
> STACK: t/targetp.t:189
> -----------------------------------------------------------
>
> I personally think of this as a feature (why set a tag at all if it
> is undef?).  However, are there any circumstances where we might want
> this behavior?  Do we want to simply return w/o a value if a tag name
> isn't found (i.e. remove the exception)?
>
> chris
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Aug 25 04:05:58 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 24 Aug 2007 23:05:58 -0500
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
	<7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
Message-ID: <6392DF1D-D91B-4B6E-812B-38FC0EA0D234@uiuc.edu>

Makes sense.  Okay, I'll leave the exception in.  Thanks!

chris

On Aug 24, 2007, at 6:02 PM, Hilmar Lapp wrote:

> You're supposed to call has_tag() first before you can assume that
> you can call get_tag_values() w/o an exception. That was the original
> API.
>
> 	-hilmar
>
> On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:
>
>> One thing I am noticing with the rollback to tag as strings is that
>> tags with an undefined value are not set; I'm assuming when tags were
>> Bio::AnnotationI they were instantiated regardless with an undef
>> value.  When attempting to call an undef tag with get_tag_values() I
>> get:
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: asking for tag value that does not exist signalPeptideLength
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
>> bioperl-live/blib/lib/Bio/Root/Root.pm:357
>> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
>> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
>> STACK: t/targetp.t:189
>> -----------------------------------------------------------
>>
>> I personally think of this as a feature (why set a tag at all if it
>> is undef?).  However, are there any circumstances where we might want
>> this behavior?  Do we want to simply return w/o a value if a tag name
>> isn't found (i.e. remove the exception)?
>>
>> chris
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From n.haigh at sheffield.ac.uk  Sat Aug 25 07:50:29 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sat, 25 Aug 2007 08:50:29 +0100
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
	<7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
Message-ID: <46CFDF45.8030200@sheffield.ac.uk>

This sort of highlights a comment I made previously about how do you
test for a stable API?

It seems to me that unless you have intricate knowledge about the
changes that took place, you will find it difficult to know when an API
change has occurred. Is it possible to run the 1.4 test suite against
existing code to ensure tests pass? What if the 1.4 tests contained
bugs? This approach would need good code coverage by the tests to ensure
things work the same i.e. test code in HEAD against the test suite from
the previous stable release's branch - would/should this work
conceptually?

Nath

Hilmar Lapp wrote:
> You're supposed to call has_tag() first before you can assume that  
> you can call get_tag_values() w/o an exception. That was the original  
> API.
>
> 	-hilmar
>
> On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:
>
>   
>> One thing I am noticing with the rollback to tag as strings is that
>> tags with an undefined value are not set; I'm assuming when tags were
>> Bio::AnnotationI they were instantiated regardless with an undef
>> value.  When attempting to call an undef tag with get_tag_values() I
>> get:
>>
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: asking for tag value that does not exist signalPeptideLength
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
>> bioperl-live/blib/lib/Bio/Root/Root.pm:357
>> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
>> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
>> STACK: t/targetp.t:189
>> -----------------------------------------------------------
>>
>> I personally think of this as a feature (why set a tag at all if it
>> is undef?).  However, are there any circumstances where we might want
>> this behavior?  Do we want to simply return w/o a value if a tag name
>> isn't found (i.e. remove the exception)?
>>
>> chris
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>     
>
>   


From cjfields at uiuc.edu  Sat Aug 25 14:36:08 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 25 Aug 2007 09:36:08 -0500
Subject: [Bioperl-l] undef SeqFeature tag values
In-Reply-To: <46CFDF45.8030200@sheffield.ac.uk>
References: <88A352F1-EC1A-44FA-90DA-B869FF965F86@uiuc.edu>
	<7F5FDC98-24A6-4B74-A374-16780F9A5CC9@gmx.net>
	<46CFDF45.8030200@sheffield.ac.uk>
Message-ID: <3F3C311E-3CD5-436B-987F-FD7695904647@uiuc.edu>

The rollback branch is off of HEAD, not 1.4, so any bugs fixed since  
then and any modules/tests added will be present.  So far everything  
has worked relatively well; you can check the history of this page to  
track what has happened so far:

http://www.bioperl.org/wiki/Feature_Annotation_rollback

The only problem code remaining for the first round of changes is a  
single method in Bio::SeqFeature::Annotated (if the tests are to be  
trusted) and one test in Bio::SeqFeature::AnnotationAdaptor using  
Hilmar's original test suite.  Most of the problems so far were tests that
broke the Feature/Annotation API outlined in the HOWTO (calling
get_Annotations directly from a Bio::SeqI or Bio::SeqFeatureI, for
instance), or examples where has_tag() was not used.  I agree good test
coverage would probably help catch some of those still silently lingering
in code, but I don't think it can find everything; that's the reason I
indicated there will need to be extensive testing.  That applies within the
suite but also to users in the wild.

The SeqFeatureI and AnnotatableI API is defined very specifically in  
the Feature/Annotation HOWTO, so if anything the introduced changes  
violated much of that and started a domino effect of users  
unknowingly violating the API (me among them).  Also, just b/c a test  
passes doesn't mean it is the ->correct<- result; it is very easy to  
just throw something from Data::Dumper into an is() test and have it  
pass.  As an example, it appears there was a bit of cheating going on  
with AnnotationAdaptor.t in particular, where expected numbers were  
changed to conform to results w/o explanation.  Which is the correct  
answer?  I trust Hilmar's original test suite over the (rushed) changes.

chris

On Aug 25, 2007, at 2:50 AM, Nathan S. Haigh wrote:

> This sort of highlights a comment I made previously about how do you
> test for a stable API?
>
> It seems to me that unless you have intricate knowledge about the
> changes that took place, you will find it difficult to know when an  
> API
> change has occurred. Is it possible to run the 1.4 test suite against
> existing code to ensure tests pass? What if the 1.4 tests contained
> bugs? This approach would need good code coverage by the tests to  
> ensure
> things work the same i.e. test code in HEAD against the test suite  
> from
> the previous stable release's branch - would/should this work
> conceptually?**
>
> Nath
>
> Hilmar Lapp wrote:
>> You're supposed to call has_tag() first before you can assume that
>> you can call get_tag_values() w/o an exception. That was the original
>> API.
>>
>> 	-hilmar
>>
>> On Aug 24, 2007, at 6:36 PM, Chris Fields wrote:
>>
>>
>>> One thing I am noticing with the rollback to tag as strings is that
>>> tags with an undefined value are not set; I'm assuming when tags  
>>> were
>>> Bio::AnnotationI they were instantiated regardless with an undef
>>> value.  When attempting to call an undef tag with get_tag_values() I
>>> get:
>>>
>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>> MSG: asking for tag value that does not exist signalPeptideLength
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw /Users/cjfields/src/featann_rollback/
>>> bioperl-live/blib/lib/Bio/Root/Root.pm:357
>>> STACK: Bio::SeqFeature::Generic::get_tag_values /Users/cjfields/src/
>>> featann_rollback/bioperl-live/blib/lib/Bio/SeqFeature/Generic.pm:499
>>> STACK: t/targetp.t:189
>>> -----------------------------------------------------------
>>>
>>> I personally think of this as a feature (why set a tag at all if it
>>> is undef?).  However, are there any circumstances where we might  
>>> want
>>> this behavior?  Do we want to simply return w/o a value if a tag  
>>> name
>>> isn't found (i.e. remove the exception)?
>>>
>>> chris
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>
>>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Sat Aug 25 22:12:49 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 25 Aug 2007 17:12:49 -0500
Subject: [Bioperl-l] Feature/Annotation rollback(update)
Message-ID: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>

I have finished rolling back most of the specific changes made prior  
to the 1.5 release and have relevant tests passing:

http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round

Operator overloading of Bio::Annotation objects will be trickier to  
debug as tons of tests fail when the overloading is removed:

http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round

I'll start looking into fixes.  I don't like overloads from a  
personal standpoint (problems w/ long-term code maintenance), but was  
there a more specific reason for removing them?

chris


From hlapp at gmx.net  Sun Aug 26 04:58:46 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 26 Aug 2007 00:58:46 -0400
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
Message-ID: <3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>

The reason was to provide for backward compatibility with the  
original API in which tag values were scalars, not objects. The idea  
was that if someone relied on that and treats the object as a scalar  
(comparison, printing, etc), the operator overloading would take care  
of that.

So by going back to the original API the overloading should become  
obsolete, at least theoretically.

The overloading can cause some very subtle issues that I pointed out  
in an earlier email. It's one of those really "clever" tricks that  
just make it very hard for newcomers to understand what's going on in  
their code.

	-hilmar

On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:

> I have finished rolling back most of the specific changes made prior
> to the 1.5 release and have relevant tests passing:
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>
> Operator overloading of Bio::Annotation objects will be trickier to
> debug as tons of tests fail when the overloading is removed:
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>
> I'll start looking into fixes.  I don't like overloads from a
> personal standpoint (problems w/ long-term code maintenance), but was
> there a more specific reason for removing them?
>
> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From n.haigh at sheffield.ac.uk  Sun Aug 26 07:35:36 2007
From: n.haigh at sheffield.ac.uk (Nathan S. Haigh)
Date: Sun, 26 Aug 2007 08:35:36 +0100
Subject: [Bioperl-l] please some help
In-Reply-To: <20070823233044.BJQ45014@mailstore2.fiu.edu>
References: <20070823233044.BJQ45014@mailstore2.fiu.edu>
Message-ID: <46D12D48.8080301@sheffield.ac.uk>

mcons004 at fiu.edu wrote:
>   Hello,
>      I am new to this software and I am having some trouble starting. The version of Bioperl I am working on is v5.8.6. My OS is Unix (Mac OS X). I am trying to use Bioperl with a file called blastParser to process a file which is the output of a "blastall" operation.
>   
>  The command that gives me an error is:
>> perl blastParser.pl junk.out 1 1 1.0
>  and the error message says:
> Can't locate Bio/SearchIO.pm in @INC (@INC contains: /System/Library/Perl/5.8.6/darwin-thread-multi-2level
> 
>  Your online info says this probably means that the module Bio::SearchIO.pm is not installed, and that I can either install Bundle::Bioperl or install that specific module by hand. Could you give me some tips on this? I am new to working with Unix and Bioperl, so I am a little confused. Any information will be helpful to me. Thanks
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

 From what you have said, it appears you need some basic info to 
understand what you are trying to achieve.

The Perl programming language requires the Perl interpreter in order to 
execute a Perl script. The Perl interpreter is usually installed as 
standard with Unix/Linux based Operating Systems. The version you 
mention (5.8.6) will not be the version of Bioperl but the version of 
the Perl interpreter you have installed - you can check this by typing 
"perl -v" at a command prompt.

Given your apparent lack of understanding of the Unix OS, it is likely 
that you don't have Bioperl installed. You should have a look at:
http://www.bioperl.org/wiki/Getting_BioPerl#Mac_OS_X_using_fink
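
To confirm which of these applies on your machine, a small script along
these lines may help. It is only a sketch using core Perl;
module_available() is a made-up helper name, not part of Bioperl.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded from @INC.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Bio::SearchIO -> Bio/SearchIO.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# $] holds the version of the Perl interpreter, not of Bioperl.
print "perl version: $]\n";

for my $module (qw(strict Bio::SearchIO)) {
    printf "%-15s %s\n", $module,
        module_available($module) ? 'found' : 'not found in @INC';
}

# The directories Perl searches for modules.
print "search path:\n";
print "  $_\n" for @INC;
```

If Bio::SearchIO is reported as not found, Bioperl is either not installed
or installed somewhere outside the directories listed in the search path.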

Nath


From cjfields at uiuc.edu  Sun Aug 26 19:22:24 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 26 Aug 2007 14:22:24 -0500
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
Message-ID: <B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>

I managed to find your comments (as well as ones from Ewan, Jason,  
and a few others) on the mail list archives, so I'll link to them.   
The problem will be fixing the several places where overloading is  
assumed but no longer exists (i.e. in write_* methods), but we can  
probably pinpoint those by throwing or warning when overloading is  
assumed.

My thought is to either modify as_text() or add a new display_text()  
method to all AnnotationI that explicitly does what the overloading  
implied (print the annotation in a specified or assumed way).  We  
could then delegate to that in the stringification overload (with  
appropriate deprecation warnings) until 1.6, where we remove it  
completely.  Something like:

my $link1 = Bio::Annotation::DBLink->new(-database => 'TSC',
                                         -primary_id => 'TSC0000030',
                                         -tagname => 'tag2');

# either
print $link1->display_text(),"\n";
# or ...
print $link1->as_text("display"),"\n";
# prints "TSC:TSC0000030"

# default human readable
print $link1->as_text(),"\n";
# prints "Direct database link to TSC0000030 in database TSC"

print "$link1\n";
# gets a deprecation warning for now, removed completely for 1.6
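
The delegation itself might look roughly like the following. This is a
sketch only: Local::DBLink is a hypothetical stand-in class rather than
Bio::Annotation::DBLink, and the wording of the warning is made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in class, NOT Bio::Annotation::DBLink itself.
{
    package Local::DBLink;

    use overload
        '""' => sub {
            my ($self) = @_;
            # Report where the stringification happened so that callers
            # still relying on the overload can be pinpointed.
            my (undef, $file, $line) = caller;
            warn "stringifying an annotation is deprecated; "
               . "call display_text() instead (at $file line $line)\n";
            return $self->display_text;
        },
        fallback => 1;

    sub new {
        my ($class, %args) = @_;
        return bless {
            database   => $args{-database},
            primary_id => $args{-primary_id},
        }, $class;
    }

    # Explicit replacement for what the overload used to do implicitly.
    sub display_text {
        my ($self) = @_;
        return join ':', $self->{database}, $self->{primary_id};
    }
}

my $link = Local::DBLink->new(-database => 'TSC', -primary_id => 'TSC0000030');
print $link->display_text, "\n";    # explicit call, no warning

# Collect warnings so the deprecation notice can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $text = "$link";                 # overloaded: warns, then delegates
print "$text (", scalar(@warnings), " warning)\n";
```

Keeping the overload as a thin wrapper means the formatting logic lives in
one place, so the overload can be dropped later without touching it.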

chris

On Aug 25, 2007, at 11:58 PM, Hilmar Lapp wrote:

> The reason was to provide for backward compatibility with the  
> original API in which tag values were scalars, not objects. The  
> idea was that if someone relied on that and treats the object as a  
> scalar (comparison, printing, etc), the operator overloading would  
> take care of that.
>
> So by going back to the original API the overloading should become  
> obsolete, at least theoretically.
>
> The overloading can cause some very subtle issues that I pointed  
> out in an earlier email. It's one of those really "clever" tricks  
> that just make it very hard for newcomers to understand what's  
> going on in their code.
>
> 	-hilmar
>
> On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:
>
>> I have finished rolling back most of the specific changes made prior
>> to the 1.5 release and have relevant tests passing:
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>>
>> Operator overloading of Bio::Annotation objects will be trickier to
>> debug as tons of tests fail when the overloading is removed:
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>>
>> I'll start looking into fixes.  I don't like overloads from a
>> personal standpoint (problems w/ long-term code maintenance), but was
>> there a more specific reason for removing them?
>>
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Sun Aug 26 20:57:37 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 26 Aug 2007 16:57:37 -0400
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
	<B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
Message-ID: <503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>

The thing that I actually never quite understood (and predates the  
API changes) is why $ann->as_text() needs to include explanatory text  
such as 'Direct database link to blah in database foo.' I would have  
said that "TSC:TSC0000030" is human readable enough, unless you  
present it without any context so that one would have no clue that it  
is a database cross-reference.

The as_text() method shouldn't be meant for the sole purpose of
debugging annotation collections. However, I'm not sure what else
you could use it for, given that there are no guidelines for what to
expect.

In fact, I do use as_text() a lot for a real purpose, namely as a  
surrogate unique key. For example, making a collection of dblinks  
unique is quite simple using the as_text() method:

	my %dbhash = map { ($_->as_text(), $_) } $anncoll->remove_Annotations('dblink');
	$anncoll->add_Annotation('dblink',$_) foreach (values %dbhash);

This is a common task when harvesting annotation from various places  
and then integrating it. However, there is nothing in the API  
documentation that suggests that this might be a reliable or even  
expected property such that you could omit the 'dblink' tag above.
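
If the key format were spelled out rather than left to as_text(), the same
idea could be sketched like this. Everything here is hypothetical: the
records are plain hashes standing in for DBLink annotations, the second
accession is invented, and unique_key() is a made-up helper. Unlike the
hash-based one-liner above, this variant keeps the first occurrence and
preserves the original order.

```perl
use strict;
use warnings;

# Hypothetical records standing in for DBLink annotations.
my @links = (
    { database => 'TSC',     primary_id => 'TSC0000030' },
    { database => 'GenBank', primary_id => 'X12345' },      # made-up accession
    { database => 'TSC',     primary_id => 'TSC0000030' },  # duplicate
);

# Hypothetical key builder with an explicit, documented format.
sub unique_key {
    my ($link) = @_;
    return join ':', map { $link->{$_} } qw(database primary_id);
}

# Keep the first record seen for each key, in input order.
my (%seen, @unique);
for my $link (@links) {
    push @unique, $link unless $seen{ unique_key($link) }++;
}

print unique_key($_), "\n" for @unique;
```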

I agree that having a conceptual equivalent to $feature->display_name  
and $seq->display_id would be good, but these methods have no claim  
to returning something that's unique in any way.

I guess I've now raised more questions than I answered (in fact I  
didn't answer any). Sorry 'bout that.

	-hilmar

On Aug 26, 2007, at 3:22 PM, Chris Fields wrote:

> I managed to find your comments (as well as ones from Ewan, Jason,  
> and a few others) on the mail list archives, so I'll link to them.   
> The problem will be fixing the several places where overloading is  
> assumed but no longer exists (i.e. in write_* methods), but we can  
> probably pinpoint those by throwing or warning when overloading is  
> assumed.
>
> My thought is to either modify as_text() or add a new display_text()
> method to all AnnotationI that explicitly does what the
> overloading implied (print the annotation in a specified or assumed  
> way).  We could then delegate to that in the stringification  
> overload (with appropriate deprecation warnings) until 1.6, where  
> we remove it completely.  Something like:
>
> my $link1 = Bio::Annotation::DBLink->new(-database => 'TSC',
>                                         -primary_id => 'TSC0000030',
>                                         -tagname => 'tag2');
>
> # either
> print $link1->display_text(),"\n";
> # or ...
> print $link1->as_text("display"),"\n";
> # prints "TSC:TSC0000030"
>
> # default human readable
> print $link1->as_text(),"\n";
> # prints "Direct database link to TSC0000030 in database TSC"
>
> print "$link1\n";
> # gets a deprecation warning for now, removed completely for 1.6
>
> chris
>
> On Aug 25, 2007, at 11:58 PM, Hilmar Lapp wrote:
>
>> The reason was to provide for backward compatibility with the  
>> original API in which tag values were scalars, not objects. The  
>> idea was that if someone relied on that and treats the object as a  
>> scalar (comparison, printing, etc), the operator overloading would  
>> take care of that.
>>
>> So by going back to the original API the overloading should become  
>> obsolete, at least theoretically.
>>
>> The overloading can cause some very subtle issues that I pointed  
>> out in an earlier email. It's one of those really "clever" tricks  
>> that just make it very hard for newcomers to understand what's  
>> going on in their code.
>>
>> 	-hilmar
>>
>> On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:
>>
>>> I have finished rolling back most of the specific changes made prior
>>> to the 1.5 release and have relevant tests passing:
>>>
>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>>>
>>> Operator overloading of Bio::Annotation objects will be trickier to
>>> debug as tons of tests fail when the overloading is removed:
>>>
>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>>>
>>> I'll start looking into fixes.  I don't like overloads from a
>>> personal standpoint (problems w/ long-term code maintenance), but  
>>> was
>>> there a more specific reason for removing them?
>>>
>>> chris
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sun Aug 26 22:47:41 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 26 Aug 2007 17:47:41 -0500
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
	<B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
	<503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>
Message-ID: <E0A389DE-3399-4439-9AC2-76319CCD5B84@uiuc.edu>

Either way I implement, it would be used simply as a generic  
convenience method to replicate output via stringification  
overloading, using a common method name for all AnnotationI; there  
seem to be several instances where this is used for generating output  
(i.e. SeqIO::genbank).  So, for instance, when formatting output you  
could just call as_text('display') or display_text() and you would  
get the most common formatting for that particular annotation type.

chris

On Aug 26, 2007, at 3:57 PM, Hilmar Lapp wrote:

> The thing that I actually never quite understood (and predates the  
> API changes) is why $ann->as_text() needs to include explanatory  
> text such as 'Direct database link to blah in database foo.' I  
> would have said that "TSC:TSC0000030" is human readable enough,  
> unless you present it without any context so that one would have no  
> clue that it is a database cross-reference.
>
> The as_text() method shouldn't be meant for the sole purpose of
> debugging annotation collections. However, I'm not sure what
> else you could use it for, given that there are no guidelines for
> what to expect.
>
> In fact, I do use as_text() a lot for a real purpose, namely as a  
> surrogate unique key. For example, making a collection of dblinks  
> unique is quite simple using the as_text() method:
>
> 	my %dbhash = map { ($_->as_text(), $_) } $anncoll->remove_Annotations('dblink');
> 	$anncoll->add_Annotation('dblink',$_) foreach (values %dbhash);
>
> This is a common task when harvesting annotation from various  
> places and then integrating it. However, there is nothing in the  
> API documentation that suggests that this might be a reliable or  
> even expected property such that you could omit the 'dblink' tag  
> above.
>
> I agree that having a conceptual equivalent to
> $feature->display_name and $seq->display_id would be good, but these methods
> have no claim to returning something that's unique in any way.
>
> I guess I've now raised more questions than I answered (in fact I  
> didn't answer any). Sorry 'bout that.
>
> 	-hilmar
>
> On Aug 26, 2007, at 3:22 PM, Chris Fields wrote:
>
>> I managed to find your comments (as well as ones from Ewan, Jason,  
>> and a few others) on the mail list archives, so I'll link to  
>> them.  The problem will be fixing the several places where  
>> overloading is assumed but no longer exists (i.e. in write_*  
>> methods), but we can probably pinpoint those by throwing or  
>> warning when overloading is assumed.
>>
>> My thought is to either modify as_text() or add a new display_text()
>> method to all AnnotationI that explicitly does what the
>> overloading implied (print the annotation in a specified or  
>> assumed way).  We could then delegate to that in the  
>> stringification overload (with appropriate deprecation warnings)  
>> until 1.6, where we remove it completely.  Something like:
>>
>> my $link1 = Bio::Annotation::DBLink->new(-database => 'TSC',
>>                                         -primary_id => 'TSC0000030',
>>                                         -tagname => 'tag2');
>>
>> # either
>> print $link1->display_text(),"\n";
>> # or ...
>> print $link1->as_text("display"),"\n";
>> # prints "TSC:TSC0000030"
>>
>> # default human readable
>> print $link1->as_text(),"\n";
>> # prints "Direct database link to TSC0000030 in database TSC"
>>
>> print "$link1\n";
>> # gets a deprecation warning for now, removed completely for 1.6
>>
>> chris
>>
>> On Aug 25, 2007, at 11:58 PM, Hilmar Lapp wrote:
>>
>>> The reason was to provide for backward compatibility with the  
>>> original API in which tag values were scalars, not objects. The  
>>> idea was that if someone relied on that and treats the object as  
>>> a scalar (comparison, printing, etc), the operator overloading  
>>> would take care of that.
>>>
>>> So by going back to the original API the overloading should  
>>> become obsolete, at least theoretically.
>>>
>>> The overloading can cause some very subtle issues that I pointed  
>>> out in an earlier email. It's one of those really "clever" tricks  
>>> that just make it very hard for newcomers to understand what's  
>>> going on in their code.
>>>
>>> 	-hilmar
>>>
>>> On Aug 25, 2007, at 6:12 PM, Chris Fields wrote:
>>>
>>>> I have finished rolling back most of the specific changes made  
>>>> prior
>>>> to the 1.5 release and have relevant tests passing:
>>>>
>>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#First_round
>>>>
>>>> Operator overloading of Bio::Annotation objects will be trickier to
>>>> debug as tons of tests fail when the overloading is removed:
>>>>
>>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Second_round
>>>>
>>>> I'll start looking into fixes.  I don't like overloads from a
>>>> personal standpoint (problems w/ long-term code maintenance),  
>>>> but was
>>>> there a more specific reason for removing them?
>>>>
>>>> chris
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From hlapp at gmx.net  Sun Aug 26 23:01:03 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sun, 26 Aug 2007 19:01:03 -0400
Subject: [Bioperl-l] Feature/Annotation rollback(update)
In-Reply-To: <E0A389DE-3399-4439-9AC2-76319CCD5B84@uiuc.edu>
References: <CECA0A27-EABD-44A8-8C6C-9AC666270437@uiuc.edu>
	<3BC5C775-0062-4B02-A929-D2D3F8FDD768@gmx.net>
	<B2C61BB2-E4B8-4902-BB86-48F3457DF9EB@uiuc.edu>
	<503E47B9-EB4E-4442-8A56-D1513489EEA3@gmx.net>
	<E0A389DE-3399-4439-9AC2-76319CCD5B84@uiuc.edu>
Message-ID: <35BBCF3B-BA1B-4C8D-8753-2A27AB3B423C@gmx.net>


On Aug 26, 2007, at 6:47 PM, Chris Fields wrote:

> just call as_text('display') or display_text()

The latter is more obvious, and can be better tested for presence and  
implementation, though in the world of perl that's of course not  
strictly true.
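
As a minimal sketch of what "testing for presence" could look like: My::DBLink
below is a hypothetical stand-in, not the real Bio::Annotation::DBLink, and
display_text() is only the method proposed in this thread. A caller could probe
with can() before relying on the new method:

```perl
use strict;
use warnings;

# Hypothetical stand-in class, for illustration only.
package My::DBLink;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Compact form, as proposed in the thread: "database:primary_id".
sub display_text {
    my $self = shift;
    return "$self->{database}:$self->{primary_id}";
}

# Longer human-readable form.
sub as_text {
    my $self = shift;
    return "Direct database link to $self->{primary_id} in database $self->{database}";
}

package main;

# Prefer display_text() when the object provides it, else fall back to as_text().
sub show {
    my $ann = shift;
    return $ann->can('display_text') ? $ann->display_text : $ann->as_text;
}

my $link = My::DBLink->new(database => 'TSC', primary_id => 'TSC0000030');
print show($link), "\n";    # prints "TSC:TSC0000030"
```

Note that can() only sees methods that are actually defined, so a class that
answers display_text() through AUTOLOAD would not be detected this way.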

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From zeroliu at 163.com  Mon Aug 27 11:49:53 2007
From: zeroliu at 163.com (zeroliu)
Date: Mon, 27 Aug 2007 19:49:53 +0800 (CST)
Subject: [Bioperl-l] Problems of parse emboss water result by Bio::AlignIO
Message-ID: <534546299.525411188215393753.JavaMail.coremail@bj163app118.163.com>

 Hello,
I'm trying to parse water (EMBOSS 5.0.0) output with Bio::AlignIO
(BioPerl 1.4) and have run into some problems.
1. What does Bio::AlignIO->next_aln() return?
Does it return a Bio::Align::AlignI or a Bio::SimpleAlign object?
Or does it depend on the alignment file format?
2. How can I get the "score" property from a water alignment result?
There is a score method in Bio::SimpleAlign but not in Bio::AlignIO.
In 2004, Jason mentioned:
Scores are set by the Alignment parser - we separate the 'running' from
the 'parsing'.
Bio::AlignIO::emboss had to be updated.
(http://article.gmane.org/gmane.comp.lang.perl.bio.general/7156/match=alignio+water)
How can I find out whether that update was made?
Thank you very much!


From cjfields at uiuc.edu  Mon Aug 27 17:13:13 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 12:13:13 -0500
Subject: [Bioperl-l] Bio::SeqFeature::Annotated status
Message-ID: <6DC5ECA8-3DF1-4B84-914C-4F2B3B44E29A@uiuc.edu>

What is the current status on maintenance of  
Bio::SeqFeature::Annotated?  From what I gather (based on the code  
and past mail list posts) the intent of the module seems to be to  
store any SeqFeature-specific data (tags, score, source, primary_tag,  
etc) in a Bio::AnnotationCollectionI as strongly typed data.  However  
there are several inconsistencies, such as objects being returned  
when a string is expected (score(), source()).

Also, several methods appear half-implemented, aren't consistent with
the SeqFeatureI API or with similar methods in other SeqFeatureI
implementations, and there are no docs explaining what is expected.
If no one speaks up on it, I'll do my best to maintain it
myself, but don't expect the API to stay as it is.

chris


From cjfields at uiuc.edu  Mon Aug 27 22:31:01 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 17:31:01 -0500
Subject: [Bioperl-l] Bio::Ontology::Term (rollback question)
Message-ID: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>

This is related to the ongoing Feature/Annotation rollback.  I have  
found that a few Ontology-related modules are (either directly or  
indirectly) passing strings instead of Bio::Annotation::DBLinks to  
Bio::Ontology::Term::new(), add_dblink(), or add_dblink_context()  
(the last is where the error occurs).

If needed we could allow strings to be passed but this isn't  
consistent with the API.  Any thoughts on what to do here?

chris


From hlapp at gmx.net  Mon Aug 27 23:07:12 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 27 Aug 2007 19:07:12 -0400
Subject: [Bioperl-l] Bio::Ontology::Term (rollback question)
In-Reply-To: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>
References: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>
Message-ID: <01A56BFB-DE36-4C95-9BD3-DB35A706BD87@gmx.net>

The B::O::TermI interface actually says that get_dblinks() would  
return scalars. That's why the add_dblink methods accept strings. I  
also agree that this is inconsistent with the rest of BioPerl.

Oddly enough, Term::add_dblink_context() does ask for DBLink objects,  
though it doesn't seem to be enforced, even though  
Term::get_dblink_context() is advertised as returning scalars.

So it does seem this is messed up design-wise. It seems to me that to  
really fix this would inevitably break the API, and I don't see how  
you would make this backwards compatible w/o creating a lot of messy  
code, the sole purpose of which would be backwards compatibility.

One could only fix Term::add_dblink_context() as it's not in the  
interface but that wouldn't contribute anything to improving  
consistency.

So the alternative to breaking the API in a non-backwards compatible  
fashion would be to add to it, map the existing dblink methods onto  
the added ones, and start deprecating them. For example, you could  
add methods get_dbxrefs() (also on the interface), add_dbxref(),  
etc., and build in a context argument so we don't need another set
of methods for that. They would accept and return DBLink objects, and  
the get_dblink() methods could be changed to map those to scalars  
while also getting slated for deprecation.
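
A rough sketch of that mapping, under stated assumptions: My::Term and
My::DBLink are hypothetical stand-ins (not Bio::Ontology::Term or
Bio::Annotation::DBLink), and the '_default' context name and the argument
order are invented for illustration. The new methods take and return objects
plus an optional context, while the old ones become thin deprecated wrappers:

```perl
use strict;
use warnings;

package My::DBLink;    # hypothetical stand-in for a DBLink-like object
sub new          { my ($class, $id) = @_; return bless { id => $id }, $class }
sub display_text { return $_[0]{id} }

package My::Term;      # hypothetical stand-in for a term class

sub new { my $class = shift; return bless { dbxrefs => {} }, $class }

# New-style methods: accept and return objects, with an optional context.
sub add_dbxref {
    my ($self, $xref, $context) = @_;
    $context = '_default' unless defined $context;
    push @{ $self->{dbxrefs}{$context} }, $xref;
    return;
}

sub get_dbxrefs {
    my ($self, $context) = @_;
    $context = '_default' unless defined $context;
    return @{ $self->{dbxrefs}{$context} || [] };
}

# Old-style methods: deprecated wrappers that map onto the new ones.
sub add_dblink {
    my ($self, @ids) = @_;
    warn "add_dblink() is deprecated; use add_dbxref()\n";
    $self->add_dbxref(My::DBLink->new($_)) for @ids;
    return;
}

sub get_dblinks {
    my $self = shift;
    warn "get_dblinks() is deprecated; use get_dbxrefs()\n";
    return map { $_->display_text } $self->get_dbxrefs;
}

package main;

my $term = My::Term->new;
{
    local $SIG{__WARN__} = sub { };    # silence deprecation warnings for the demo
    $term->add_dblink('GO:0008150');
    print join(',', $term->get_dblinks), "\n";    # old API still returns scalars
}
print scalar($term->get_dbxrefs), "\n";           # new API sees one stored object
```

Storing only objects internally keeps the backward-compatibility code confined
to the two wrapper methods, which can then be deleted when the deprecation
period ends.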

Does this make sense?

	-hilmar

On Aug 27, 2007, at 6:31 PM, Chris Fields wrote:

> This is related to the ongoing Feature/Annotation rollback.  I have
> found that a few Ontology-related modules are (either directly or
> indirectly) passing strings instead of Bio::Annotation::DBLinks to
> Bio::Ontology::Term::new(), add_dblink(), or add_dblink_context()
> (the last is where the error occurs).
>
> If needed we could allow strings to be passed but this isn't
> consistent with the API.  Any thoughts on what to do here?
>
> chris

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Tue Aug 28 01:12:35 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 20:12:35 -0500
Subject: [Bioperl-l] Bio::Ontology::Term (rollback question)
In-Reply-To: <01A56BFB-DE36-4C95-9BD3-DB35A706BD87@gmx.net>
References: <C16195C4-9339-409B-9D13-2A447E0C866C@uiuc.edu>
	<01A56BFB-DE36-4C95-9BD3-DB35A706BD87@gmx.net>
Message-ID: <EF121F1E-BAA0-49BD-830F-1F3BC6FAC807@uiuc.edu>


On Aug 27, 2007, at 6:07 PM, Hilmar Lapp wrote:

> The B::O::TermI interface actually says that get_dblinks() would  
> return scalars. That's why the add_dblink methods accept strings. I  
> also agree that this is inconsistent with the rest of BioPerl.
>
> Oddly enough, Term::add_dblink_context() does ask for DBLink  
> objects, though it doesn't seem to be enforced, even though  
> Term::get_dblink_context() is advertised as returning scalars.

This happened b/c of stringification and 'eq' overloading.  Just  
removing the overloads didn't reveal this problem; I had to add  
exceptions to them to fish this out.
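
For anyone wondering what "adding exceptions" to an overload looks like, here
is a minimal sketch. My::Annotation is a hypothetical class, not one of the
Bio::Annotation modules, and the messages are invented; the idea is only that
the overload handlers stay in place but throw, so any code that still relies
on implicit stringification or 'eq' fails loudly at the calling line:

```perl
use strict;
use warnings;

package My::Annotation;    # hypothetical class, for illustration only
use Carp ();
use overload
    '""' => sub { Carp::croak('stringification overload is no longer supported') },
    'eq' => sub { Carp::croak('eq overload is no longer supported') };

sub new   { my ($class, $value) = @_; return bless { value => $value }, $class }
sub value { return $_[0]{value} }

package main;

my $ann = My::Annotation->new('GO:0008150');

# Explicit method call: works as before.
print $ann->value, "\n";

# Implicit string use: now raises an exception reported at the caller.
my $ok = eval { my $text = "$ann"; 1 };
print "caught: $@" unless $ok;
```

Simply deleting the overloads would instead let Perl fall back to its default
stringification (something like My::Annotation=HASH(0x...)), which is silent
and easy to miss in test output.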

> So it does seem this is messed up design-wise. It seems to me that  
> to really fix this would inevitably break the API, and I don't see  
> how you would make this backwards compatible w/o creating a lot of  
> messy code, the sole purpose of which would be backwards  
> compatibility.
>
> One could only fix Term::add_dblink_context() as it's not in the  
> interface but that wouldn't contribute anything to improving  
> consistency.

Agreed; in fact it may make it more confusing.

> So the alternative to breaking the API in a non-backwards  
> compatible fashion would be to add to it, map the existing dblink  
> methods onto the added ones, and start deprecating them. For  
> example, you could add methods get_dbxrefs() (also on the  
> interface), add_dbxref(), etc., and build in a context argument so
> we don't need another set of methods for that. They would accept  
> and return DBLink objects, and the get_dblink() methods could be  
> changed to map those to scalars while also getting slated for  
> deprecation.
>
> Does this make sense?
>
> 	-hilmar

I think so; I'll have to look over the code to see how we would  
implement this, though I'm guessing everything would be stored as  
DBLink objects by default.  Any changes will probably need to wait  
until after I fish out any remaining spots in the code where  
overloading is being used, but at least we have a direction on where  
to go.

chris


From cjfields at uiuc.edu  Tue Aug 28 04:18:19 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 27 Aug 2007 23:18:19 -0500
Subject: [Bioperl-l] Feature/Annotation rollback (update #2)
Message-ID: <A91DD20B-841B-480A-A953-E811AD634AF0@uiuc.edu>

Okay, the planned rollback is pretty much complete with a few
exceptions.  I'll probably merge back to bioperl-live within the next  
few days once the following issues are addressed:

1)  Bio::Ontology::Term - several classes are using
Bio::Ontology::Term in ways inconsistent with one another; some are
passing Bio::Annotation::DBLink instances and others are passing
simple strings.  The various operator overloads masked this, but now
the inconsistencies have really come to the surface.  I'll
probably work on Hilmar's suggestion of adding extra class methods to
give it a more consistent interface and deprecate the older ones.  As
one might guess this affects much of Bio::Ontology but also
Bio::SeqFeature::Annotated; strangely enough FeatureIO tests pass
(which may simply mean there isn't enough test coverage for FeatureIO).

2)  Bio::SeqFeature::Annotated - no word back on maintenance for this  
module.  It needs to implement Bio::SeqFeature::TypedSeqFeatureI  
(pretty easy) and needs documentation (not so easy).  It's apparently  
essential for FeatureIO.  I'll basically get it up-and-running and  
clean up the API.

There are a few odds and ends that need to be addressed with  
roundtripping, but these are already problems on the MAIN trunk so  
they will be addressed once code is merged back in.

chris


From Frigerio at pierroton.inra.fr  Tue Aug 28 07:12:22 2007
From: Frigerio at pierroton.inra.fr (Jean-Marc FRIGERIO)
Date: Tue, 28 Aug 2007 09:12:22 +0200
Subject: [Bioperl-l] Bio::SeqIO::phd_comment objet
Message-ID: <200708280912.22798.Frigerio@pierroton.inra.fr>

Hi,

The Bio::SeqIO::phd module says, speaking about the COMMENT section of a phd
file:
  # this should be an actual object to assist in serialization
  # but I don't have time for this now.

The doc says (http://www.bioperl.org/wiki/Core_1.5.1_1.5.2_delta):

   This really needs a "phred_comments" object of some sort so that it will be 
serializable. Then when java clients get this object they will be able to 
deserialize it. 

I volunteer to do this,  but need your opinion.

Do we really need an object (Bio::phd_comment? Bio::SeqIO::phd_comment?
Bio::phd_header? Something else?)

Or would adding a few Bio::Seq::SeqWithQuality subs to the Bio::SeqIO::phd
module suffice? What are the constraints of serialization/deserialization
for the Java clients?
I was thinking of just adding getters/setters for all the comments:
chromat_file(), abi_thumbprint(), etc.

touch() for the timestamp
attribute() for new unknown comments
write_comment().

others ?
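
To make the accessor idea concrete, here is a rough sketch of what such an
object might look like. Everything in it is an assumption for discussion:
PhdComment is a made-up package name, timestamp() is an invented accessor
behind touch(), and the BEGIN_COMMENT/END_COMMENT rendering only mimics the
general layout of a phd comment block rather than reproducing what
Bio::SeqIO::phd writes.

```perl
use strict;
use warnings;

package PhdComment;    # hypothetical name, not an existing BioPerl class

# Known fields and the keys they are written under.
my %KEY_FOR = (
    chromat_file   => 'CHROMAT_FILE',
    abi_thumbprint => 'ABI_THUMBPRINT',
    timestamp      => 'TIME',
);

# Generate one get/set accessor per known field.
for my $field (keys %KEY_FOR) {
    no strict 'refs';
    *{"PhdComment::$field"} = sub {
        my $self = shift;
        $self->{$field} = shift if @_;
        return $self->{$field};
    };
}

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# touch(): set the timestamp to the current local time.
sub touch {
    my $self = shift;
    return $self->timestamp(scalar localtime);
}

# attribute(): get/set comment fields that have no dedicated accessor.
sub attribute {
    my ($self, $name, @value) = @_;
    $self->{extra}{$name} = $value[0] if @value;
    return $self->{extra}{$name};
}

# write_comment(): render known fields first, then unknown ones, as text.
sub write_comment {
    my $self  = shift;
    my @lines = ('BEGIN_COMMENT');
    for my $field (sort keys %KEY_FOR) {
        push @lines, "$KEY_FOR{$field}: $self->{$field}" if defined $self->{$field};
    }
    my $extra = $self->{extra} || {};
    push @lines, map { "$_: $extra->{$_}" } sort keys %$extra;
    return join("\n", @lines, 'END_COMMENT') . "\n";
}

package main;

my $comment = PhdComment->new(chromat_file => 'trace1.ab1');
$comment->attribute(MY_TAG => 'some value');    # an "unknown" comment
$comment->touch;
print $comment->write_comment;
```

A plain hash-based object like this should be straightforward to serialize,
but whether that meets the needs of the Java clients is exactly the open
question above.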

		-- jmf

-- 
Jean-Marc Frigerio,
UMR BIOGECO   69, route d'Arcachon, 33612 CESTAS France
Tel : +33(0) 557 122 829   Fax : +33(0) 557 122 881
Frigerio at pierroton.inra.fr   http://www.pierroton.inra.fr/biogeco/index.html


From jay at jays.net  Tue Aug 28 11:14:37 2007
From: jay at jays.net (Jay Hannah)
Date: Tue, 28 Aug 2007 06:14:37 -0500
Subject: [Bioperl-l] Problems of parse emboss water result by
	Bio::AlignIO
In-Reply-To: <534546299.525411188215393753.JavaMail.coremail@bj163app118.163.com>
References: <534546299.525411188215393753.JavaMail.coremail@bj163app118.163.com>
Message-ID: <4CD8B5C2-3C87-495C-894E-17C3C67091DA@jays.net>

On Aug 27, 2007, at 6:49 AM, zeroliu wrote:
> I'm trying to parse water (EMBOSS 5.0.0) result by Bio::AlignIO
> (Bioperl-1.4) and encountered some problems.
> 1. What does the Bio::AlignIO->next_aln() return?
> Does it return a Bio::Align::AlignI or Bio::SimpleAlign object?
> Or it depends on the alignment file format?

http://doc.bioperl.org/bioperl-live/Bio/AlignIO.html
  Title   : next_aln
  Usage   : $aln = stream->next_aln
  Function: reads the next $aln object from the stream
  Returns : a Bio::Align::AlignI compliant object

> 2. How can I get the "score" property in a water alignment result?
> There is a score method in Bio::SimpleAlign but not in Bio::AlignIO.
> In 2004, Jason mentioned:
> Scores are set by the Alignment parser - we separate the 'running'
> from the 'parsing'.
> Bio::AlignIO::emboss had to be updated.
> (http://article.gmane.org/gmane.comp.lang.perl.bio.general/7156/match=alignio+water)
> How could I know it?

Line 480 of t/AlignIO.t seems to walk you through? Here's the block,  
with the test overhead removed.

# EMBOSS water
use Bio::AlignIO;

my $str = Bio::AlignIO->new(-format => 'emboss',
                            -file   => 'cysprot.water');
my $aln = $str->next_aln();
# $aln is now a Bio::Align::AlignI-compliant object
print $aln->score, "\n";    # '501.50'

HTH,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


From cjfields at uiuc.edu  Tue Aug 28 21:05:10 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 28 Aug 2007 16:05:10 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
Message-ID: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>

I'm now wrapping up the Feature/Annotation rollback.  I will probably  
start merging back to the main branch in the next day or two, as
soon as interested parties (*cough*devs*cough*) look over the last  
batch of changes.

http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round

I have also added a small benchmark test which indicates a decrease  
in parsing time in SeqIO::genbank with all tests passing.  I expect  
this will translate over to any Bio::SeqFeature::Generic-using class  
(open mouth, prepare to insert foot....).

It is also possible there are still some instances where overloading  
is expected lurking about in the ~1000 or so modules, so I'll leave  
the exceptions I added to all Bio::AnnotationI; we can remove them  
down the line, maybe prior to rel1.6, after more tests are added or  
if they get particularly annoying.  My guess is I caught 99.99% of  
them (prepare to insert other foot....).

The key change in this last round is the addition of several class  
*dbxref* methods to Bio::Ontology::Term and  
Bio::Annotation::OntologyTerm, all of which are capable of working  
with either DBLink instances or simple scalars.  This was primarily  
done in order to clear up inconsistencies in the older *dblink*  
methods, which were ambiguous (some indicate simple scalar
arguments, others DBLink objects); operator overloading was used  
extensively in these cases, which led to several issues.  I have  
added deprecation warnings to the older methods which now map to  
using the newer methods.  All tests pass with the exception of a few  
already failing on the MAIN branch; the single test which needs to be  
fixed is a round-tripping error in swiss.t (now a TODO), which can be  
fixed after merging back.
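
For readers who want a picture of "works with either DBLink instances or
simple scalars", the general pattern is sketched below. This is not the code
on the branch: My::DBLink is a hypothetical stand-in for
Bio::Annotation::DBLink, to_dbxref() is an invented helper, and storing the
whole scalar as the primary_id is an arbitrary choice made for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

package My::DBLink;    # hypothetical stand-in, not Bio::Annotation::DBLink
sub new        { my ($class, %args) = @_; return bless { %args }, $class }
sub primary_id { return $_[0]{primary_id} }

package main;

# Normalize an argument that may be an object or a plain scalar.
sub to_dbxref {
    my $arg = shift;
    return $arg if blessed($arg) && $arg->isa('My::DBLink');    # already an object
    return My::DBLink->new(primary_id => $arg);                 # wrap the scalar
}

my @xrefs = map { to_dbxref($_) }
    ('GO:0008150', My::DBLink->new(primary_id => 'GO:0003674'));
print join(', ', map { $_->primary_id } @xrefs), "\n";
# prints "GO:0008150, GO:0003674"
```

Checking the argument type explicitly like this avoids any reliance on
stringification or 'eq' overloads to paper over the difference.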

Please respond to this if there are any questions or if I need to  
clarify the changes I made a bit more.

chris


From hlapp at duke.edu  Tue Aug 28 22:13:32 2007
From: hlapp at duke.edu (Hilmar Lapp)
Date: Tue, 28 Aug 2007 18:13:32 -0400
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
References: <20070828070219.DE03668527@evol.biology.mcmaster.ca>
Message-ID: <1F006707-291C-4895-A178-33FDFBDE6AE6@duke.edu>

Is anyone thinking about adding support for this as an aligner  
option? I'm not sure whether aside from a Bio::Tools::Run module we'd  
also need a format parser - it sounds like it's emitting clustalw  
format?

	-hilmar

Begin forwarded message:

> From: evoldir at evol.biology.mcmaster.ca
> Date: August 28, 2007 3:02:19 AM EDT
> To: hlapp at duke.edu
> Subject: Other:  Announcing Ngila 1.2.1 Alignment Program
> Reply-To: racartwr at ncsu.edu
>
>
> Ngila is a global, pairwise alignment program that uses logarithmic  
> and
> affine gap costs, i.e. C(g) = a+b*g+c*ln(g).  These gap costs are more
> biologically realistic than the more popular (and efficient) affine  
> gap
> cost model.
>
> I have recently completed updating the program to version 1.2.1.  The
> new version includes two new, evolutionary alignment models based  
> on my
> current research.  These models allow you to find the maximum  
> alignment
> of two sequences based on biological, evolutionary parameters---no  
> more
> guessing at biological costs.  Additional changes are noted on the  
> website.
>
> Website & Manual:
>
> http://scit.us/projects/ngila/
>
> Windows Binary:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release-win32.zip
>
> Unix/Mac Source Code:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release.tar.gz
>
> I'll be happy to answer any questions users have about the new  
> models or
> the program.
>
> -- 
> *********************************************************
> Reed A. Cartwright, PhD     http://scit.us/
> Postdoctoral Researcher     http://www.dererumnatura.us/
> Department of Genetics      http://www.pandasthumb.org/
>
> Bioinformatics Research Center
> North Carolina State University
> Campus Box 7566
> Raleigh, NC 27695-7566
>
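
For reference, the gap cost quoted in the announcement, C(g) = a+b*g+c*ln(g),
can be evaluated as in the sketch below. The parameter names and the values
10, 1 and 5 are illustrative assumptions only and are not Ngila's defaults.

```perl
use strict;
use warnings;

# Cost of a gap of length $g under C(g) = a + b*g + c*ln(g).
# $open, $extend and $log_coef play the roles of a, b and c.
sub gap_cost {
    my ($g, $open, $extend, $log_coef) = @_;
    die "gap length must be >= 1\n" if $g < 1;
    return $open + $extend * $g + $log_coef * log($g);    # log() is the natural log
}

# Illustrative parameter values only.
for my $g (1, 2, 10, 100) {
    printf "g=%3d  cost=%.3f\n", $g, gap_cost($g, 10, 1, 5);
}
```

With the logarithmic coefficient set to 0 this reduces to the ordinary affine
cost a + b*g.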

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:- hlapp at duke dot edu :
===========================================================


From hlapp at duke.edu  Tue Aug 28 22:13:32 2007
From: hlapp at duke.edu (Hilmar Lapp)
Date: Tue, 28 Aug 2007 18:13:32 -0400
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
Message-ID: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>

Is anyone thinking about adding support for this as an aligner  
option? I'm not sure whether aside from a Bio::Tools::Run module we'd  
also need a format parser - it sounds like it's emitting clustalw  
format?

	-hilmar

Begin forwarded message:

> From: evoldir at evol.biology.mcmaster.ca
> Date: August 28, 2007 3:02:19 AM EDT
> Subject: Other:  Announcing Ngila 1.2.1 Alignment Program
> Reply-To: racartwr at ncsu.edu
>
>
> Ngila is a global, pairwise alignment program that uses logarithmic  
> and
> affine gap costs, i.e. C(g) = a+b*g+c*ln(g).  These gap costs are more
> biologically realistic than the more popular (and efficient) affine  
> gap
> cost model.
>
> I have recently completed updating the program to version 1.2.1.  The
> new version includes two new, evolutionary alignment models based  
> on my
> current research.  These models allow you to find the maximum  
> alignment
> of two sequences based on biological, evolutionary parameters---no  
> more
> guessing at biological costs.  Additional changes are noted on the  
> website.
>
> Website & Manual:
>
> http://scit.us/projects/ngila/
>
> Windows Binary:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release-win32.zip
>
> Unix/Mac Source Code:
>
> http://scit.us/projects/files/ngila/Releases/ngila-release.tar.gz
>
> I'll be happy to answer any questions users have about the new  
> models or
> the program.
>
> -- 
> *********************************************************
> Reed A. Cartwright, PhD     http://scit.us/
> Postdoctoral Researcher     http://www.dererumnatura.us/
> Department of Genetics      http://www.pandasthumb.org/
>
> Bioinformatics Research Center
> North Carolina State University
> Campus Box 7566
> Raleigh, NC 27695-7566
>


From hlapp at gmx.net  Tue Aug 28 23:09:46 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 28 Aug 2007 19:09:46 -0400
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
In-Reply-To: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
References: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
Message-ID: <EF683AC3-F30C-49BC-9F16-7BA10C70F751@gmx.net>

Sorry for the double post, BTW. I had erroneously assumed that the  
first email would be held for post by non-member. -hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Wed Aug 29 04:01:13 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 28 Aug 2007 23:01:13 -0500
Subject: [Bioperl-l] Fwd: Announcing Ngila 1.2.1 Alignment Program
In-Reply-To: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
References: <E8CEAD6A-9F6B-43B8-94A3-95A1C96E872D@duke.edu>
Message-ID: <EDED724C-3219-45FF-BAF2-592EEEBCB634@uiuc.edu>

It probably wouldn't be hard to write one up, particularly if it
already emits a parsable format.  We could probably base it off the
current clustalw wrapper unless someone else thinks there is a better  
way.

chris

On Aug 28, 2007, at 5:13 PM, Hilmar Lapp wrote:

> Is anyone thinking about adding support for this as an aligner
> option? I'm not sure whether aside from a Bio::Tools::Run module we'd
> also need a format parser - it sounds like it's emitting clustalw
> format?
>
> 	-hilmar
>
> Begin forwarded message:
>
>> From: evoldir at evol.biology.mcmaster.ca
>> Date: August 28, 2007 3:02:19 AM EDT
>> Subject: Other:  Announcing Ngila 1.2.1 Alignment Program
>> Reply-To: racartwr at ncsu.edu
>>
>>
>> Ngila is a global, pairwise alignment program that uses logarithmic
>> and
>> affine gap costs, i.e. C(g) = a+b*g+c*ln(g).  These gap costs are  
>> more
>> biologically realistic than the more popular (and efficient) affine
>> gap
>> cost model.
>>
>> I have recently completed updating the program to version 1.2.1.  The
>> new version includes two new, evolutionary alignment models based
>> on my
>> current research.  These models allow you to find the maximum
>> alignment
>> of two sequences based on biological, evolutionary parameters---no
>> more
>> guessing at biological costs.  Additional changes are noted on the
>> website.
>>
>> Website & Manual:
>>
>> http://scit.us/projects/ngila/
>>
>> Windows Binary:
>>
>> http://scit.us/projects/files/ngila/Releases/ngila-release-win32.zip
>>
>> Unix/Mac Source Code:
>>
>> http://scit.us/projects/files/ngila/Releases/ngila-release.tar.gz
>>
>> I'll be happy to answer any questions users have about the new
>> models or
>> the program.
>>
>> -- 
>> *********************************************************
>> Reed A. Cartwright, PhD     http://scit.us/
>> Postdoctoral Researcher     http://www.dererumnatura.us/
>> Department of Genetics      http://www.pandasthumb.org/
>>
>> Bioinformatics Research Center
>> North Carolina State University
>> Campus Box 7566
>> Raleigh, NC 27695-7566
>>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Wed Aug 29 16:03:07 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 11:03:07 -0500
Subject: [Bioperl-l] remote SwissProt server problems
Message-ID: <6805F552-9947-4C28-B846-47B5501B31DF@uiuc.edu>

Just as a notice, DBFetch is currently retrieving only single records  
for the UniProtKB database (where Bio::DB::SwissProt fetches  
sequences).  If anyone runs remote server tests and DB.t in the test
suite you'll see a failure towards the end which indicates this.   
I've posted a notice to the server help desk and will respond when I  
hear more.

chris


From cain.cshl at gmail.com  Wed Aug 29 19:45:48 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Wed, 29 Aug 2007 15:45:48 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
Message-ID: <1188416748.2567.36.camel@localhost.localdomain>

Hi Chris,

I just wanted to let you know that I was out of town for a few days, but
now I'm back and I'm doing testing of GMOD software based on the branch
you are working on.  I'll let you know how it goes, but don't let me
stop you if you are confident of your changes.  I'm sure whatever goes
wrong, it will just point out holes in the FeatureIO tests (I'm sure
there are plenty) and will require hopefully minimal changes on my end.

Thanks for your considerable efforts on this!  (Regardless of how much
work it makes for me :-)
Scott


On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
> I'm now wrapping up the Feature/Annotation rollback.  I will probably  
> start merging back to the main branch in the next day or two, as
> soon as interested parties (*cough*devs*cough*) look over the last  
> batch of changes.
> 
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
> 
> I have also added a small benchmark test which indicates a decrease  
> in parsing time in SeqIO::genbank with all tests passing.  I expect  
> this will translate over to any Bio::SeqFeature::Generic-using class  
> (open mouth, prepare to insert foot....).
> 
> It is also possible there are still some instances where overloading  
> is expected lurking about in the ~1000 or so modules, so I'll leave  
> the exceptions I added to all Bio::AnnotationI; we can remove them  
> down the line, maybe prior to rel1.6, after more tests are added or  
> if they get particularly annoying.  My guess is I caught 99.99% of  
> them (prepare to insert other foot....).
> 
> The key change in this last round is the addition of several class  
> *dbxref* methods to Bio::Ontology::Term and  
> Bio::Annotation::OntologyTerm, all of which are capable of working  
> with either DBLink instances or simple scalars.  This was primarily  
> done in order to clear up inconsistencies in the older *dblink*  
> methods, which were ambiguous (some indicate simple scalar
> arguments, others DBLink objects); operator overloading was used  
> extensively in these cases, which led to several issues.  I have  
> added deprecation warnings to the older methods which now map to  
> using the newer methods.  All tests pass with the exception of a few  
> already failing on the MAIN branch; the single test which needs to be  
> fixed is a round-tripping error in swiss.t (now a TODO), which can be  
> fixed after merging back.
> 
> Please respond to this if there are any questions or if I need to  
> clarify the changes I made a bit more.
> 
> chris
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From cjfields at uiuc.edu  Wed Aug 29 20:13:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 15:13:17 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188416748.2567.36.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
Message-ID: <8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>

I'll probably go ahead and start merging this stuff over to CVS HEAD  
then.  There haven't been any objections so far.

The page I posted outlines the more critical fixes, primarily the  
changes to Bio::Ontology::Term methods (along with relevant code) due  
to inconsistencies in the interface.  The Bio::Annotation classes  
also now throw if you attempt to use them in an overloaded context.   
I also split off SeqFeature::Annotated tests into its own test suite
(SeqFeatAnnotated.t).

Let me know if there are any problems along the way!

chris

On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:

> Hi Chris,
>
> I just wanted to let you know that I was out of town for a few days,
> but now I'm back and I'm doing testing of GMOD software based on the
> branch you are working on.  I'll let you know how it goes, but don't
> let me stop you if you are confident of your changes.  I'm sure
> whatever goes wrong, it will just point out holes in the FeatureIO
> tests (I'm sure there are plenty) and will require hopefully minimal
> changes on my end.
>
> Thanks for your considerable efforts on this!  (Regardless of how much
> work it makes for me :-)
> Scott
>
>
> On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
>> I'm now wrapping up the Feature/Annotation rollback.  I will probably
>> start merging back to the main branch in the next day or two, as
>> soon as interested parties (*cough*devs*cough*) look over the last
>> batch of changes.
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>>
>> I have also added a small benchmark test which indicates a decrease
>> in parsing time in SeqIO::genbank with all tests passing.  I expect
>> this will translate over to any Bio::SeqFeature::Generic-using class
>> (open mouth, prepare to insert foot....).
>>
>> It is also possible there are still some instances where overloading
>> is expected lurking about in the ~1000 or so modules, so I'll leave
>> the exceptions I added to all Bio::AnnotationI; we can remove them
>> down the line, maybe prior to rel1.6, after more tests are added or
>> if they get particularly annoying.  My guess is I caught 99.99% of
>> them (prepare to insert other foot....).
>>
>> The key change in this last round is the addition of several class
>> *dbxref* methods to Bio::Ontology::Term and
>> Bio::Annotation::OntologyTerm, all of which are capable of working
>> with either DBLink instances or simple scalars.  This was primarily
>> done in order to clear up inconsistencies in the older *dblink*
>> methods, which were ambiguous (some indicated simple scalar
>> arguments, others DBLink objects); operator overloading was used
>> extensively in these cases, which led to several issues.  I have
>> added deprecation warnings to the older methods which now map to
>> using the newer methods.  All tests pass with the exception of a few
>> already failing on the MAIN branch; the single test which needs to be
>> fixed is a round-tripping error in swiss.t (now a TODO), which can be
>> fixed after merging back.
>>
>> Please respond to this if there are any questions or if I need to
>> clarify the changes I made a bit more.
>>
>> chris
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                         cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jay at jays.net  Wed Aug 29 22:11:55 2007
From: jay at jays.net (Jay Hannah)
Date: Wed, 29 Aug 2007 17:11:55 -0500
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
Message-ID: <46D5EF2B.5000101@jays.net>

Please slap me if I'm hysterical.

I'm seeking a broad bioinformatics search engine platform. I want to 
take gobs of data in gobs of formats and allow people to search it on 
the web.

- Entrez is awesome. Unfortunately I don't see anything in the NCBI 
toolkit that helps me run my own version of it. Even a tiny one. After 
an initial "check out our toolkit" response from NCBI I don't seem to be 
getting anywhere. Maybe I'm not communicating enough or well enough.

- EB-eye Search is slick. I don't see any developer kit or source code 
of any kind and I've gotten no response to my emails to them.

- LuceGene is very cool. But it looks like no one has touched it in 2.5 
years and I've gotten no response from their contact email address. I'm 
especially intrigued by their

  src/LuceGene/src/org/eugenes/index/LuceneReadseqIndexer.java

which seems to use the rather popular(?) Java Readseq to populate Lucene 
with source data in all sorts of different formats.

I don't know Java.

- Solr is really neat. It's easy to install and gives a simple/powerful 
XML API to populate a Lucene index.

... so ...

I'm thinking BioPerl knows how to parse lots of formats into a Bio::Seq.

I'm thinking I could write Perl which would take a Bio::Seq object and 
convert it to an XML file which Solr would happily inject into Lucene 
for me.

If I could do that I'm thinking that any of the many formats that 
Bio::SeqIO can slurp could magically be sent into a Lucene index for 
searching.

I'm thinking that would be really cool and I'm going to write it.
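
A rough sketch of the conversion described above, assuming Solr's XML
update message format (<add><doc><field name="...">). A plain hash
stands in for the Bio::Seq object so the snippet runs without BioPerl;
the field names (id, description, sequence) and both helper names are
invented for illustration and would have to match whatever the Solr
schema actually defines.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text and attribute values.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Turn one record (a hash ref standing in for a Bio::Seq) into a Solr
# <add><doc> message.  Field names are hypothetical.
sub seq_to_solr_xml {
    my ($rec) = @_;
    my $xml = "<add>\n  <doc>\n";
    for my $field (sort keys %$rec) {
        next unless defined $rec->{$field};
        $xml .= sprintf qq{    <field name="%s">%s</field>\n},
            xml_escape($field), xml_escape($rec->{$field});
    }
    $xml .= "  </doc>\n</add>\n";
    return $xml;
}

print seq_to_solr_xml({
    id          => 'AB000001',
    description => 'hypothetical protein <fragment>',
    sequence    => 'ATGGCCATTGTAATGGGCCGC',
});
```

With a real Bio::Seq the values would presumably come from
$seq->display_id, $seq->desc and $seq->seq, with one such document per
sequence read through Bio::SeqIO.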

Now's your chance to slap me.

Since I haven't started yet, what would I call this thing? 
Bio::SeqIO::Solr?  (and I wouldn't implement the I part?)

Thanks,

Jay Hannah
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah


More notes:
http://clab.ist.unomaha.edu/CLAB/index.php/RT11


From hlapp at gmx.net  Thu Aug 30 01:37:59 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 29 Aug 2007 21:37:59 -0400
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
In-Reply-To: <46D5EF2B.5000101@jays.net>
References: <46D5EF2B.5000101@jays.net>
Message-ID: <D202078D-8F88-4FAA-94EA-8C08CE653C41@gmx.net>


On Aug 29, 2007, at 6:11 PM, Jay Hannah wrote:

> [...]
>
> I'm thinking I could write Perl which would take a Bio::Seq object and
> convert it to an XML file which Solr would happily inject into Lucene
> for me.
>
> If I could do that I'm thinking that any of the many formats that
> Bio::SeqIO can slurp could magically be sent into a Lucene index for
> searching.
>
> [...]
> Since I haven't started yet, what would I call this thing?
> Bio::SeqIO::Solr?  (and I wouldn't implement the I part?)

Would this be a Solr-specific XML writer? Or could you use an  
existing XML format for sequences?

(as an aside, if you do need a Solr-specific format writer, my  
suggestion would be to name it solrxml [lowercase])

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Aug 30 02:01:45 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 21:01:45 -0500
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
In-Reply-To: <46D5EF2B.5000101@jays.net>
References: <46D5EF2B.5000101@jays.net>
Message-ID: <0FF63232-25DE-4676-8C06-B9B00BE28349@uiuc.edu>


On Aug 29, 2007, at 5:11 PM, Jay Hannah wrote:

> Please slap me if I'm hysterical.
>
> I'm seeking a broad bioinformatics search engine platform. I want to
> take gobs of data in gobs of formats and allow people to search it on
> the web.
>
> - Entrez is awesome. Unfortunately I don't see anything in the NCBI
> toolkit that helps me run my own version of it. Even a tiny one. After
> an initial "check out our toolkit" response from NCBI I don't seem  
> to be
> getting anywhere. Maybe I'm not communicating enough or well enough.

No.  I have had non-responses before from NCBI; they may just be too  
busy.  Warnock probably applies.

> - EB-eye Search is slick. I don't see any developer kit or source code
> of any kind and I've gotten no response to my emails to them.

Not sure of this one personally.

> - LuceGene is very cool.
> ...
> I don't know Java.

...but you could write a (perl) wrapper around it.  You can try  
contacting Don Gilbert about it, though I think he's been trying out  
Chado.

> - Solr is really neat. It's easy to install and gives a
> simple/powerful XML API to populate a Lucene index.
> ... so ...
>
> I'm thinking BioPerl knows how to parse lots of formats into a  
> Bio::Seq.
>
> ...
>
> I'm thinking that would be really cool and I'm going to write it.
>
> Now's your chance to slap me.

No need.

> Since I haven't started yet, what would I call this thing?
> Bio::SeqIO::Solr?  (and I wouldn't implement the I part?)
>
> Thanks,
>
> Jay Hannah
> http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah
>
> More notes:
> http://clab.ist.unomaha.edu/CLAB/index.php/RT11

The way I would go about it is use an established XML schema as a  
starting point and implement a writer (if bioperl doesn't already  
support it).  It's better than reinventing (a constantly reinvented)  
wheel and starting up a brand-new schema of your own.  INSDSeq  
(http://www.insdc.org/page.php?page=xmlstatus) is one I've been  
wanting to add for a while but haven't had time to work on; there are  
several other examples.  Note that a few of the currently supported  
ones in bioperl, such as bsml and game, have had very little to no  
development over the years in favor of newer (better?) XML flavors,  
so it likely isn't worth working with those.

chris


From hlapp at gmx.net  Thu Aug 30 02:02:45 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed, 29 Aug 2007 22:02:45 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
Message-ID: <E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>


On Aug 28, 2007, at 5:05 PM, Chris Fields wrote:

> I'm now wrapping up the Feature/Annotation rollback.  I will probably
> start merging back to the main branch in the next day or two, as
> soon as interested parties (*cough*devs*cough*) look over the last
> batch of changes.
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>
> [...]
> It is also possible there are still some instances where overloading
> is expected lurking about in the ~1000 or so modules, so I'll leave
> the exceptions I added to all Bio::AnnotationI

Keep in mind that code such as

	if ($ann) { ... }

is mostly not b/c someone wanted to use overloading, but rather  
someone was lazy and really meant to say

	if (defined($ann)) { ... }

In the absence of eq overloading, these will behave identically. So  
if you leave the exceptions in it is sort-of policing lazy  
programmers, which I guess is fine in principle, but is guaranteed to  
trip up a lot of script code. I'd take it out if you're reasonably  
sure that at least within BioPerl itself those lazy programming  
incidents are removed.
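
To illustrate, here is a toy class (an invented stand-in, not the
actual Bio::AnnotationI code) whose boolean overload throws, roughly
the way the exceptions on the branch are described. Only the plain
truth test trips the exception; defined() never invokes the overload.

```perl
use strict;
use warnings;

package My::ToyAnnotation;    # hypothetical stand-in, not a BioPerl class

# Throw whenever an instance is evaluated in boolean context.
use overload 'bool' => sub { die "annotation used in boolean context\n" };

sub new { return bless {}, shift }

package main;

# Run a code ref; return 'ok' if it completed, else the exception text.
sub outcome {
    my ($code) = @_;
    return 'ok' if eval { $code->(); 1 };
    chomp(my $err = $@);
    return $err;
}

my $ann = My::ToyAnnotation->new;
print 'if ($ann):          ', outcome(sub { if ($ann) { } }), "\n";
print 'if (defined $ann):  ', outcome(sub { if (defined $ann) { } }), "\n";
```

The first line reports the exception and the second reports ok, which
is why script code written with the short form would start failing
while the defined() form keeps working.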

> [...]
> The key change in this last round is the addition of several class
> *dbxref* methods to Bio::Ontology::Term and
> Bio::Annotation::OntologyTerm, all of which are capable of working
> with either DBLink instances or simple scalars.

I don't think you need the code here to deal with both scalars and  
objects. It is fine I think to define the new methods from the outset  
to consistently accept and return DBLink objects, and period.

The backwards compatibility logic should rather be in the *_dblink*()  
methods; i.e., instead of simple aliases they should have the code to  
map to and from the new API. That way, once the deprecation cycle  
ends, they can be removed, and with them all the legacy code that now  
is no longer needed, whereas if you have that in the new methods, it  
keeps bothering the maintainers.
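
A minimal sketch of that layering, using invented toy classes rather
than the real Bio::Ontology::Term code (My::ToyDBLink, My::ToyTerm,
get_dbxrefs() and get_dblinks() are all assumptions made for the
example). The new methods deal only in objects; the deprecated ones
carry all of the scalar handling and delegate.

```perl
use strict;
use warnings;

package My::ToyDBLink;    # hypothetical stand-in for a DBLink object
sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}
sub primary_id { return $_[0]{primary_id} }

package My::ToyTerm;      # hypothetical stand-in for an ontology term
use Carp qw(carp croak);
use Scalar::Util qw(blessed);

sub new { return bless { dbxrefs => [] }, shift }

# New API: objects in, objects out, nothing else accepted.
sub add_dbxref {
    my ($self, @links) = @_;
    for my $link (@links) {
        croak "add_dbxref() expects My::ToyDBLink objects"
            unless blessed($link) && $link->isa('My::ToyDBLink');
        push @{ $self->{dbxrefs} }, $link;
    }
    return;
}
sub get_dbxrefs { return @{ $_[0]{dbxrefs} } }

# Deprecated API: all scalar/object juggling lives here, so it goes away
# together with these methods when the deprecation cycle ends.
sub add_dblink {
    my ($self, @args) = @_;
    carp "add_dblink() is deprecated, use add_dbxref()";
    $self->add_dbxref(
        map { blessed($_) ? $_ : My::ToyDBLink->new(primary_id => $_) } @args
    );
    return;
}
sub get_dblinks {
    my ($self) = @_;
    carp "get_dblinks() is deprecated, use get_dbxrefs()";
    return map { $_->primary_id } $self->get_dbxrefs;
}

package main;

my $term = My::ToyTerm->new;
{
    local $SIG{__WARN__} = sub { };    # silence the deprecation notice for the demo
    $term->add_dblink('GO:0008150');   # legacy call with a plain scalar
}
print join(', ', map { $_->primary_id } $term->get_dbxrefs), "\n";
```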

You also mention an add_dbxref_context() on the wiki page - I'm not  
sure why that would be needed given that you build in the -context  
option to add_dbxref() from the outset. But maybe I've glossed over  
some detail.

Once this is merged back to the main trunk, I guess we need to give  
Bio::SeqFeature::TypedSeqFeatureI a thorough look and make sure it  
makes real sense.

Thanks Chris for this effort, this clears a monumental roadblock.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Thu Aug 30 03:23:14 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 22:23:14 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
Message-ID: <A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>


On Aug 29, 2007, at 9:02 PM, Hilmar Lapp wrote:

>
> On Aug 28, 2007, at 5:05 PM, Chris Fields wrote:
>
>> I'm now wrapping up the Feature/Annotation rollback.  I will probably
>> start merging back to the main branch in the next day or two, as
>> soon as interested parties (*cough*devs*cough*) look over the last
>> batch of changes.
>>
>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
>>
>> [...]
>> It is also possible there are still some instances where overloading
>> is expected lurking about in the ~1000 or so modules, so I'll leave
>> the exceptions I added to all Bio::AnnotationI
>
> Keep in mind that code such as
>
> 	if ($ann) { ... }
>
> is mostly not b/c someone wanted to use overloading, but rather
> someone was lazy and really meant to say
>
> 	if (defined($ann)) { ... }

Agreed.

> In the absence of eq overloading, these will behave identically. So
> if you leave the exceptions in it is sort-of policing lazy
> programmers, which I guess is fine in principle, but is guaranteed to
> trip up a lot of script code. I'd take it out if you're reasonably
> sure that at least within BioPerl itself those lazy programming
> incidents are removed.

I agree the overload exceptions shouldn't be left in.  The problem is  
I'm not certain we have caught most implicit overload calls (just the  
ones tested for).  Scott's checking everything against GMOD, though,  
so we can remove them after that.

>> [...]
>> The key change in this last round is the addition of several class
>> *dbxref* methods to Bio::Ontology::Term and
>> Bio::Annotation::OntologyTerm, all of which are capable of working
>> with either DBLink instances or simple scalars.
>
> I don't think you need the code here to deal with both scalars and
> objects. It is fine I think to define the new methods from the outset
> to consistently accept and return DBLink objects, and period.
>
> The backwards compatibility logic should rather be in the *_dblink*()
> methods; i.e., instead of simple aliases they should have the code to
> map to and from the new API. That way, once the deprecation cycle
> ends, they can be removed, and with them all the legacy code that now
> is no longer needed, whereas if you have that in the new methods, it
> keeps bothering the maintainers.

That should be easy enough to fix and would be more consistent.  I  
can look over the various calls to dbxref methods and see what needs  
to be done, then fix that in cvs.

> You also mention an add_dbxref_context() on the wiki page - I'm not
> sure why that would be needed given that you build in the -context
> option to add_dbxref() from the outset. But maybe I've glossed over
> some detail.

The -context parameter was in get_dbxref(), to grab those DBLinks in  
a particular context.  We could do the same with add_dbxref() (pass  
DBLinks in first arg as array ref, context as second arg).  That  
would then obviate the need for add_dbxref_context().

I'll also change the parameter passing in get_dbxref() to just accept  
context as a single optional argument since we're dealing with only  
DBLink instances now.
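
As a sketch of what that calling convention could look like (a toy
class only; My::ContextTerm, the '_default' bucket and the internal
storage layout are made up here, and plain strings stand in for DBLink
instances to keep it short):

```perl
use strict;
use warnings;

package My::ContextTerm;    # hypothetical stand-in, not Bio::Ontology::Term

sub new { return bless { dbxrefs => {} }, shift }

# $links is an array ref; $context is optional and falls back to a
# made-up '_default' bucket.
sub add_dbxref {
    my ($self, $links, $context) = @_;
    $context = '_default' unless defined $context;
    push @{ $self->{dbxrefs}{$context} }, @$links;
    return;
}

# With a context: only the links stored under it.  Without: all links,
# grouped by context name in sorted order.
sub get_dbxref {
    my ($self, $context) = @_;
    my $store = $self->{dbxrefs};
    return @{ $store->{$context} || [] } if defined $context;
    return map { @{ $store->{$_} } } sort keys %$store;
}

package main;

my $term = My::ContextTerm->new;
$term->add_dbxref([ 'EC:2.7.1.1', 'EC:2.7.1.2' ], 'secondary');
$term->add_dbxref([ 'GO:0008150' ]);
print join(', ', $term->get_dbxref('secondary')), "\n";
```

Folding the context into add_dbxref() this way is what would make a
separate add_dbxref_context() unnecessary.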

> Once this is merged back to the main trunk, I guess we need to give
> Bio::SeqFeature::TypedSeqFeatureI a thorough look and make sure it
> makes real sense.

It describes one method, ontology_term(), which returns a
Bio::Ontology::TermI.  This is similar to
SeqFeature::Annotated::type(), which returns a
Bio::Annotation::OntologyTerm (a Bio::Ontology::TermI).  My thought is
to simply deprecate type() in favor of
TypedSeqFeatureI::ontology_term().

> Thanks Chris for this effort, this clears a monumental roadblock.
>
> 	-hilmar

No problem.  It just needed to be done.

chris


From florent.angly at gmail.com  Thu Aug 30 03:44:58 2007
From: florent.angly at gmail.com (Florent Angly)
Date: Wed, 29 Aug 2007 20:44:58 -0700
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
Message-ID: <46D63D3A.6050308@gmail.com>

Hilmar Lapp wrote:
> Keep in mind that code such as
>
> 	if ($ann) { ... }
>
> is mostly not b/c someone wanted to use overloading, but rather  
> someone was lazy and really meant to say
>
> 	if (defined($ann)) { ... }
>
> In the absence of eq overloading, these will behave identically. So  
> if you leave the exceptions in it is sort-of policing lazy  
> programmers, which I guess is fine in principle, but is guaranteed to  
> trip up a lot of script code. I'd take it out if you're reasonably  
> sure that at least within BioPerl itself those lazy programming  
> incidents are removed.
	if ($ann) { ... }

and 

	if (defined($ann)) { ... }

are not the same.

	if ($ann)

is evaluated false for an empty string like

        $ann = '';

and for a value of zero, i.e.

	$ann = 0;

while

	defined($ann)

returns true in these 2 cases.
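
A small script along these lines shows the difference side by side.
The describe() helper and the My::ToyAnnotation class name are
invented for the example; the object case assumes no operator
overloading is in play.

```perl
use strict;
use warnings;

# Report how a value fares under a plain truth test and under defined().
sub describe {
    my ($value) = @_;
    return sprintf "true=%s defined=%s",
        ($value ? 'yes' : 'no'), (defined $value ? 'yes' : 'no');
}

my @cases = (
    [ 'empty string', ''    ],
    [ 'zero',         0     ],
    [ 'undef',        undef ],
    [ 'object',       bless({}, 'My::ToyAnnotation') ],   # hypothetical class
);
printf "%-12s %s\n", $_->[0], describe($_->[1]) for @cases;
```

The empty string and zero come out as defined but false, undef as
neither, and a plain blessed reference as both.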

Florent


From cjfields at uiuc.edu  Thu Aug 30 03:54:05 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 22:54:05 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <46D63D3A.6050308@gmail.com>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<46D63D3A.6050308@gmail.com>
Message-ID: <90C3DE31-12FD-4BF3-B9F7-0FB5E1DE2A28@uiuc.edu>


On Aug 29, 2007, at 10:44 PM, Florent Angly wrote:

> Hilmar Lapp wrote:
>> Keep in mind that code such as
>>
>> 	if ($ann) { ... }
>>
>> is mostly not b/c someone wanted to use overloading, but rather   
>> someone was lazy and really meant to say
>>
>> 	if (defined($ann)) { ... }
>>
>> In the absence of eq overloading, these will behave identically.  
>> So  if you leave the exceptions in it is sort-of policing lazy   
>> programmers, which I guess is fine in principle, but is guaranteed  
>> to  trip up a lot of script code. I'd take it out if you're  
>> reasonably  sure that at least within BioPerl itself those lazy  
>> programming  incidents are removed.
> 	if ($ann) { ... }
>
> and
> 	if (defined($ann)) { ... }
>
> are not the same.
>
> 	if ($ann)
>
> is evaluated false for an empty string like
>
>        $ann = '';
>
> and for a value of zero, i.e.
>
> 	$ann = 0;
>
> while
>
> 	defined($ann)
>
> returns true in these 2 cases.
>
> Florent

I agree, but we're talking about the context in which this test is  
performed, where $ann is either an instance of a Bio::AnnotationI or  
undef (not a scalar value or '').  In this case it works both as 'if  
($ann)' or 'if (defined($ann))', though the latter is preferred.   
Never underestimate laziness!

chris


From cain.cshl at gmail.com  Thu Aug 30 03:59:11 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Wed, 29 Aug 2007 23:59:11 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <46D63D3A.6050308@gmail.com>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<46D63D3A.6050308@gmail.com>
Message-ID: <1188446351.2567.55.camel@localhost.localdomain>

Hi Florent,

Of course what you wrote below is true, but what Hilmar was writing
about was lazy programmers (like me) who assume that the empty string
and 0 value cases aren't going to happen (because we happen to know they
never should in certain contexts), and so use 'if ($ann)'.  Of course,
at the moment, I am in the process of de-lazifying my code (though I
tended to think of it as being efficient :-)

Scott


On Wed, 2007-08-29 at 20:44 -0700, Florent Angly wrote:
> Hilmar Lapp wrote:
> > Keep in mind that code such as
> >
> > 	if ($ann) { ... }
> >
> > is mostly not b/c someone wanted to use overloading, but rather  
> > someone was lazy and really meant to say
> >
> > 	if (defined($ann)) { ... }
> >
> > In the absence of eq overloading, these will behave identically. So  
> > if you leave the exceptions in it is sort-of policing lazy  
> > programmers, which I guess is fine in principle, but is guaranteed to  
> > trip up a lot of script code. I'd take it out if you're reasonably  
> > sure that at least within BioPerl itself those lazy programming  
> > incidents are removed.
> 	if ($ann) { ... }
> 
> and 
> 
> 	if (defined($ann)) { ... }
> 
> are not the same.
> 
> 	if ($ann)
> 
> is evaluated false for an empty string like
> 
>         $ann = '';
> 
> and for a value of zero, i.e.
> 
> 	$ann = 0;
> 
> while
> 
> 	defined($ann)
> 
> returns true in these 2 cases.
> 
> Florent
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From cain.cshl at gmail.com  Thu Aug 30 04:05:06 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 00:05:06 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
Message-ID: <1188446706.2567.59.camel@localhost.localdomain>

Hi Chris,

Is there a reason that the value method of
Bio::Annotation::SimpleValue (and possibly some of its siblings) is
returning "Value: $value"?  It didn't use to have the "Value: " prefix
before, did it?

Thanks,
Scott


On Wed, 2007-08-29 at 15:13 -0500, Chris Fields wrote:
> I'll probably go ahead and start merging this stuff over to CVS HEAD  
> then.  There haven't been any objections so far.
> 
> The page I posted outlines the more critical fixes, primarily the  
> changes to Bio::Ontology::Term methods (along with relevant code) due  
> to inconsistencies in the interface.  The Bio::Annotation classes  
> also now throw if you attempt to use them in an overloaded context.   
> I also split off SeqFeature::Annotated tests into their own test suite  
> (SeqFeatAnnotated.t).
> 
> Let me know if there are any problems along the way!
> 
> chris
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cjfields at uiuc.edu  Thu Aug 30 04:17:18 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 29 Aug 2007 23:17:18 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188446706.2567.59.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
Message-ID: <CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>

It shouldn't, that sounds like the output for add_text().  value()  
should just return the scalar value.

As a note, I added a new method, display_text(), for all  
Bio::AnnotationI classes which by default replicates the same output  
that stringification overloads produced.  So you should be able to  
explicitly call $ann->display_text for any Bio::AnnotationI where you  
once used an implicit call:

# old
print "$ann\n";

# new
print $ann->display_text,"\n";
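
A self-contained toy version of the same before/after, for anyone who
wants to see the failure mode without checking out the branch.
My::ToyValue is an invented class, not the real
Bio::Annotation::SimpleValue, and the exception text is made up; its
display_text() simply returns the stored value.

```perl
use strict;
use warnings;

package My::ToyValue;    # hypothetical stand-in, not a BioPerl class

# Implicit stringification now throws instead of returning text.
use overload
    '""' => sub { die "stringification is not supported, call display_text()\n" };

sub new {
    my ($class, $value) = @_;
    return bless { value => $value }, $class;
}
sub value        { return $_[0]{value} }
sub display_text { return $_[0]->value }

package main;

my $ann = My::ToyValue->new('kinase');

# old: implicit call through the overload
my $old = eval { "$ann" };
print defined $old ? "old: $old\n" : "old threw: $@";

# new: explicit method call
print "new: ", $ann->display_text, "\n";
```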

chris

On Aug 29, 2007, at 11:05 PM, Scott Cain wrote:

> Hi Chris,
>
> Is there a reason that the value method of
> Bio::Annotation::SimpleValue (and possibly some of its siblings) is
> returning "Value: $value"?  It didn't use to have the "Value: " prefix
> before, did it?
>
> Thanks,
> Scott
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From neetisomaiya at gmail.com  Thu Aug 30 04:47:53 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Thu, 30 Aug 2007 10:17:53 +0530
Subject: [Bioperl-l] kegg xml parsing
Message-ID: <764978cf0708292147q4ead37b0i782b83ecda8ce3da@mail.gmail.com>

Hi,

Has anyone used XML::Twig for parsing of kegg xml data?
I was looking for some small example code of the same.

Thanks.
-- 
-Neeti
Even my blood says, B positive


From sdavis2 at mail.nih.gov  Thu Aug 30 10:16:54 2007
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 30 Aug 2007 06:16:54 -0400
Subject: [Bioperl-l] Bio::Seq -> Solr (Lucene) ?
In-Reply-To: <0FF63232-25DE-4676-8C06-B9B00BE28349@uiuc.edu>
References: <46D5EF2B.5000101@jays.net>
	<0FF63232-25DE-4676-8C06-B9B00BE28349@uiuc.edu>
Message-ID: <46D69916.4060202@mail.nih.gov>

Chris Fields wrote:
> On Aug 29, 2007, at 5:11 PM, Jay Hannah wrote:
> 
>> Please slap me if I'm hysterical.
>>
>> I'm seeking a broad bioinformatics search engine platform. I want to
>> take gobs of data in gobs of formats and allow people to search it on
>> the web.

Not sure how it might or might not meet your needs, but have you looked
at SRS (Sequence Retrieval System)?  I have never tried to use it,
personally, though.

Sean


From cjfields at uiuc.edu  Thu Aug 30 13:17:17 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 08:17:17 -0500
Subject: [Bioperl-l] remote SwissProt server problems
In-Reply-To: <6805F552-9947-4C28-B846-47B5501B31DF@uiuc.edu>
References: <6805F552-9947-4C28-B846-47B5501B31DF@uiuc.edu>
Message-ID: <62B4DE62-C11E-4E75-837C-6C1005FB12A4@uiuc.edu>

This should be fixed now (DBFetch-related tests pass, though MeSH  
tests are now failing!).

chris

On Aug 29, 2007, at 11:03 AM, Chris Fields wrote:

> Just as a notice, DBFetch is currently retrieving only single records
> for the UniProtKB database (where Bio::DB::SwissProt fetches
> sequences).  If anyone runs remote server tests and DB.t in the test
> suite you'll see a failure towards the end which indicates this.
> I've posted a notice to the server help desk and will respond when I
> hear more.
>
> chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cain.cshl at gmail.com  Thu Aug 30 14:39:59 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 10:39:59 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
Message-ID: <1188484799.2567.84.camel@localhost.localdomain>

Hi Chris,

I see--I was using as_text and getting the "Value: $value"; there are
places in my code where I have always used ->value and I thought that
the way it was working had changed.

What is the use case for having the as_text method work the way it does?

Thanks,
Scott


On Wed, 2007-08-29 at 23:17 -0500, Chris Fields wrote:
> It shouldn't; that sounds like the output of as_text().  value()
> should just return the scalar value.
> 
> As a note, I added a new method, display_text(), for all  
> Bio::AnnotationI classes which by default replicates the same output  
> that stringification overloads produced.  So you should be able to  
> explicitly call $ann->display_text for any Bio::AnnotationI where you  
> once used an implicit call:
> 
> # old
> print "$ann\n";
> 
> # new
> print $ann->display_text,"\n";
> 
> chris
> 
> On Aug 29, 2007, at 11:05 PM, Scott Cain wrote:
> 
> > Hi Chris,
> >
> > Is there a reason that the value method of
> > Bio::Annotation::SimpleValue (and possibly some of its siblings) is
> > returning "Value: $value"?  It didn't use to have the "Value: " before,
> > did it?
> >
> > Thanks,
> > Scott
> >
> >
> > On Wed, 2007-08-29 at 15:13 -0500, Chris Fields wrote:
> >> I'll probably go ahead and start merging this stuff over to CVS HEAD
> >> then.  There haven't been any objections so far.
> >>
> >> The page I posted outlines the more critical fixes, primarily the
> >> changes to Bio::Ontology::Term methods (along with relevant code) due
> >> to inconsistencies in the interface.  The Bio::Annotation classes
> >> also now throw if you attempt to use them in an overloaded context.
> >> I also split off SeqFeature::Annotated tests into its own test suite
> >> (SeqFeatAnnotated.t).
> >>
> >> Let me know if there are any problems along the way!
> >>
> >> chris
> >>
> >> On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:
> >>
> >>> Hi Chris,
> >>>
> >>> I just wanted to let you know that I was out of town for a few days,
> >>> but now I'm back and I'm doing testing of GMOD software based on the
> >>> branch you are working on.  I'll let you know how it goes, but don't
> >>> let me stop you if you are confident of your changes.  I'm sure
> >>> whatever goes wrong, it will just point out holes in the FeatureIO
> >>> tests (I'm sure there are plenty) and will require hopefully minimal
> >>> changes on my end.
> >>>
> >>> Thanks for your considerable efforts on this!  (Regardless of how
> >>> much work it makes for me :-)
> >>> Scott
> >>>
> >>>
> >>> On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
> >>>> I'm now wrapping up the Feature/Annotation rollback.  I will
> >>>> probably start merging back to the main branch in the next day or
> >>>> two, as soon as interested parties (*cough*devs*cough*) look over
> >>>> the last batch of changes.
> >>>>
> >>>> http://www.bioperl.org/wiki/Feature_Annotation_rollback#Fourth_Round
> >>>>
> >>>> I have also added a small benchmark test which indicates a decrease
> >>>> in parsing time in SeqIO::genbank with all tests passing.  I expect
> >>>> this will translate over to any Bio::SeqFeature::Generic-using
> >>>> class (open mouth, prepare to insert foot....).
> >>>>
> >>>> It is also possible there are still some instances where
> >>>> overloading is expected lurking about in the ~1000 or so modules,
> >>>> so I'll leave the exceptions I added to all Bio::AnnotationI; we can
> >>>> remove them down the line, maybe prior to rel1.6, after more tests
> >>>> are added or if they get particularly annoying.  My guess is I
> >>>> caught 99.99% of them (prepare to insert other foot....).
> >>>>
> >>>> The key change in this last round is the addition of several class
> >>>> *dbxref* methods to Bio::Ontology::Term and
> >>>> Bio::Annotation::OntologyTerm, all of which are capable of working
> >>>> with either DBLink instances or simple scalars.  This was primarily
> >>>> done in order to clear up inconsistencies in the older *dblink*
> >>>> methods, which were ambiguous (some indicate simple scalar
> >>>> arguments, others DBLink objects); operator overloading was used
> >>>> extensively in these cases, which led to several issues.  I have
> >>>> added deprecation warnings to the older methods, which now map to
> >>>> the newer methods.  All tests pass with the exception of a few
> >>>> already failing on the MAIN branch; the single test which needs to
> >>>> be fixed is a round-tripping error in swiss.t (now a TODO), which
> >>>> can be fixed after merging back.
> >>>>
> >>>> Please respond to this if there are any questions or if I need to
> >>>> clarify the changes I made a bit more.
> >>>>
> >>>> chris
> >>> -- 
> >>> ----------------------------------------------------------------------
> >>> Scott Cain, Ph. D.
> >>> cain at cshl.edu
> >>> GMOD Coordinator (http://www.gmod.org/)
> >>> 216-392-3087
> >>> Cold Spring Harbor Laboratory
> >>
> >> Christopher Fields
> >> Postdoctoral Researcher
> >> Lab of Dr. Robert Switzer
> >> Dept of Biochemistry
> >> University of Illinois Urbana-Champaign
> >>
> >>
> >>
> > -- 
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                   cain.cshl at gmail.com
> > GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> > Cold Spring Harbor Laboratory
> >
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 
> 
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cain.cshl at gmail.com  Thu Aug 30 15:46:24 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 11:46:24 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
Message-ID: <1188488785.2567.93.camel@localhost.localdomain>

Hi Chris,

Good news!  I only had to add a few defineds and a few display_texts and
I was able to successfully create a database and load the yeast GFF3
file.  While I want to do more testing with GFF from other sources,
clearly, I am 95% of the way there with relatively little work.

Nice job and Thanks!
Scott


On Wed, 2007-08-29 at 23:17 -0500, Chris Fields wrote:
> It shouldn't; that sounds like the output of as_text().  value()
> should just return the scalar value.
> 
> As a note, I added a new method, display_text(), for all  
> Bio::AnnotationI classes which by default replicates the same output  
> that stringification overloads produced.  So you should be able to  
> explicitly call $ann->display_text for any Bio::AnnotationI where you  
> once used an implicit call:
> 
> # old
> print "$ann\n";
> 
> # new
> print $ann->display_text,"\n";
> 
> chris
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From hlapp at gmx.net  Thu Aug 30 16:07:18 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:07:18 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188488785.2567.93.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
Message-ID: <0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Aug 30, 2007, at 11:46 AM, Scott Cain wrote:

> Good news!  I only had to add a few defineds and a few  
> display_texts and
> I was able to successfully create a database and load the yeast GFF3

Scott - I'm a little worried - what are you using the display_text()  
calls for? There is no method to set a property that would be  
returned here, so you only have control over that if you override the  
method in a custom AnnotationI class.

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFG1us5uV6N2JxL7qsRAicFAKCFCHPORyK9273X8u2/gbaZCNpEHgCeMovA
OtZghop1tET5iMqnwXzL+lk=
=NVrK
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Thu Aug 30 16:10:14 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:10:14 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188484799.2567.84.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188484799.2567.84.camel@localhost.localdomain>
Message-ID: <49824C75-3FA5-4E59-8F99-BC0E974E9652@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Aug 30, 2007, at 10:39 AM, Scott Cain wrote:

> What is the use case for having the as_text method work the way it  
> does?

That's a bit nebulous as I tried to point out the other day. It's  
just a textual representation of the annotation, but you don't really  
have control over what the particular Annotation class considers to  
fulfill that purpose.

So, it's fine to expect a printable meaningful string to be returned,  
but don't try to parse it or rely on exactly what it is going to look  
like.
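
To make that concrete, here is a minimal sketch. My::Simple and My::Link
are made-up stand-ins (not the real Bio::Annotation classes), and the
as_text() formats and the to_row() helper are only illustrative
assumptions. The point is that a consumer pulls data through typed
accessors and treats as_text() purely as something to print:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a simple tag/value annotation.
package My::Simple;
sub new     { my ($class, $v) = @_; return bless { value => $v }, $class }
sub value   { my ($self) = @_; return $self->{value} }
sub as_text { my ($self) = @_; return 'Value: ' . $self->{value} }

# Hypothetical stand-in for a database cross-reference annotation.
package My::Link;
sub new {
    my ($class, $db, $id) = @_;
    return bless { db => $db, id => $id }, $class;
}
sub database   { my ($self) = @_; return $self->{db} }
sub primary_id { my ($self) = @_; return $self->{id} }
sub as_text {
    my ($self) = @_;
    return "Direct database link to $self->{id} in database $self->{db}";
}

package main;

# Extract structured data via accessors; fall back to as_text() only
# for display when nothing more specific is available.
sub to_row {
    my ($ann) = @_;
    return $ann->can('value')      ? ('value',  $ann->value)
         : $ann->can('primary_id') ? ('dbxref', $ann->database . ':' . $ann->primary_id)
         :                           ('text',   $ann->as_text);
}

for my $ann (My::Simple->new('kinase'), My::Link->new('GO', '0016301')) {
    my ($kind, $data) = to_row($ann);
    printf "%-6s %-12s (display: %s)\n", $kind, $data, $ann->as_text;
}
```

If a class later changes its as_text() wording, to_row() keeps working
because it never looks inside that string.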

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFG1uvnuV6N2JxL7qsRAn+dAKC9iLj93El38uv7kjprdZDo0sXC6wCgqwhm
0/tF89/FO1a4CWAf1bahd+8=
=I7SM
-----END PGP SIGNATURE-----


From hlapp at gmx.net  Thu Aug 30 16:20:18 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:20:18 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>
Message-ID: <DF84C537-2860-48E1-9979-E1101C4D5826@gmx.net>


On Aug 29, 2007, at 11:23 PM, Chris Fields wrote:

>> Once this is merged back to the main trunk, I guess we need to give
>> Bio::SeqFeature::TypedSeqFeatureI a thorough look and make sure it
>> makes real sense.
>
> It describes one method, ontology_term(), which returns a  
> Bio::Ontology::TermI.  This is similar to  
> SeqFeature::Annotated::type(), which returns a  
> Bio::Annotation::OntologyTerm (a Bio::Ontology::TermI).  My thought  
> is to simply deprecate type() in favor of  
> TypedSeqFeatureI::ontology_term().

I think we'll want to think about that. type() gives me some  
indication of what the returned value might represent, whereas  
ontology_term() only tells me about the type of the returned object.

You could make ontology_term() accept a context argument, such as

	my $feature_type = $typedFeat->ontology_term(-context => -type);

Or you could name the method(s) more explicitly, such as

	my $feature_type = $typedFeat->type_term();
	my $feature_source = $typedFeat->source_term();
	my @annTerms = $typedFeat->get_Annotations('Gene Ontology');

Am I making sense?
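
A rough sketch of how the two options could coexist, in case it helps.
My::TypedFeature is a hypothetical toy class, not
Bio::SeqFeature::TypedSeqFeatureI, the terms are plain strings rather
than Bio::Ontology::TermI objects, and the string values for -context
are my assumption:

```perl
use strict;
use warnings;

# Hypothetical toy feature holding one term per context.
package My::TypedFeature;

sub new {
    my ($class, %terms) = @_;
    return bless { terms => {%terms} }, $class;
}

# Option 1: one generic accessor, disambiguated by a context argument.
sub ontology_term {
    my ($self, %args) = @_;
    my $context = $args{-context} // 'type';   # default context is assumed
    return $self->{terms}{$context};
}

# Option 2: explicitly named accessors that say what the term represents.
sub type_term   { my ($self) = @_; return $self->ontology_term(-context => 'type') }
sub source_term { my ($self) = @_; return $self->ontology_term(-context => 'source') }

package main;

my $feat = My::TypedFeature->new(type => 'gene', source => 'SGD');
print $feat->ontology_term(-context => 'source'), "\n";
print $feat->type_term, "\n";
```

The named accessors cost a few extra lines, but the call site then says
what the returned term means and not just what class it is.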

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cain.cshl at gmail.com  Thu Aug 30 16:28:47 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 12:28:47 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
Message-ID: <1188491327.2567.101.camel@localhost.localdomain>

Hi Hilmar,

I'm using it as Chris suggested: where I had been depending on ""
overloading.  I think in most places, I am using it on
Bio::Annotation::SimpleValue to get the string that is the simple value.
On more complex data types, I am using other methods built into those
classes to extract useful stuff for inserting into the database.

Scott


On Thu, 2007-08-30 at 12:07 -0400, Hilmar Lapp wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> 
> On Aug 30, 2007, at 11:46 AM, Scott Cain wrote:
> 
> > Good news!  I only had to add a few defineds and a few  
> > display_texts and
> > I was able to successfully create a database and load the yeast GFF3
> 
> Scott - I'm a little worried - what are you using the display_text()  
> calls for? There is no method to set a property that would be  
> returned here, so you only have control over that if you override the  
> method in a custom AnnotationI class.
> 
> 	-hilmar
> - --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
> 
> 
> 
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.3 (Darwin)
> 
> iD8DBQFG1us5uV6N2JxL7qsRAicFAKCFCHPORyK9273X8u2/gbaZCNpEHgCeMovA
> OtZghop1tET5iMqnwXzL+lk=
> =NVrK
> -----END PGP SIGNATURE-----
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From hlapp at gmx.net  Thu Aug 30 16:52:14 2007
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 30 Aug 2007 12:52:14 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188491327.2567.101.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
	<1188491327.2567.101.camel@localhost.localdomain>
Message-ID: <F03155D4-58CB-4C8D-9D52-C49036EB7F45@gmx.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Aug 30, 2007, at 12:28 PM, Scott Cain wrote:

> I think in most places, I am using it on
> Bio::Annotation::SimpleValue to get the string that is the simple  
> value.

You should be using $ann->value() for that, unless I'm missing  
something.

	-hilmar
- --
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFG1vXCuV6N2JxL7qsRAkcJAKCICRtOSlPLVYYKCbOTvDIf4idb3wCgkxYM
seeaNvSsFY/4bHLGZ9dum2Q=
=E35w
-----END PGP SIGNATURE-----


From cain.cshl at gmail.com  Thu Aug 30 17:16:09 2007
From: cain.cshl at gmail.com (Scott Cain)
Date: Thu, 30 Aug 2007 13:16:09 -0400
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <F03155D4-58CB-4C8D-9D52-C49036EB7F45@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
	<1188491327.2567.101.camel@localhost.localdomain>
	<F03155D4-58CB-4C8D-9D52-C49036EB7F45@gmx.net>
Message-ID: <1188494169.2567.109.camel@localhost.localdomain>

Well, in the instances where I was using it, ->value seems to work
exactly the same, so I changed it to value to be more consistent with
other code I'd written.  I'd used display_text without really thinking
about it.

Thanks,
Scott


On Thu, 2007-08-30 at 12:52 -0400, Hilmar Lapp wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> 
> On Aug 30, 2007, at 12:28 PM, Scott Cain wrote:
> 
> > I think in most places, I am using it on
> > Bio::Annotation::SimpleValue to get the string that is the simple  
> > value.
> 
> You should be using $ann->value() for that, unless I'm missing  
> something.
> 
> 	-hilmar
> - --
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
> 
> 
> 
> 
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.3 (Darwin)
> 
> iD8DBQFG1vXCuV6N2JxL7qsRAkcJAKCICRtOSlPLVYYKCbOTvDIf4idb3wCgkxYM
> seeaNvSsFY/4bHLGZ9dum2Q=
> =E35w
> -----END PGP SIGNATURE-----
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cjfields at uiuc.edu  Thu Aug 30 17:27:46 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 12:27:46 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188491327.2567.101.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
	<0545DE1A-F2E2-4FA8-BE7C-436EE25C7D92@gmx.net>
	<1188491327.2567.101.camel@localhost.localdomain>
Message-ID: <6E9B07D0-AB37-4439-AA9D-9268AB5A38C0@uiuc.edu>

display_text() is really a hack for explicitly getting the same output
one would have expected from the stringification overload for any
Bio::AnnotationI (you can also use callbacks with it to customize the
output if needed, but that's not important here).  Whether it fits
depends on what you're trying to accomplish, but it might be best to
use value() specifically in places where you expect only a
Bio::Annotation::SimpleValue.
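
Roughly, the three access paths compare as in this sketch.
My::SimpleValue is a hypothetical toy class, not the real
Bio::Annotation::SimpleValue, and passing the callback as an argument to
display_text() is an assumption made for illustration:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a simple-value annotation.
package My::SimpleValue;

sub new {
    my ($class, %args) = @_;
    return bless { value => $args{-value} }, $class;
}

# value(): the bare scalar, suitable for storing or comparing.
sub value { my ($self) = @_; return $self->{value} }

# as_text(): a labelled, human-readable string; do not parse it.
sub as_text { my ($self) = @_; return 'Value: ' . $self->value }

# display_text(): explicit replacement for the old "$ann" stringification.
# An optional callback (assumed interface) customizes the output.
sub display_text {
    my ($self, $cb) = @_;
    return $cb ? $cb->($self) : $self->value;
}

package main;

my $ann = My::SimpleValue->new(-value => 'kinase');
print $ann->value, "\n";
print $ann->as_text, "\n";
print $ann->display_text, "\n";
print $ann->display_text(sub { uc $_[0]->value }), "\n";
```

For a simple value the first and third calls give the same string, which
is why value() is the clearer choice when only that class is expected.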

chris

On Aug 30, 2007, at 11:28 AM, Scott Cain wrote:

> Hi Hilmar,
>
> I'm using it as Chris suggested: where I had been depending on ""
> overloading.  I think in most places, I am using it on
> Bio::Annotation::SimpleValue to get the string that is the simple  
> value.
> On more complex data types, I am using other methods built into those
> classes to extract useful stuff for inserting into the database.
>
> Scott
>
>
>
> On Thu, 2007-08-30 at 12:07 -0400, Hilmar Lapp wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>>
>> On Aug 30, 2007, at 11:46 AM, Scott Cain wrote:
>>
>>> Good news!  I only had to add a few defineds and a few
>>> display_texts and
>>> I was able to successfully create a database and load the yeast GFF3
>>
>> Scott - I'm a little worried - what are you using the display_text()
>> calls for? There is no method to set a property that would be
>> returned here, so you only have control over that if you override the
>> method in a custom AnnotationI class.
>>
>> 	-hilmar
>> - --
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>>
>>
>>
>>
>>
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.3 (Darwin)
>>
>> iD8DBQFG1us5uV6N2JxL7qsRAicFAKCFCHPORyK9273X8u2/gbaZCNpEHgCeMovA
>> OtZghop1tET5iMqnwXzL+lk=
>> =NVrK
>> -----END PGP SIGNATURE-----
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
> Cold Spring Harbor Laboratory
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug 30 17:45:44 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 12:45:44 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <1188488785.2567.93.camel@localhost.localdomain>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<1188416748.2567.36.camel@localhost.localdomain>
	<8BA0C799-9899-4926-B7F5-24180F764146@uiuc.edu>
	<1188446706.2567.59.camel@localhost.localdomain>
	<CC117C59-55CD-49D1-AB59-7FC0DC31888C@uiuc.edu>
	<1188488785.2567.93.camel@localhost.localdomain>
Message-ID: <B81A709F-5081-4EB0-8778-2ABEDB02BA86@uiuc.edu>

Sounds good but I have yet to commit some of the Ontology changes  
Hilmar and I discussed (whereupon our brave heroes deprecate dblinks  
methods in favor of dbxrefs).  These should be committed fairly soon  
(hour or two).

My guess is the change will be fairly transparent so shouldn't affect  
anything unless you have scripts using those methods directly.
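
To illustrate what "transparent" means here, below is a rough sketch of the usual deprecation pattern: the old method name warns and then hands off to the new one. This is not the actual Bio::Ontology::Term code; the package My::Term and its method bodies are hypothetical stand-ins made up for this example.

```perl
use strict;
use warnings;

package My::Term;    # hypothetical toy class, not the real Bio::Ontology::Term

sub new {
    my ($class) = @_;
    return bless { dbxrefs => [] }, $class;
}

# Newer-style methods: store and return cross-references.
sub add_dbxref {
    my ($self, @refs) = @_;
    push @{ $self->{dbxrefs} }, @refs;
    return scalar @{ $self->{dbxrefs} };
}

sub get_dbxrefs {
    my ($self) = @_;
    return @{ $self->{dbxrefs} };
}

# Older-style names: emit a warning, then delegate to the newer methods.
sub add_dblink {
    my ($self, @refs) = @_;
    warn "add_dblink() is deprecated; use add_dbxref() instead\n";
    return $self->add_dbxref(@refs);
}

sub get_dblinks {
    my ($self) = @_;
    warn "get_dblinks() is deprecated; use get_dbxrefs() instead\n";
    return $self->get_dbxrefs;
}

package main;

my $term = My::Term->new;
$term->add_dbxref('GO:0008150');    # new name, no warning
$term->add_dblink('EC:1.1.1.1');    # old name, warns on STDERR but still works
print join(', ', $term->get_dbxrefs), "\n";
```

Scripts that call the old names keep working under this pattern; they just get a warning until they are updated.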

chris

On Aug 30, 2007, at 10:46 AM, Scott Cain wrote:

> Hi Chris,
>
> Good news!  I only had to add a few defineds and a few  
> display_texts and
> I was able to successfully create a database and load the yeast GFF3
> file.  While I want to do more testing with GFF from other sources,
> clearly, I am 95% of the way there with relatively little work.
>
> Nice job and Thanks!
> Scott
>
>
> On Wed, 2007-08-29 at 23:17 -0500, Chris Fields wrote:
>> It shouldn't, that sounds like the output for add_text().  value()
>> should just return the scalar value.
>>
>> As a note, I added a new method, display_text(), for all
>> Bio::AnnotationI classes which by default replicates the same output
>> that stringification overloads produced.  So you should be able to
>> explicitly call $ann->display_text for any Bio::AnnotationI where you
>> once used an implicit call:
>>
>> # old
>> print "$ann\n";
>>
>> # new
>> print $ann->display_text,"\n";
>>
>> chris
>>
>> On Aug 29, 2007, at 11:05 PM, Scott Cain wrote:
>>
>>> Hi Chris,
>>>
>>> Is there a reason that the value method of the
>>> Bio::Annotation::SimpleValue (and possibly some of its siblings)
>>> is returning "Value: $value"?  It didn't use to have the "Value: "
>>> before,
>>> did it?
>>>
>>> Thanks,
>>> Scott
>>>
>>>
>>> On Wed, 2007-08-29 at 15:13 -0500, Chris Fields wrote:
>>>> I'll probably go ahead and start merging this stuff over to CVS  
>>>> HEAD
>>>> then.  There haven't been any objections so far.
>>>>
>>>> The page I posted outlines the more critical fixes, primarily the
>>>> changes to Bio::Ontology::Term methods (along with relevant  
>>>> code) due
>>>> to inconsistencies in the interface.  The Bio::Annotation classes
>>>> also now throw if you attempt to use them in an overloaded context.
>>>> I also split off SeqFeature::Annotated tests into their own test  
>>>> suite
>>>> (SeqFeatAnnotated.t).
>>>>
>>>> Let me know if there are any problems along the way!
>>>>
>>>> chris
>>>>
>>>> On Aug 29, 2007, at 2:45 PM, Scott Cain wrote:
>>>>
>>>>> Hi Chris,
>>>>>
>>>>> I just wanted to let you know that I was out of town for a few
>>>>> days, but
>>>>> now I'm back and I'm doing testing of GMOD software based on the
>>>>> branch
>>>>> you are working on.  I'll let you know how it goes, but don't  
>>>>> let me
>>>>> stop you if you're confident of your changes.  I'm sure whatever goes
>>>>> wrong, it will just point out holes in the FeatureIO tests (I'm  
>>>>> sure
>>>>> there are plenty) and will require hopefully minimal changes on my
>>>>> end.
>>>>>
>>>>> Thanks for your considerable efforts on this!  (Regardless of how
>>>>> much
>>>>> work it makes for me :-)
>>>>> Scott
>>>>>
>>>>>
>>>>> On Tue, 2007-08-28 at 16:05 -0500, Chris Fields wrote:
>>>>>> I'm now wrapping up the Feature/Annotation rollback.  I will
>>>>>> probably
>>>>>> start merging back to the main branch in the next day or two, as
>>>>>> soon as interested parties (*cough*devs*cough*) look over the  
>>>>>> last
>>>>>> batch of changes.
>>>>>>
>>>>>> http://www.bioperl.org/wiki/
>>>>>> Feature_Annotation_rollback#Fourth_Round
>>>>>>
>>>>>> I have also added a small benchmark test which indicates a  
>>>>>> decrease
>>>>>> in parsing time in SeqIO::genbank with all tests passing.  I  
>>>>>> expect
>>>>>> this will translate over to any Bio::SeqFeature::Generic-using
>>>>>> class
>>>>>> (open mouth, prepare to insert foot....).
>>>>>>
>>>>>> It is also possible there are still some instances where
>>>>>> overloading
>>>>>> is expected lurking about in the ~1000 or so modules, so I'll  
>>>>>> leave
>>>>>> the exceptions I added to all Bio::AnnotationI; we can remove  
>>>>>> them
>>>>>> down the line, maybe prior to rel1.6, after more tests are  
>>>>>> added or
>>>>>> if they get particularly annoying.  My guess is I caught  
>>>>>> 99.99% of
>>>>>> them (prepare to insert other foot....).
>>>>>>
>>>>>> The key change in this last round is the addition of several  
>>>>>> class
>>>>>> *dbxref* methods to Bio::Ontology::Term and
>>>>>> Bio::Annotation::OntologyTerm, all of which are capable of  
>>>>>> working
>>>>>> with either DBLink instances or simple scalars.  This was  
>>>>>> primarily
>>>>>> done in order to clear up inconsistencies in the older *dblink*
>>>>>> methods, which were ambiguous (some indicated simple scalar
>>>>>> arguments, others DBLink objects); operator overloading was used
>>>>>> extensively in these cases, which led to several issues.  I have
>>>>>> added deprecation warnings to the older methods which now map to
>>>>>> using the newer methods.  All tests pass with the exception of a
>>>>>> few
>>>>>> already failing on the MAIN branch; the single test which needs
>>>>>> to be
>>>>>> fixed is a round-tripping error in swiss.t (now a TODO), which
>>>>>> can be
>>>>>> fixed after merging back.
>>>>>>
>>>>>> Please respond to this if there are any questions or if I need to
>>>>>> clarify the changes I made a bit more.
>>>>>>
>>>>>> chris
>>>>> -- 
>>>>> ------------------------------------------------------------------------
>>>>> Scott Cain, Ph. D.
>>>>> cain at cshl.edu
>>>>> GMOD Coordinator (http://www.gmod.org/)
>>>>> 216-392-3087
>>>>> Cold Spring Harbor Laboratory
>>>>
>>>> Christopher Fields
>>>> Postdoctoral Researcher
>>>> Lab of Dr. Robert Switzer
>>>> Dept of Biochemistry
>>>> University of Illinois Urbana-Champaign
>>>>
>>>>
>>>>
>>> -- 
>>> ------------------------------------------------------------------------
>>> Scott Cain, Ph. D.
>>> cain.cshl at gmail.com
>>> GMOD Coordinator (http://www.gmod.org/)
>>> 216-392-3087
>>> Cold Spring Harbor Laboratory
>>>
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                    
> cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/)                      
> 216-392-3087
> Cold Spring Harbor Laboratory
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Thu Aug 30 18:03:29 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 13:03:29 -0500
Subject: [Bioperl-l] Feature/Annotation rollback finished
In-Reply-To: <DF84C537-2860-48E1-9979-E1101C4D5826@gmx.net>
References: <AB1D2F2A-137F-4C3E-8381-44E142D12508@uiuc.edu>
	<E9E4C379-A982-4F1D-AB22-6A31DBE21388@gmx.net>
	<A57BD5F0-714D-4C9C-8732-69153A5BBE02@uiuc.edu>
	<DF84C537-2860-48E1-9979-E1101C4D5826@gmx.net>
Message-ID: <D4E8E9D3-BB64-48C5-8273-5C6C04DC8DE9@uiuc.edu>


On Aug 30, 2007, at 11:20 AM, Hilmar Lapp wrote:

>> ...It describes one method, ontology_term(), which returns a  
>> Bio::Ontology::TermI.  This is similar to  
>> SeqFeature::Annotated::type(), which returns a  
>> Bio::Annotation::OntologyTerm (a Bio::Ontology::TermI).  My  
>> thought is to simply deprecate type() in favor of  
>> TypedSeqFeatureI::ontology_term().
>
> I think we'll want to think about that. type() gives me some  
> indication of what the returned value might represent, whereas  
> ontology_term() only tells me about the type of the returned object.
>
> You could make ontology_term() accept a context argument, such as
>
> 	my $feature_type = $typedFeat->ontology_term(-context => -type);
>
> Or you could name the method(s) more explicitly, such as
>
> 	my $feature_type = $typedFeat->type_term();
> 	my $feature_source = $typedFeat->source_term();
> 	my @annTerms = $typedFeat->get_Annotations('Gene Ontology');
>
> Am I making sense?
>
> 	-hilmar

I think so; I'll have to look at what is returned from type() in some  
more detail.

It appears that the two main culprits for passing strings off to  
Ontology::Term are the Bio::OntologyIO::obo and  
Bio::OntologyIO::dagflat parsers.  I can add some code in there to  
change those to DBLinks prior to creating Ontology::Term instances,  
which should clean that up.

chris


From cjfields at uiuc.edu  Fri Aug 31 00:57:15 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 30 Aug 2007 19:57:15 -0500
Subject: [Bioperl-l] Bio::Expression & Re:  ReseqChip,
	module/package name
In-Reply-To: <46CF27F4.8030608@arcor.de>
References: <03D7F0EB-3BC2-4988-B67F-09C4225EAE13@uiuc.edu>	<46CEAD83.2050904@arcor.de>	<9824900.1187973171940.JavaMail.ngmail@webmail17>	<A3DEC410-B89F-4C48-B843-F2BD8AA0A514@uiuc.edu>
	<BE442226-9FDF-43A4-BCA6-398652019D31@gmx.net>
	<46CF27F4.8030608@arcor.de>
Message-ID: <4ED2E2B0-8E36-4500-A4C9-B8C333E14614@uiuc.edu>


On Aug 24, 2007, at 1:48 PM, marian wrote:

> ...
> Bio::Microarray::Tools::MitoChip would be OK to me. I merely meant,  
> that it
> isnt an expression chip and you also wont/cant analyze expression  
> data with
> the tool I am talking about.
>
> Marian

Okay, I have everything working from bugzilla:

http://bugzilla.open-bio.org/show_bug.cgi?id=2332

I suppose what we need to do next is get a test script going.  I'll  
look at the script attached to see if we can get something going that  
is fairly quick.

chris


From avilella at gmail.com  Fri Aug 31 09:29:43 2007
From: avilella at gmail.com (Albert Vilella)
Date: Fri, 31 Aug 2007 10:29:43 +0100
Subject: [Bioperl-l] display protein/CDS multiple sequence alignments with
	exon boundaries
Message-ID: <358f4d650708310229i421bb1d7w5f64e3ded5f92618@mail.gmail.com>

Hi,

Probably a bit of a long shot, but does anyone have code for displaying
protein or CDS multiple sequence alignments with the exon boundaries of
each gene in the alignment?

Something in the BioPerl world without funky external dependencies. I
think it would be an awesome addition to the HOWTOs.

Currently, the Bio::Graphics HOWTO has cDNA-to-genome mapping scripts and
BLAST output scripts, but I couldn't find code for dealing with multiple
sequence alignments.

Cheers,

    Albert.


From neetisomaiya at gmail.com  Fri Aug 31 09:41:51 2007
From: neetisomaiya at gmail.com (neeti somaiya)
Date: Fri, 31 Aug 2007 15:11:51 +0530
Subject: [Bioperl-l] need help
Message-ID: <764978cf0708310241i1baf6feeoc808c396125c078e@mail.gmail.com>

Hi,

I am trying to parse the compound
(ftp://ftp.genome.jp/pub/kegg/ligand/compound/compound) and glycan
(ftp://ftp.genome.jp/pub/kegg/ligand/glycan/glycan) files of KEGG using
BioPerl. I just want the KEGG ID of each compound/glycan and its names and
synonyms, if any. Bio::SeqIO is giving me problems: I am not able to fetch
the ID and name. Can someone help me with this?

Thanks.

-- 
-Neeti
Even my blood says, B positive


From cjfields at uiuc.edu  Fri Aug 31 14:51:51 2007
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 31 Aug 2007 09:51:51 -0500
Subject: [Bioperl-l] need help
In-Reply-To: <764978cf0708310241i1baf6feeoc808c396125c078e@mail.gmail.com>
References: <764978cf0708310241i1baf6feeoc808c396125c078e@mail.gmail.com>
Message-ID: <BD54A833-D2D3-4AE5-8517-BB060F3C132E@uiuc.edu>

I don't believe Bio::SeqIO::kegg will parse those files (they aren't  
sequence files).  The format it recognizes is:

http://www.bioperl.org/wiki/KEGG_sequence_format

for the files found in the subdirectories here:

ftp://ftp.genome.ad.jp/pub/kegg/genes/organisms

I would just build a custom parser if all you're interested in is
id/names/synonyms.  It'll be much faster.
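
For what it's worth, such a parser can be quite short. The following is only a minimal sketch, assuming the flat-file layout where each field keyword starts in column 1, continuation lines are indented, each name sits on its own line with a trailing ';' separator, and records end with '///'. The subroutine name parse_kegg_records and the abbreviated sample records are made up for illustration, so check it against the real files before relying on it.

```perl
use strict;
use warnings;

# Read KEGG-style flat-file records from a filehandle.
# Returns a list of hashrefs: { id => 'C00001', names => ['H2O', 'Water'] }.
sub parse_kegg_records {
    my ($fh) = @_;
    my (@records, $id, @names, $field);
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;
        if ($line =~ m{^///}) {              # end of the current record
            push @records, { id => $id, names => [@names] } if defined $id;
            ($id, $field) = (undef, undef);
            @names = ();
            next;
        }
        my $value;
        if ($line =~ /^(\S+)\s*(.*)$/) {     # new field: keyword in column 1
            ($field, $value) = ($1, $2);
        }
        elsif ($line =~ /^\s+(.*)$/) {       # continuation of the previous field
            $value = $1;
        }
        next unless defined $field && defined $value;
        if ($field eq 'ENTRY') {
            ($id) = $value =~ /^(\S+)/;      # first token is the KEGG ID
        }
        elsif ($field eq 'NAME') {
            $value =~ s/;\s*$//;             # drop the trailing separator
            push @names, $value if length $value;
        }
    }
    # Flush a final record that lacks a '///' terminator.
    push @records, { id => $id, names => [@names] } if defined $id;
    return @records;
}

# Abbreviated sample modeled on the compound file layout (illustrative only).
my $sample = <<'END';
ENTRY       C00001                      Compound
NAME        H2O;
            Water
FORMULA     H2O
///
ENTRY       C00002                      Compound
NAME        ATP;
            Adenosine 5'-triphosphate
///
END

open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
for my $rec (parse_kegg_records($fh)) {
    print join("\t", $rec->{id}, join('; ', @{ $rec->{names} })), "\n";
}
close $fh;
```

To run it on the downloaded file, open that file instead of the in-memory sample and pass the handle to the same subroutine. Names that wrap across several lines would need extra handling that this sketch does not attempt.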

chris

On Aug 31, 2007, at 4:41 AM, neeti somaiya wrote:

> Hi,
>
> I am trying to parse the compound (
> ftp://ftp.genome.jp/pub/kegg/ligand/compound/compound) and glycan (
> ftp://ftp.genome.jp/pub/kegg/ligand/glycan/glycan) files of KEGG using
> bioperl.
> I just want the kegg id of the compound/glycan and its names and  
> synonyms if
> any.
> Bio::SeqIO is giving some problem, I am not able to fetch the id  
> and name.
> Can someone help me with this.
>
> Thanks.
>
> -- 
> -Neeti
> Even my blood says, B positive

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign