From kvddrift at earthlink.net  Sun Feb  1 12:08:36 2004
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Sun Feb  1 12:15:44 2004
Subject: [Bioperl-l] Bio::UnivAln mourned
In-Reply-To: <200402010251.i112pSEt031610@portal.open-bio.org>
References: <200402010251.i112pSEt031610@portal.open-bio.org>
Message-ID: <45F2F0CC-54D9-11D8-A807-003065A5FDCC@earthlink.net>

> But I just migrated from linux to OS X, and my latest install (bioperl
> 1.4) was virgin to my new machine.
>
> After my scripts carped I placed UnivAln.pm from an old bioperl dist (I
> used 0.7.0) into my bioperl-1.x build tree (install-dist/Bio) and added
> its name to my MANIFEST before running (or re-running) perl Makefile.PL.
>
> Now it works without apparent conflicts in my set-up, YMMV.
>
> I understand that maybe the demise of the module is due to its lack of
> a maintainer keeping its guts up-to-date with bioperl root structure.
> I would perhaps be willing to help here. Would there be objections to
> its revival? Or are there plans to expand existing modules?

Hi,

I am the current maintainer of the fink package for bioperl. I
submitted the new version (1.4) to fink a few weeks ago.
Unfortunately, the powers that be have not yet put the package into the
fink cvs, so it is not yet available. The fink team is currently
restructuring the way perl modules get installed, and it seems that the
bioperl package is on hold until this is resolved. If you want, I can
mail you the info file for bioperl offlist, so you can use it with
fink.

- Koen.

From lstein at cshl.edu  Mon Feb  2 04:07:10 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Mon Feb  2 04:13:56 2004
Subject: [Bioperl-l] Filehandle interface in Bio::AlignIO: broken in 1.4?
In-Reply-To: <Pine.LNX.4.50.0401311001440.23804-100000@tenero.duhs.duke.edu>
References: <CC0FFF46-53F7-11D8-ACE8-000A95C4E89C@icm.uu.se>
	<Pine.LNX.4.50.0401311001440.23804-100000@tenero.duhs.duke.edu>
Message-ID: <200402021107.10368.lstein@cshl.edu>

So sorry about breaking the newFh method.  You can restore the
previous behavior by passing \*ARGV to the -fh argument when you
create the SeqIO object.

Unfortunately the earlier behavior made it impossible to create a
SeqIO object that was write-only, and as a result some higher-level
modules were gobbling STDIN inappropriately.
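
For the AlignIO case from the original report, the workaround might look
roughly like the sketch below. It assumes that newFh() hands its arguments
on to new(), so that -fh is honoured there too. The helper name
argv_alignment_stream is made up for illustration, and Bio::AlignIO is
loaded lazily only so that the file compiles where BioPerl is missing.

```perl
use strict;
use warnings;

# Build an alignment stream that reads from the files named on the
# command line, or from STDIN when none are given (the behaviour of
# Perl's magic <> operator), by handing \*ARGV to -fh explicitly.
# argv_alignment_stream is a hypothetical helper, not part of BioPerl.
sub argv_alignment_stream {
    my ($format) = @_;
    require Bio::AlignIO;    # loaded lazily; needs BioPerl installed
    return Bio::AlignIO->newFh(-fh => \*ARGV, -format => $format);
}

# Typical use, mirroring the loop from the original report:
#   my $stream = argv_alignment_stream('fasta');
#   while ( my $aln = <$stream> ) { ... }
```

Objects created without -fh or -file are left alone, so write-only
streams no longer touch STDIN.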

Lincoln

On Saturday 31 January 2004 05:07 pm, Jason Stajich wrote:
> Dave -
>
> This is due to a change Lincoln made to not default to the magic <>
> operator when no filename is provided.  It broke a lot of my
> scripts too.
>
> [This is basically part of a similar question that Peter was asking
> wrt SeqIO]
>
> I would like to see it come back somehow but I am not sure how, as
> it causes certain things to block during the tests.
>
> It has nothing to do with the newFh method but with Bio::Root::IO
> in the _readline method.
>
> <     my $fh = $self->_fh || \*ARGV;
> ---
> >     my $fh = $self->_fh or return;
>
> If you want the old functionality just change that line back in
> your code for the time being.
>
>
> --jason
>
> On Sat, 31 Jan 2004, Dave NO SPAM Ardell wrote:
> > Hi,
> >
> > I searched for this in docs and the mailing list but didn't find
> > anything. Also looked quickly in bugzilla and the Changelist with
> > no result. So sorry if I am rehashing something known.
> >
> > Was surprised after installing 1.4 that filehandle functionality
> > documented as part of the interface in AlignIO seems broken. That
> > is to say,
> >
> > 1: $stream = Bio::AlignIO->newFh('-format' => "$opt_i"); # read from standard input
> > 2: while ( my $aln = <$stream> ) {
> >
> > doesn't work anymore. It doesn't die, but the diamond operator
> > returns null regardless of input. During debugging,
> >
> > gdb> x $stream
> >
> > after line 1 above, executing with sequence input on STDIN
> > gives:
> >
> > ------ BEGIN DEBUGGER OUTPUT
> > 0  GLOB(0xbbd178)
> >     -> *Symbol::GEN0
> > Can't locate object method "FILENO" via package
> > "Bio::AlignIO::fasta" at /System/Library/Perl/5.8.1/dumpvar.pl
> > line 238.
> >          dumpvar::unwrap('GLOB(0xbbd178)',3,-2) called at
> > /System/Library/Perl/5.8.1/dumpvar.pl line 118
> >          dumpvar::DumpElem('GLOB(0xbbd178)',3,-2) called at
> > /System/Library/Perl/5.8.1/dumpvar.pl line 223
> >          dumpvar::unwrap('ARRAY(0xbbd0e8)',0,-1) called at
> > /System/Library/Perl/5.8.1/dumpvar.pl line 33
> >          main::dumpValue('ARRAY(0xbbd0e8)',-1) called at
> > /System/Library/Perl/5.8.1/perl5db.pl line 5270
> >          DB::dumpit('GLOB(0x143988)','ARRAY(0xbbd0e8)') called at
> > /System/Library/Perl/5.8.1/perl5db.pl line 647
> >          DB::eval called at /System/Library/Perl/5.8.1/perl5db.pl
> > line 3314
> >          DB::DB called at /Users/dave/Data/mybin/pi line 44
> >
> > ----- END DEBUGGER OUTPUT -----
> >
> > I see from recent scripts, for instance, bp_sreformat.pl, and
> > from documentation that stdin/stdout functionality can be had
> > through AlignIO::new
> >
> > But for my quickfix, I reinstalled bioperl 1.2.3
> >
> > Maybe the change is intentional.
> > Although I guess I would miss the filehandle interface,
> > I could get over it =)
> >   Maybe the change should be documented though.
> >
> > Apologies if it is and I missed it.
> >
> > Dave
> >
> >
> > --+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--
> > David Ardell, Asst. Professor.        tel: 46 (0) 18 471 6694
> > Linnaeus Centre for Bioinformatics    fax: 46 (0) 18 471 6698
> > Uppsala University Biomedical Center  http://www.lcb.uu.se/~dave
> > Husargatan 3, Box 598,
> > SE 751 24 Uppsala SWEDEN.
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
From Bernhard.Schmalhofer at biomax.de  Mon Feb  2 04:15:20 2004
From: Bernhard.Schmalhofer at biomax.de (Bernhard Schmalhofer)
Date: Mon Feb  2 04:21:54 2004
Subject: [Bioperl-l] Testing BioPerl objects for equality
In-Reply-To: <Pine.OSX.4.44.0401290935590.2412-100000@ewan-birneys-computer.local>
References: <Pine.OSX.4.44.0401290935590.2412-100000@ewan-birneys-computer.local>
Message-ID: <401E1528.5060309@biomax.de>

Ewan Birney wrote:
> 
> On Thu, 29 Jan 2004, Peter van Heusden wrote:
> 
> 
>>I've got an idea for testing where I'd like to 'round-trip' through
>>SeqIO: read in from a file on disk, write out again with write_seq() and
>>then read in the file written by write_seq() and compare the two
>>sequence objects. If they aren't equal, it means we've got a problem.
> 
> 
> That sounds like a great idea... we've always had problems with diff'ing
> the files because of whitespace issues, but diff'ing the objects sounds
> great.
> 
> 
>>To make this work requires some kind of equals() method on Seq,
>>SeqFeature, etc. This doesn't seem to be there at the moment - or am I
>>missing something? Maybe there should probably be some kind of
>>Bio::ComparableI interface which provides an equals() abstract method.
>>
> 
If the round trip starts from a file in a specific format, shouldn't
it be possible to compare the data structures of the sequence object
directly?
I was thinking of using something like Test::More::is_deeply(), which tells
you where the data structures start to become different.
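
A minimal sketch of the idea, using only core Perl: the record is a plain
hash, and to_fasta() / from_fasta() are hypothetical stand-ins for a SeqIO
writer and reader, not BioPerl calls. With real sequence objects,
is_deeply() would also compare internal bookkeeping such as file handles,
so some attributes would probably have to be stripped first.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for a writer: serialise a record to FASTA text.
sub to_fasta {
    my ($rec) = @_;
    return ">$rec->{id} $rec->{desc}\n$rec->{seq}\n";
}

# Hypothetical stand-in for a reader: parse FASTA text back into a record.
sub from_fasta {
    my ($text) = @_;
    my ($header, @lines) = split /\n/, $text;
    my ($id, $desc) = $header =~ /^>(\S+)\s*(.*)$/;
    return { id => $id, desc => $desc, seq => join('', @lines) };
}

my $original = { id => 'seq1', desc => 'test sequence', seq => 'ACGTACGT' };
my $copy     = from_fasta(to_fasta($original));

# is_deeply() walks both structures and, on failure, reports the first
# place where they differ.
is_deeply($copy, $original, 'record survives a write/read round trip');

done_testing();
```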

CU, Bernhard
-- 
**************************************************
Bernhard Schmalhofer
Senior Developer
Biomax Informatics AG
Lochhamer Str. 11
82152 Martinsried, Germany
Tel: +49 89 895574-839
Fax: +49 89 895574-825
eMail: Bernhard.Schmalhofer@biomax.com
Website: www.biomax.com
**************************************************

From heikki at ebi.ac.uk  Mon Feb  2 06:41:23 2004
From: heikki at ebi.ac.uk (Heikki Lehvaslaiho)
Date: Mon Feb  2 06:47:25 2004
Subject: [Bioperl-l] working with large alignments
Message-ID: <200402021141.23420.heikki@ebi.ac.uk>


Albert Vilella, who is visiting me here at EBI, works with really big genomic
sequence alignments. I've committed several of his modules into cvs for that
purpose. The most important additions are:

* Bio::Seq::LargeLocatableSeq
    Bio::RangeI compliant Bio::Seq::LargePrimarySeq 
    uses File::Temp for seq storing
* Bio::Seq::LargeSeqI
    Interface class for LargeSeq implementations
* Bio::AlignIO::largemultifasta
    IO class creating Bio::Seq::LargeLocatableSeq and SimpleAlign objects


The LargeLocatableSeq is based on code from Bio::Seq::LargePrimarySeq.
Everything seems to work, but if we run the tests added to the end of the
t/AlignIO.t file with larger files, the process still uses a large amount
of memory. We'd be interested in hearing from anyone who can suggest
improvements.

If you are willing to test the code with larger data sets, I've put two
files here:
 
http://www.ebi.ac.uk/~lehvasla/bioperl/medium.largemultifasta (1.3M)
http://www.ebi.ac.uk/~lehvasla/bioperl/large.largemultifasta (31M)

Thanks,

	-Heikki  and Albert
-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho    heikki_at_ebi ac uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________
From m_conte at hotmail.com  Mon Feb  2 07:11:25 2004
From: m_conte at hotmail.com (matthieu CONTE)
Date: Mon Feb  2 07:17:39 2004
Subject: [Bioperl-l] Bio ::seqIO ::tigr
Message-ID: <BAY12-F99vux4Jf7zyn000077a6@hotmail.com>


Ok...
But the method "get_BioDatabaseAdaptor" doesn't exist in the
Bio::DB::BioSQL::DBAdaptor module (documentation). I didn't find it on the
bioperl-db web page.
    Any idea?

Thanks


Matthieu CONTE
M. Sc. in Bioinformatics from SIB

00 33 06.68.90.28.70
m_conte@hotmail.com


>From: Hilmar Lapp <hlapp@gmx.net>
>To: "matthieu CONTE" <m_conte@hotmail.com>
>CC: bioperl-l@bioperl.org
>Subject: Re: [Bioperl-l] Bio ::seqIO ::tigr
>Date: Wed, 28 Jan 2004 08:55:15 -0800
>
>I suspect you have an old version of bioperl-db, or a version mix-up. You 
>need to download and install the latest revision from CVS for bioperl-db.
>
>Note that if the root of the problem is with the pir parser then 
>load_seqdatabase.pl will not cure it, as it just uses any Bio::SeqIO 
>compliant parser to provide the input sequences. If the parser is broken 
>then there won't be input ... It just saves you the round-trip (and 
>possible errors associated with it) of going through swissprot format.
>
>	-hilmar
>
>On Wednesday, January 28, 2004, at 02:07  AM, matthieu CONTE wrote:
>
>>Ok , I try directly with "load_seqdatabase.pl"  but there is another 
>>problem.....
>>
>>[conte@bearn scripts]$  perl load_seqdatabase.pl -dbuser biosql -dbpass 
>>biosql -format tigr tigr 
>>/home/conte/pipeline_orthologues/data/orysa_tigr.txt
>>
>>Can't locate object method "get_BioDatabaseAdaptor" via package 
>>"Bio::DB::BioSQL::DBAdaptor" at load_seqdatabase.pl line 84.
>>
>>Indeed this method does not exist in Bio::DB::BioSQL::DBAdaptor....
>>
>>
>>
>>
>>Matthieu CONTE
>>M. Sc. in Bioinformatics from SIB
>>
>>00 33 06.68.90.28.70
>>m_conte@hotmail.com
>>
>>
>>
>>
>>
>>>From: Hilmar Lapp <hlapp@gmx.net>
>>>To: "matthieu CONTE" <m_conte@hotmail.com>
>>>CC: bioperl-l@bioperl.org
>>>Subject: Re: [Bioperl-l] Bio ::seqIO ::tigr Date: Tue, 27 Jan 2004 
>>>09:31:39 -0800
>>>
>>>A question aside: why do you want to convert to swissprot in order to 
>>>load into biosql? (load_seqdatabase.pl can use any SeqIO reader.)
>>>
>>>	-hilmar
>>>
>>>On Tuesday, January 27, 2004, at 02:50  AM, matthieu CONTE wrote:
>>>
>>>>I am currently trying to use the Bio::SeqIO::tigr module.
>>>>My objective is to download the whole rice genome from TIGR (address 
>>>>below) and to integrate it into my BioSQL DB.
>>>>For this I am trying to convert the tigr format to swiss format with the 
>>>>script below
>>>>
>>>>
>>>>use Bio::SeqIO;
>>>>
>>>>my $in = Bio::SeqIO->new(-file 
>>>>=>'</home/conte/pipeline_orthologues/data/orysa_tigr.txt', -format 
>>>>=>'tigr');
>>>>
>>>>my $out = Bio::SeqIO->new(-file => 
>>>>'>/home/conte/pipeline_orthologues/data/orysa_swiss.txt' , 
>>>>-format=>'swiss');
>>>>
>>>>print $out $_ while <$in>;
>>>>
>>>>I obtain:
>>>>
>>>>------------ EXCEPTION  -------------
>>>>MSG: [19]Required <AUTHOR_LIST> missing
>>>>STACK Bio::SeqIO::tigr::throw 
>>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:1338
>>>>STACK Bio::SeqIO::tigr::_process_header 
>>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:700
>>>>STACK Bio::SeqIO::tigr::_process_assembly 
>>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:535
>>>>STACK Bio::SeqIO::tigr::_process_tigr 
>>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:453
>>>>STACK Bio::SeqIO::tigr::_process 
>>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:420
>>>>STACK Bio::SeqIO::tigr::_initialize 
>>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:90
>>>>STACK Bio::SeqIO::new 
>>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:358
>>>>STACK Bio::SeqIO::new 
>>>>/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:378
>>>>STACK toplevel get_bioseq_tigr.pl:8
>>>>
>>>>Could you please tell me if there is a problem with the parser or with 
>>>>the input data format of Tigr?
>>>>
>>>>Thanks in advance
>>>>
>>>>
>>>>
>>>>
>>>>Matthieu CONTE
>>>>m_conte@hotmail.com
>>>>
>>>>
>>>--
>>>-------------------------------------------------------------
>>>Hilmar Lapp                            email: lapp at gnf.org
>>>GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
>>>-------------------------------------------------------------
>>>
>>>
>>
>>
>--
>-------------------------------------------------------------
>Hilmar Lapp                            email: lapp at gnf.org
>GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
>-------------------------------------------------------------
>
>

From hlapp at gmx.net  Mon Feb  2 10:55:40 2004
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon Feb  2 11:01:57 2004
Subject: [Bioperl-l] Bio ::seqIO ::tigr
In-Reply-To: <BAY12-F99vux4Jf7zyn000077a6@hotmail.com>
Message-ID: <3FF205D0-5598-11D8-8C77-000A959EB4C4@gmx.net>

That's why I said you seem to have a version mix-up in addition.  
get_BioDatabaseAdaptor is part of the 0.1 API, which was retired more  
than a year ago.

	-hilmar

On Monday, February 2, 2004, at 04:11  AM, matthieu CONTE wrote:

>
> Ok...
> But the method "get_BioDatabaseAdaptor" doesn't exist in the
> Bio::DB::BioSQL::DBAdaptor module (documentation). I didn't find it on
> the bioperl-db web page.
>    Any idea?
>
> Thanks
>
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From mitchell at odin.mdacc.tmc.edu  Mon Feb  2 16:45:04 2004
From: mitchell at odin.mdacc.tmc.edu (James Mitchell)
Date: Mon Feb  2 16:51:22 2004
Subject: [Bioperl-l] Bio::Ontology - parsing GO files
Message-ID: <NEBBIMLJLIAEEKIGMMCKKEDJCHAA.mitchell@odin.mdacc.tmc.edu>

I'm using Bio::Ontology modules to access GO tree information, i.e. the
ancestors/descendants of a given node.  I'm using it like this:
---
use strict;
use Bio::Ontology::SimpleGOEngine;

my $gendir = $ENV{GeneLink_Dir};
my $deffile = $gendir . "GO.defs";
my $comfile = $gendir . "component.ontology";
my $funfile = $gendir . "function.ontology";
my $profile = $gendir . "process.ontology";
my $parser = Bio::Ontology::SimpleGOEngine->new
	( -defs_file => $deffile,
	  -files     => [$comfile,
	                 $funfile,
	                 $profile] );
my $engine = $parser->parse();
---
I'm getting this error though:
Can't locate object method "parse" via package
"Bio::Ontology::SimpleGOEngine" (
perhaps you forgot to load "Bio::Ontology::SimpleGOEngine"?)
---
Is this the correct method for parsing GO files?  I'm using version 1.4 on
Windows.

thanks,
James


From hlapp at gmx.net  Tue Feb  3 01:32:58 2004
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue Feb  3 01:39:18 2004
Subject: [Bioperl-l] Bio::Ontology - parsing GO files
In-Reply-To: <NEBBIMLJLIAEEKIGMMCKKEDJCHAA.mitchell@odin.mdacc.tmc.edu>
Message-ID: <CED5DDAF-5612-11D8-90AF-000A959EB4C4@gmx.net>


On Monday, February 2, 2004, at 01:45  PM, James Mitchell wrote:

> my $parser = Bio::Ontology::SimpleGOEngine->new
>


Is this still in the documentation? If so, I apologize. You parse
ontologies analogously to other IO APIs in bioperl:

	use Bio::OntologyIO;

	my $ont_stream = Bio::OntologyIO->new(-format    => 'go',
	                                      -files     => [....],
	                                      -defs_file => $deffile);

and then

	while(my $ont = $ont_stream->next_ontology()) {
		# do something with $ont (it's a Bio::Ontology::OntologyI)
	}

Most ontology streams will only have one ontology ever. So, for GO you 
could as well say

	my $go_ont = $ont_stream->next_ontology();
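
Put together for the ancestors/descendants use case, a rough sketch might
look like the following. The helper names load_go_ontologies and
print_descendants are made up for illustration, and the loop collects
every ontology the stream yields rather than assuming there is exactly
one. Bio::OntologyIO is loaded lazily only so that the file compiles
without BioPerl installed.

```perl
use strict;
use warnings;

# Read all ontologies from a GO definitions file plus one or more
# ontology files. load_go_ontologies is a hypothetical helper name.
sub load_go_ontologies {
    my ($defs_file, @ontology_files) = @_;
    require Bio::OntologyIO;    # loaded lazily; needs BioPerl installed
    my $stream = Bio::OntologyIO->new(
        -format    => 'go',
        -defs_file => $defs_file,
        -files     => \@ontology_files,
    );
    my @ontologies;
    while ( my $ont = $stream->next_ontology() ) {
        push @ontologies, $ont;
    }
    return @ontologies;
}

# Print each root term of an ontology followed by all of its descendants.
sub print_descendants {
    my ($ont) = @_;
    for my $root ( $ont->get_root_terms() ) {
        print $root->identifier(), "\t", $root->name(), "\n";
        for my $term ( $ont->get_descendant_terms($root) ) {
            print "  ", $term->identifier(), "\t", $term->name(), "\n";
        }
    }
}

# Typical use, with the file names from the original question:
#   my @onts = load_go_ontologies($deffile, $comfile, $funfile, $profile);
#   print_descendants($_) for @onts;
```

get_ancestor_terms() works the same way in the other direction.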

Hth,

	-hilmar

-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From lstein at cshl.edu  Tue Feb  3 04:37:34 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Tue Feb  3 04:44:34 2004
Subject: [Bioperl-l] Testing BioPerl objects for equality
In-Reply-To: <401E1528.5060309@biomax.de>
References: <Pine.OSX.4.44.0401290935590.2412-100000@ewan-birneys-computer.local>
	<401E1528.5060309@biomax.de>
Message-ID: <200402031137.34626.lstein@cshl.edu>

I think that's a great idea.  I hadn't known about is_deeply().
There's also a Test::Differences module that does something similar.

Lincoln

On Monday 02 February 2004 11:15 am, Bernhard Schmalhofer wrote:
> Ewan Birney wrote:
> > On Thu, 29 Jan 2004, Peter van Heusden wrote:
> >>I've got an idea for testing where I'd like to 'round-trip'
> >> through SeqIO: read in from a file on disk, write out again with
> >> write_seq() and then read in the file written by write_seq() and
> >> compare the two sequence objects. If they aren't equal, it means
> >> we've got a problem.
> >
> > That sounds like a great idea... we've always had problems with
> > diff'ing the files because of whitespace issues, but diff'ing the
> > objects sounds great.
> >
> >>To make this work requires some kind of equals() method on Seq,
> >>SeqFeature, etc. This doesn't seem to be there at the moment - or
> >> am I missing something? Maybe there should probably be some kind
> >> of Bio::ComparableI interface which provides an equals()
> >> abstract method.
>
> If the round trip starts from a file in a specific format,
> shouldn't it be possible to compare the data structures of the
> sequence object directly?
> I was thinking of using something like Test::More::is_deeply(), which
> tells you where the data structures start to become different.
>
> CU, Bernhard

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
From billthebrute at yahoo.fr  Tue Feb  3 08:04:26 2004
From: billthebrute at yahoo.fr (william ritchie)
Date: Tue Feb  3 08:10:39 2004
Subject: [Bioperl-l] Splice site
Message-ID: <20040203130426.21627.qmail@web25209.mail.ukl.yahoo.com>

Hi,
I'm looking for an implementation of a good splice
site prediction algorithm (like netgene or sitevideo).
Does anyone have any suggestions?

Thanks.

From Sebastien.Moretti at igs.cnrs-mrs.fr  Tue Feb  3 08:49:02 2004
From: Sebastien.Moretti at igs.cnrs-mrs.fr (Sebastien Moretti)
Date: Tue Feb  3 08:52:19 2004
Subject: [Bioperl-l] Uniprot
In-Reply-To: <20040203130426.21627.qmail@web25209.mail.ukl.yahoo.com>
References: <20040203130426.21627.qmail@web25209.mail.ukl.yahoo.com>
Message-ID: <200402031449.03089.Sebastien.Moretti@igs.cnrs-mrs.fr>

Hello
Is there a BioPerl module to send a request to the UniProt db?
My script sends a request to SwissProt:

	#!/usr/bin/perl

	use strict;
	use warnings;
	use Bio::DB::SwissProt;
	use Bio::SeqIO;

	# accession number is taken from the command line
	my $acc = $ARGV[0];

	# fetch the entry from SwissProt
	my $db  = Bio::DB::SwissProt->new;
	my $seq = $db->get_Seq_by_acc($acc);

	# with no -file or -fh argument, SeqIO writes to STDOUT
	my $out = Bio::SeqIO->new(-format => 'swiss');
	$out->write_seq($seq);

	exit;

But how can I do the same with UniProt ?
Thanks

-- 
Sebastien MORETTI
CNRS - IGS
31 chemin Joseph Aiguier
13402 Marseille cedex 20, FRANCE
tel. 04 91 16 44 55 - 06 61 88 59 00
From Wiepert.Mathieu at mayo.edu  Tue Feb  3 09:49:49 2004
From: Wiepert.Mathieu at mayo.edu (Wiepert, Mathieu)
Date: Tue Feb  3 09:56:35 2004
Subject: [Bioperl-l] Difference between
Message-ID: <2F41CC6C9777D311ACBD009027B108EA06E9AFD5@excsrv32.mayo.edu>

Hi,

When I reported this, I was told that it was actually a minor bug, and that they would look into it.  It didn't sound like something they were going to address any time soon, and I never followed up, so I guess it is still the same issue...

-mat

> -----Original Message-----
> From: Alan Li [mailto:immunoguest@hotmail.com]
> Sent: Saturday, January 31, 2004 5:26 PM
> To: Wiepert, Mathieu; bioperl-l@bioperl.org
> Subject: RE: [Bioperl-l] Difference between
> 
> 
> I would like to thank everyone for their responses.
> 
> And yes, Mat is right about this being an issue with the XML output of
> stand-alone blast. I tried comparing the results of just the stand-alone
> blast using different -F flags.  The results below show that if "-F F" is
> set the results are the same, but are different when using "-F T" for the
> XML output.
>
> So is there anything I could do to make the XML results the same when the
> filtering option is set to true?  Perhaps either through another blast
> parameter or by doing it programmatically?
> 
> -----------------------------------------------------------------------
> 
> blastall -p blastn -m 7 -F T -d ecoli/ecoli.nt -i test.txt
> 
> <Hit>
>           <Hit_num>1</Hit_num>
>           <Hit_id>gi|1786181|gb|AE000111.1|AE000111</Hit_id>
>           <Hit_def>Escherichia coli K-12 MG1655 section 1 of 400 of the complete genome</Hit_def>
>           <Hit_accession>AE000111</Hit_accession>
>           <Hit_len>10596</Hit_len>
>           <Hit_hsps>
>             <Hsp>
>               <Hsp_num>1</Hsp_num>
>               <Hsp_bit-score>589.253</Hsp_bit-score>
>               <Hsp_score>297</Hsp_score>
>               <Hsp_evalue>1.04898e-168</Hsp_evalue>
>               <Hsp_query-from>237</Hsp_query-from>
>               <Hsp_query-to>560</Hsp_query-to>
>               <Hsp_hit-from>237</Hsp_hit-from>
>               <Hsp_hit-to>560</Hsp_hit-to>
>               <Hsp_query-frame>1</Hsp_query-frame>
>               <Hsp_hit-frame>1</Hsp_hit-frame>
>               <Hsp_identity>324</Hsp_identity>
>               <Hsp_positive>324</Hsp_positive>
>               <Hsp_align-len>324</Hsp_align-len>
>               
> <Hsp_qseq>AGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACC
> TGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA
> GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAA
> GCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCAC
> CTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGC
> CGAACGTATTTTTGCCGAACTTTT</Hsp_qseq>
>               
> <Hsp_hseq>AGGTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACC
> TGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAA
> GTTCGGCGGTACATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAA
> GCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCAC
> CTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGC
> CGAACGTATTTTTGCCGAACTTTT</Hsp_hseq>
>               
> <Hsp_midline>|||||||||||||||||||||||||||||||||||||||||||||||||
> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> |||||||||||||||||||||||||||</Hsp_midline>
>             </Hsp>
> 
> --------------------------------------------------------------
> ---------
> 
> blastall -p blastn -m 0 -F T -d ecoli/ecoli.nt -i test.txt
> 
> >gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section 1 of 400 of the complete
>            genome
>           Length = 10596
> 
> Score =  589 bits (297), Expect = e-168
> Identities = 315/324 (97%)
> Strand = Plus / Plus
> 
> 
> Query: 237 aggtaacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtg 296
>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 237 aggtaacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtg 296
> 
> 
> Query: 297 cgggcnnnnnnnnncgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg 356
>            |||||         ||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 297 cgggctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg 356
> 
> 
> Query: 357 cggtacatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaa 416
>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 357 cggtacatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaa 416
> 
> 
> Query: 417 tgccaggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacct 476
>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 417 tgccaggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacct 476
> 
> 
> Query: 477 ggtggcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgc 536
>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 477 ggtggcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgc 536
> 
> 
> Query: 537 cgaacgtatttttgccgaactttt 560
>            ||||||||||||||||||||||||
> Sbjct: 537 cgaacgtatttttgccgaactttt 560
> 
> --------------------------------------------------------------
> ---------
> 
> blastall -p blastn -m 7 -F F -d ecoli/ecoli.nt -i test.txt
> 
> <Hit>
>           <Hit_num>1</Hit_num>
>           <Hit_id>gi|1786181|gb|AE000111.1|AE000111</Hit_id>
>           <Hit_def>Escherichia coli K-12 MG1655 section 1 of 400 of the complete genome</Hit_def>
>           <Hit_accession>AE000111</Hit_accession>
>           <Hit_len>10596</Hit_len>
>           <Hit_hsps>
>             <Hsp>
>               <Hsp_num>1</Hsp_num>
>               <Hsp_bit-score>1110.61</Hsp_bit-score>
>               <Hsp_score>560</Hsp_score>
>               <Hsp_evalue>0</Hsp_evalue>
>               <Hsp_query-from>1</Hsp_query-from>
>               <Hsp_query-to>560</Hsp_query-to>
>               <Hsp_hit-from>1</Hsp_hit-from>
>               <Hsp_hit-to>560</Hsp_hit-to>
>               <Hsp_query-frame>1</Hsp_query-frame>
>               <Hsp_hit-frame>1</Hsp_hit-frame>
>               <Hsp_identity>560</Hsp_identity>
>               <Hsp_positive>560</Hsp_positive>
>               <Hsp_align-len>560</Hsp_align-len>
>               
> <Hsp_qseq>AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAA
> AGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGA
> CTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGA
> GTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAG
> GTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG
> CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTAC
> ATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGC
> AGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATG
> ATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTT
> TGCCGAACTTTT</Hsp_qseq>
>               
> <Hsp_hseq>AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAA
> AGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTTTATTGA
> CTTAGGTCACTAAATACTTTAACCAATATAGGCATAGCGCACAGACAGATAAAAATTACAGA
> GTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAG
> GTAACGGTGCGGGCTGACGCGTACAGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGG
> CTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTAC
> ATCAGTGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGC
> AGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCACCAACCACCTGGTGGCGATG
> ATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTT
> TGCCGAACTTTT</Hsp_hseq>
>               
> <Hsp_midline>|||||||||||||||||||||||||||||||||||||||||||||||||
> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> |||||||||||||||</Hsp_midline>
>             </Hsp>
> 
> --------------------------------------------------------------
> ---------
> 
> blastall -p blastn -m 0 -F F -d ecoli/ecoli.nt -i test.txt
> 
> >gb|AE000111.1|AE000111 Escherichia coli K-12 MG1655 section 1 of 400 of the complete
>            genome
>           Length = 10596
> 
> Score = 1110 bits (560), Expect = 0.0
> Identities = 560/560 (100%)
> Strand = Plus / Plus
> 
> 
> Query: 1   agcttttcattctgactgcaacgggcaatatgtctctgtgtggattaaaaaaagagtgtc 60
>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 1   agcttttcattctgactgcaacgggcaatatgtctctgtgtggattaaaaaaagagtgtc 60
> 
> 
> Query: 61  tgatagcagcttctgaactggttacctgccgtgagtaaattaaaattttattgacttagg 120
>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 61  tgatagcagcttctgaactggttacctgccgtgagtaaattaaaattttattgacttagg 120
> 
> 
> Query: 121 tcactaaatactttaaccaatataggcatagcgcacagacagataaaaattacagagtac 180
>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 121 tcactaaatactttaaccaatataggcatagcgcacagacagataaaaattacagagtac 180
> 
> 
> Query: 181 acaacatccatgaaacgcattagcaccaccattaccaccaccatcaccattaccacaggt 240
>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 181 acaacatccatgaaacgcattagcaccaccattaccaccaccatcaccattaccacaggt 240
> 
> 
> Query: 241 aacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtgcggg 300
>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 241 aacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtgcggg 300
> 
> 
> Query: 301 ctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcggcggt 360
>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 301 ctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcggcggt 360
> 
> 
> Query: 361 acatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaatgcc 420
>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 361 acatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaatgcc 420
> 
> 
> Query: 421 aggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacctggtg 480
>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 421 aggcaggggcaggtggccaccgtcctctctgcccccgccaaaatcaccaaccacctggtg 480
> 
> 
> Query: 481 gcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgccgaa 540
>            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct: 481 gcgatgattgaaaaaaccattagcggccaggatgctttacccaatatcagcgatgccgaa 540
> 
> 
> Query: 541 cgtatttttgccgaactttt 560
>            ||||||||||||||||||||
> Sbjct: 541 cgtatttttgccgaactttt 560
> 
> 
> >From: "Wiepert, Mathieu" <Wiepert.Mathieu@mayo.edu>
> >To: 'tai kwan do' <immunoguest@hotmail.com>,    bioperl-l@bioperl.org
> >Subject: RE: [Bioperl-l] Difference between Date: Fri, 30 
> Jan 2004 11:13:05 
> >-0600
> >
> >Hi,
> >
> >I have a vague recollection of this problem, so this answer 
> is likely 
> >wrong, but I think it has something to do with the filtered 
> sequence?  You 
> >have 9 masked NT's, so it is probably a difference in the 
> defaults, and 
> >something to do with the XML output not masked?
> >
> >Sorry I can't find the emails I had with NCBI on this, but I 
> am maybe 70% 
> >sure that it is a problem like that, with defaults on the 
> local server 
> >versus NCBI, and the XML not using masked data?
> >
> >Someone else chime in if I am way off there...
> >
> >HTH,
> >
> >-mat
> >
> 
From jason at cgt.duhs.duke.edu  Tue Feb  3 10:45:55 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Tue Feb  3 10:53:13 2004
Subject: [Bioperl-l] Difference between
In-Reply-To: <2F41CC6C9777D311ACBD009027B108EA06E9AFD5@excsrv32.mayo.edu>
References: <2F41CC6C9777D311ACBD009027B108EA06E9AFD5@excsrv32.mayo.edu>
Message-ID: <Pine.LNX.4.50.0402031045060.2090-100000@tenero.duhs.duke.edu>

One also gets slightly different values at times between -m 8 and -m 0 runs.
-jason
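
On the "doing it programmatically" part of the question, here is a minimal sketch rather than a tested fix. In the reports posted in this thread, the XML output counts 324 identities and the text output counts 315, and the difference equals the nine masked query bases. Assuming a masked copy of the query is available (the XML Hsp_qseq is not masked), the identities can be recounted so that a masked position never matches. The sub name count_identities is made up for this example and is not part of BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count identical positions between two aligned strings of equal length.
# A masked query base ('n' or 'N') is never counted as an identity.
sub count_identities {
    my ($query, $hit) = @_;
    die "aligned strings differ in length\n"
        if length($query) != length($hit);

    my $identities = 0;
    for my $i (0 .. length($query) - 1) {
        my $q = lc substr($query, $i, 1);
        my $h = lc substr($hit,   $i, 1);
        next if $q eq 'n';              # position masked by the -F T filter
        $identities++ if $q eq $h;
    }
    return $identities;
}

# Query/Sbjct rows 297-356 copied from the "-m 0 -F T" report in this thread.
my $masked_query = 'cgggcnnnnnnnnncgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg';
my $hit          = 'cgggctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcgg';

printf "%d/%d identities in this row\n",
    count_identities($masked_query, $hit), length($hit);
```

Summing this count over all rows of an HSP should reproduce the text report's figure, provided the masked query really matches what blastall used.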

On Tue, 3 Feb 2004, Wiepert, Mathieu wrote:

> Hi,
>
> When I reported this, I was told that it was actually a minor bug, and they would look into it.  It didn't sound like something they were going to address any time soon, and I never followed up, so guess it is still the same issue...
>
> -mat
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From brian_osborne at cognia.com  Tue Feb  3 16:18:41 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Tue Feb  3 16:25:46 2004
Subject: [Bioperl-l] Uniprot
In-Reply-To: <200402031449.03089.Sebastien.Moretti@igs.cnrs-mrs.fr>
Message-ID: <GAEDKMGOKFBLJPKCLKCCCEBBDHAA.brian_osborne@cognia.com>

Sebastien,

Unfortunately no, there's no way to do a query of UniProt with Bioperl
currently. You're looking for some data that's neither in Swissprot nor in
PIR?

Brian O.

-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Sebastien Moretti
Sent: Tuesday, February 03, 2004 8:49 AM
To: bioperl-l@bioperl.org
Subject: [Bioperl-l] Uniprot

Hello
Is there a BioPerl module to send a request to the UniProt db?
My script sends a request to Swissprot:

        #!/usr/bin/perl

        use strict;
        use warnings;
        use Bio::DB::SwissProt;
        use Bio::SeqIO;

        # Accession number given on the command line
        my $acc = $ARGV[0];

        my $gb  = Bio::DB::SwissProt->new;
        my $seq = $gb->get_Seq_by_acc($acc);

        # With no -file or -fh argument, Bio::SeqIO writes to STDOUT
        my $out = Bio::SeqIO->new(-format => 'swiss');
        $out->write_seq($seq);

But how can I do the same with UniProt?
Thanks

--
Sebastien MORETTI
CNRS - IGS
31 chemin Joseph Aiguier
13402 Marseille cedex 20, FRANCE
tel. 04 91 16 44 55 - 06 61 88 59 00


From jason at cgt.duhs.duke.edu  Tue Feb  3 16:28:24 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Tue Feb  3 16:34:39 2004
Subject: [Bioperl-l] Bio::Species / Bio::Taxonomy::Node
In-Reply-To: <9DBC0AB8-5425-11D8-AAD0-000A959EB4C4@gmx.net>
References: <9DBC0AB8-5425-11D8-AAD0-000A959EB4C4@gmx.net>
Message-ID: <Pine.LNX.4.50.0402010953540.7781-100000@tenero.duhs.duke.edu>

We can start making things create Taxonomy::Node objects - I know there
is code floating out there which does
if( $sp->isa('Bio::Species') ) { }

so presumably we could make Bio::Species an interface such that Taxonomy::Node
isa Bio::Species...?  I don't want to confuse people either.
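
To illustrate why such isa checks would keep working, here is a small sketch using stand-in classes. My::Species and My::TaxonomyNode are hypothetical names that only mimic the relationship discussed here; they are not the real Bio::Species or Bio::Taxonomy::Node code, and the parent id is a placeholder.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Species.
package My::Species;
sub new      { my ($class, %args) = @_; return bless {%args}, $class }
sub binomial { my $self = shift; return $self->{binomial} }

# Hypothetical stand-in for Bio::Taxonomy::Node.
package My::TaxonomyNode;
our @ISA = ('My::Species');    # the node class inherits from the species class
sub parent_id { my $self = shift; return $self->{parent_id} }

package main;

my $node = My::TaxonomyNode->new(
    binomial  => 'Caenorhabditis elegans',
    parent_id => 42,           # placeholder, not a real taxon id
);

# Caller code of the form  if ( $sp->isa('...Species') ) { ... }
# still succeeds when it is handed the node object.
if ( $node->isa('My::Species') ) {
    print $node->binomial, " passes the isa('My::Species') check\n";
}
```

The trade-off is that the node class then has to answer every species-level method, including the ones that only make sense at the tips of the taxonomy.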

There may still be a little more functionality that is needed in the
Taxonomy::Node objects and in the db - specifically how to deal with
some of the methods which are really specific to the species level of
the taxonomy (tips) such as the classification/binomial/etc. methods.

-jason

On Sat, 31 Jan 2004, Hilmar Lapp wrote:

> Very cool Jason!!
>
> Now we can start hooking this into bioperl-db.
>
> And what about porting the SeqIO parsers, the target being to be able
> to deprecate Bio::Species altogether? Alternatively, change the
> SeqI/RichSeqI implementations to silently convert a Bio::Species
> instance on set to a Bio::Taxonomy::Node instance?
>
> 	-hilmar
>
> On Friday, January 30, 2004, at 02:07  PM, Jason Stajich wrote:
>
> > I think I've finally committed code which will allow Bio::Taxonomy::Node
> > to act like Bio::Species while supporting the notion of being a node in
> > a taxonomy hierarchy.  Added tests in t/Species.t to this effect.
> >
> > For Bio::DB::Taxonomy::flatfile I've added indexing by parent id, so it
> > is quite fast to grab all the children of a given node.  You can now
> > walk up and down the classification system.  Practically speaking, this
> > means you can get all the taxon ids of species in the same genus with a
> > few simple lines like those below.
> >
> > Unfortunately the NCBI taxonomy API that is part of E-Utils doesn't
> > quite provide the information we need, so the whole API can't be used
> > without downloading the taxonomy db locally.
> >
> > nodesfile and namesfile are the files from the NCBI taxdump; see
> > Bio::DB::Taxonomy::flatfile for more info.
> >
> > #!/usr/bin/perl
> > use strict;
> > use warnings;
> >
> > use Bio::DB::Taxonomy;
> > my $db = Bio::DB::Taxonomy->new
> >     (-source => 'flatfile',
> >      -nodesfile=> '/home/jason/taxonomy/nodes.dmp',
> >      -namesfile=> '/home/jason/taxonomy/names.dmp');
> >
> > my $node = $db->get_Taxonomy_Node(-name => 'Caenorhabditis elegans');
> >
> > my $parent = $node->get_Parent_Node();
> > for my $n ( $parent->get_Children_Nodes() ) {
> >     print $n->binomial, "\t", $n->ncbi_taxid,"\n";
> > }
> >
> > Someday I'll get around to making a HowTO unless someone else wants to
> > do
> > it... =)
> >
> > -jason
> > --
> > Jason Stajich
> > Duke University
> > jason at cgt.mc.duke.edu
> >
> >
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From brian_osborne at cognia.com  Tue Feb  3 16:55:25 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Tue Feb  3 17:02:37 2004
Subject: [Bioperl-l] Bio::Species / Bio::Taxonomy::Node
In-Reply-To: <Pine.LNX.4.50.0402010953540.7781-100000@tenero.duhs.duke.edu>
Message-ID: <GAEDKMGOKFBLJPKCLKCCKEBCDHAA.brian_osborne@cognia.com>

Jason,

So you'd automatically create the Node object without knowing if the
underlying names and nodes files are present? I agree with you, that could
be confusing.

Perhaps test for the existence of an environment variable that specifies the
directory containing these indexed files?
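
A rough sketch of what such a check could look like follows. The variable name BIOPERL_TAXONOMY_DIR and the sub taxonomy_files are invented for this example and are not an existing BioPerl convention; the file names nodes.dmp and names.dmp are the taxdump files mentioned in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Return the nodes/names file paths if the directory named by the
# environment variable is usable, or an empty list otherwise.
# BIOPERL_TAXONOMY_DIR is a made-up variable name for this sketch.
sub taxonomy_files {
    my $dir = $ENV{BIOPERL_TAXONOMY_DIR};
    return unless defined $dir && -d $dir;

    my $nodes = File::Spec->catfile($dir, 'nodes.dmp');
    my $names = File::Spec->catfile($dir, 'names.dmp');
    return unless -r $nodes && -r $names;

    return ($nodes, $names);
}

if ( my ($nodes, $names) = taxonomy_files() ) {
    print "taxonomy files found: $nodes, $names\n";
    # Only at this point would it be safe to build a taxonomy-backed node.
}
else {
    print "no local taxonomy dump; fall back to a plain species object\n";
}
```

Falling back quietly when the files are missing keeps existing scripts working on machines that never downloaded the taxonomy dump.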

Brian O.



From liam at mmb.usyd.edu.au  Tue Feb  3 20:56:02 2004
From: liam at mmb.usyd.edu.au (Liam Elbourne)
Date: Tue Feb  3 21:02:55 2004
Subject: [Bioperl-l] genome analysis
Message-ID: <a05210600bc45fe229ae9@[129.78.152.97]>

I'm looking for the quickest way to write a complete GenBank 
entry (i.e. with all annotation and features) from a microbial genome 
entry, using the start and end of the area of interest. In particular 
I want to 'restart' the nucleotide positions, so that the beginning 
becomes position one in my created GenBank entry, and the end becomes 
the original end minus the original start. I can see how to do this 
by loading the whole genome into a Bio::DB::GenBank object and 
iterating through it etc., but there must be a better way...
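
The renumbering itself can be sketched in plain Perl, independent of how the entry is loaded. This is only an illustration under stated assumptions: features are plain hashes rather than BioPerl feature objects, the subs rebase and extract_features are made up for the example, and the feature coordinates are invented. With 1-based positions the new end works out to the original end minus the original start, plus one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shift a 1-based genome coordinate so that $region_start becomes position 1.
sub rebase {
    my ($pos, $region_start) = @_;
    return $pos - $region_start + 1;
}

# Keep only features lying entirely inside [$start, $end] and renumber them.
# Features are plain hashes here, not BioPerl feature objects.
sub extract_features {
    my ($features, $start, $end) = @_;
    my @kept;
    for my $f (@$features) {
        next if $f->{start} < $start || $f->{end} > $end;
        push @kept, {
            %$f,
            start => rebase($f->{start}, $start),
            end   => rebase($f->{end},   $start),
        };
    }
    return \@kept;
}

# Made-up features, for illustration only.
my @features = (
    { name => 'geneA', start => 1200, end => 1800 },
    { name => 'geneB', start => 2500, end => 3100 },
    { name => 'geneC', start => 4900, end => 5600 },   # runs past the region
);

my ($start, $end) = (1000, 5000);
my $region = extract_features(\@features, $start, $end);

printf "region length: %d\n", rebase($end, $start);
printf "%s\t%d..%d\n", @{$_}{qw(name start end)} for @$region;
```

Features that straddle the region boundary are simply dropped in this sketch; truncating them instead is a design choice that depends on what the new entry is for.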

I am new to Bioperl, so if this is the wrong list for this question, a 
gentle nudge in the right direction would be appreciated. The answer 
to my question above would also be appreciated!


Regards,
Liam Elbourne

From ew9 at york.ac.uk  Wed Feb  4 05:11:29 2004
From: ew9 at york.ac.uk (Elizabeth Williams)
Date: Wed Feb  4 05:17:40 2004
Subject: [Bioperl-l] problem with neighbor.pm
Message-ID: <6.0.1.1.0.20040204100745.024d07a0@ew9.imap.york.ac.uk>

Hello,

I am trying to run the phylip modules on a set of Bio::Seq sequences.  I 
have run into a problem with Neighbor.pm.  The module runs the program but 
then loses the tree somehow and comes up with this error message.

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: neighbor did not create tree correctly (expected /tmp/lHCvy7ByeN/treefile)
STACK: Error::throw
STACK: Bio::Root::Root::throw 
/biol/programs/perl580/lib/site_perl/5.8.0/Bio/Root/Root.pm:328
STACK: Bio::Tools::Run::Phylo::Phylip::Neighbor::_run 
/biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm:412
STACK: Bio::Tools::Run::Phylo::Phylip::Neighbor::run 
/biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm:353
STACK: Bio::Tools::Run::Phylo::Phylip::Neighbor::create_tree 
/biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm:370
STACK: geneorigin.pl:74
-----------------------------------------------------------

The script I am using is below.
Anyone have any ideas what is causing the problem?  I am at a loss.

use Bio::DB::GenPept;
use Bio::Tools::Run::Alignment::Clustalw;
use Bio::Tools::Run::Phylo::Phylip::ProtDist;
use Bio::Tools::Run::Phylo::Phylip::Neighbor;

#use strict;
use Bio::SeqIO;
use Bio::Seq;
use Bio::AlignIO;
use Bio::SimpleAlign;

$ENV{PHYLIPDIR} = '/biol/programs/phylip/exe';
.
.
.
.
.
.
my @params_align = ('ktuple' => 2, 'matrix' => 'BLOSUM');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params_align);
# @seq_array is an array of Bio::Seq objects created earlier
my $seq_array_ref = \@seq_array;
my $aln = $factory->align($seq_array_ref);

my @params_protdist = ('MODEL' => 'PAM');
my $protdist_factory = Bio::Tools::Run::Phylo::Phylip::ProtDist->new(@params_protdist);
my $matrix = $protdist_factory->run($aln);

my @params_neighbor = ('type'=>'NJ');
my $neighborfactory = Bio::Tools::Run::Phylo::Phylip::Neighbor->new(@params_neighbor);
my $tree = $neighborfactory->create_tree($matrix);


Elizabeth J.B. Williams

From Eric.Jain at isb-sib.ch  Wed Feb  4 05:30:41 2004
From: Eric.Jain at isb-sib.ch (Eric Jain)
Date: Wed Feb  4 05:36:50 2004
Subject: [Bioperl-l] Re: Uniprot
References: <200402031449.03089.Sebastien.Moretti@igs.cnrs-mrs.fr>
	<GAEDKMGOKFBLJPKCLKCCCEBBDHAA.brian_osborne@cognia.com>
Message-ID: <005d01c3eb09$f08d7020$c300000a@caliente>

> Is there a BioPerl module to send a request to UniProt db ?
> My script send a request to Swissprot :

There is:

UniProt. Identical to Swiss-Prot and TrEMBL. You should be able to use
whatever tools you have been using so far to work with these two
databases. Distributed as two separate files.

UniRef. UniProt clusters. Available at three different levels of
sequence similarity. No BioPerl module available yet, as far as I know.

UniParc. Sequence archive. Doesn't really exist yet.

All these three together are also referred to as 'UniProt', in which
case 'UniProt UniProt' is called 'UniProt Knowledgebase'.

Anybody else who finds this confusing, raise your hand now...

--
Eric Jain

From jason at cgt.duhs.duke.edu  Wed Feb  4 08:23:47 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb  4 08:30:09 2004
Subject: [Bioperl-l] problem with neighbor.pm
In-Reply-To: <6.0.1.1.0.20040204100745.024d07a0@ew9.imap.york.ac.uk>
References: <6.0.1.1.0.20040204100745.024d07a0@ew9.imap.york.ac.uk>
Message-ID: <Pine.LNX.4.50.0402040823160.9364-100000@tenero.duhs.duke.edu>

phylip 3.5 or 3.6? -- you may need to twiddle one setting if you are using
phylip 3.6


-jason

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From michael.watson at bbsrc.ac.uk  Wed Feb  4 08:26:07 2004
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Wed Feb  4 08:35:20 2004
Subject: [Bioperl-l] Blast Images
Message-ID: <20B7EB075F2D4542AFFAF813E98ACD9302822622@cl-exsrv1.irad.bbsrc.ac.uk>

Hi

Does anything exist within Bioperl, or otherwise, to take a Blast output (or Search object) and produce an image showing the location of the hits on the query sequence?  (much like the NCBI have on their blast pages)

Thanks
Mick

From jason at cgt.duhs.duke.edu  Wed Feb  4 08:39:56 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb  4 08:46:14 2004
Subject: [Bioperl-l] problem with neighbor.pm
In-Reply-To: <6.0.1.1.0.20040204132900.0252a810@ew9.imap.york.ac.uk>
References: <6.0.1.1.0.20040204100745.024d07a0@ew9.imap.york.ac.uk>
	<Pine.LNX.4.50.0402040823160.9364-100000@tenero.duhs.duke.edu>
	<6.0.1.1.0.20040204132900.0252a810@ew9.imap.york.ac.uk>
Message-ID: <Pine.LNX.4.50.0402040835570.9364-100000@tenero.duhs.duke.edu>

Either set the environment variable PHYLIPVERSION in your shell, or set it
at the top of your script:
BEGIN { $ENV{PHYLIPVERSION} = '3.6'; }
(in a BEGIN block before any use ... statements, so that it is already set
when the modules are loaded)

Or, less ideally, set it on each Phylip factory object you create, i.e.:
 $protdist_factory->version('3.6');
 $neighborfactory->version('3.6');
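
On the "before any use statements" point: use lines are processed at compile
time, so the assignment has to happen at compile time too, which is what a
BEGIN block gives you.  A minimal sketch in plain Perl, where a second BEGIN
block stands in for a module looking at the variable as it loads (that the
Phylip wrappers read it at load time is an assumption on my part):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# BEGIN blocks run at compile time, in the order they appear in the file,
# so this is set before anything below it is loaded.
BEGIN { $ENV{PHYLIPVERSION} = '3.6' }

# A real script would load the wrappers here, e.g.
#   use Bio::Tools::Run::Phylo::Phylip::ProtDist;
#   use Bio::Tools::Run::Phylo::Phylip::Neighbor;
# This stand-in BEGIN block reads the variable at load time the way a
# module might (an assumption about the wrappers' internals).
my $seen_at_load;
BEGIN { $seen_at_load = $ENV{PHYLIPVERSION} }

print "version seen at load time: $seen_at_load\n";
```

Without the first BEGIN, a plain assignment in the same spot would only run
after all the use lines had already been processed.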

-jason
On Wed, 4 Feb 2004, Elizabeth Williams wrote:

> i am using 3.6.  Which setting needs to be twiddled?
>
> At 13:23 04/02/2004, you wrote:
> >phylip 3.5 or 3.6? -- you may need to twiddle one setting if you are using
> >phylip 3.6
> >
> >
> >-jason
>
> Elizabeth J.B. Williams
> CNAP
> Department of Biology
> University of York
> York
> YO10 5YW
> mobile: 07813149274
> work: 01904 328757
> Fax: 01904 328762
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From jason at cgt.duhs.duke.edu  Wed Feb  4 08:41:56 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb  4 08:48:08 2004
Subject: [Bioperl-l] Blast Images
In-Reply-To: <20B7EB075F2D4542AFFAF813E98ACD9302822622@cl-exsrv1.irad.bbsrc.ac.uk>
References: <20B7EB075F2D4542AFFAF813E98ACD9302822622@cl-exsrv1.irad.bbsrc.ac.uk>
Message-ID: <Pine.LNX.4.50.0402040840020.9364-100000@tenero.duhs.duke.edu>

In scripts/graphics/search_overview.PLS

You may have to tweak it some to inject that into a custom
CGI page.  The script should be a starting place, not the be-all and end-all.
With a little more tweaking you can take advantage of Todd's SVG output
instead of PNG if that floats yer boat.

-jason

On Wed, 4 Feb 2004, michael watson (IAH-C) wrote:

> Hi
>
> Does anything exist within Bioperl, or otherwise, to take a Blast output (or Search object) and produce an image showing the location of the hits on the query sequence?  (much like the NCBI have on their blast pages)
>
> Thanks
> Mick
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From brian_osborne at cognia.com  Wed Feb  4 08:48:12 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Wed Feb  4 08:54:30 2004
Subject: [Bioperl-l] Blast Images
In-Reply-To: <20B7EB075F2D4542AFFAF813E98ACD9302822622@cl-exsrv1.irad.bbsrc.ac.uk>
Message-ID: <GAEDKMGOKFBLJPKCLKCCEEBIDHAA.brian_osborne@cognia.com>

Mick,

Bio::Graphics. Take a look at the Graphics HOWTO
(http://bioperl.org/HOWTOs/html/Graphics-HOWTO.html).

Brian O.

-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of michael watson
(IAH-C)
Sent: Wednesday, February 04, 2004 8:26 AM
To: bioperl-l@bioperl.org
Subject: [Bioperl-l] Blast Images

Hi

Does anything exist within Bioperl, or otherwise, to take a Blast output (or
Search object) and produce an image showing the location of the hits on the
query sequence?  (much like the NCBI have on their blast pages)

Thanks
Mick


From ecky.l at gmx.de  Wed Feb  4 15:33:07 2004
From: ecky.l at gmx.de (Eckhard Lehmann)
Date: Wed Feb  4 15:39:18 2004
Subject: [Bioperl-l] Blast Images
References: <20B7EB075F2D4542AFFAF813E98ACD9302822622@cl-exsrv1.irad.bbsrc.ac.uk>
Message-ID: <10718.1075926787@www29.gmx.net>

Hi,

> Does anything exist within Bioperl, or otherwise, to take a Blast output
> (or Search object) and produce an image showing the location of the hits
on
> the query sequence?  (much like the NCBI have on their blast pages)

Bioperl-Tk does a good job if you want to have it outside any webpage and
inside a Perl-Tk widget. The package to consider is Bio::Tk::HitDisplay.

I wrote a blastviewer in Tcl/Tk that does almost the same (and shows a bit
more than Bio::Tk::HitDisplay), but that one comes with its own self written
BLAST parser which is not so effective as the one in BioPerl and may be a bit
buggy... but nevertheless it works and the parser is extensible ;-).

Eckhard ;)


From Sebastien.Moretti at igs.cnrs-mrs.fr  Wed Feb  4 05:11:59 2004
From: Sebastien.Moretti at igs.cnrs-mrs.fr (Sebastien Moretti)
Date: Wed Feb  4 22:26:16 2004
Subject: [Bioperl-l] Uniprot
In-Reply-To: <GAEDKMGOKFBLJPKCLKCCCEBBDHAA.brian_osborne@cognia.com>
References: <GAEDKMGOKFBLJPKCLKCCCEBBDHAA.brian_osborne@cognia.com>
Message-ID: <200402041111.59773.Sebastien.Moretti@igs.cnrs-mrs.fr>

Hello
I modified Bio/DB/SwissProt.pm to be able to send a request to UniProt at 
EBI.
I attach the UniProt.pm file. Put it next to the SwissProt.pm file.
I hope that it has no bugs.

-- 
Sebastien MORETTI
CNRS - IGS
31 chemin Joseph Aiguier
13402 Marseille cedex 20, FRANCE
tel. 04 91 16 44 55 - 06 61 88 59 00
-------------- next part --------------
A non-text attachment was scrubbed...
Name: UniProt.pm
Type: text/x-perl
Size: 11976 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20040204/5b484e57/UniProt.bin
From jason at cgt.duhs.duke.edu  Wed Feb  4 22:57:58 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb  4 23:04:40 2004
Subject: [Bioperl-l] Uniprot
In-Reply-To: <200402041111.59773.Sebastien.Moretti@igs.cnrs-mrs.fr>
References: <GAEDKMGOKFBLJPKCLKCCCEBBDHAA.brian_osborne@cognia.com>
	<200402041111.59773.Sebastien.Moretti@igs.cnrs-mrs.fr>
Message-ID: <Pine.LNX.4.50.0402042252360.14385-100000@tenero.duhs.duke.edu>

Thanks Sebastien --

Since the change is only:
  'db' => 'swall' to
  'db' => 'uniprot'

We might try and fix this directly in SwissProt.pm without having to
create a whole new module.

In the meantime, the somewhat long-winded way to do this is to add this line to your script:
$Bio::DB::SwissProt::HOSTS{'ebi'}->{'basevars'}->{'db'} = 'uniprot';


-jason

On Wed, 4 Feb 2004, Sebastien Moretti wrote:

> Hello
> I modify the Bio/DB/SwissProt.pm to be able to send a request to UniProt at
> EBI.
> I attach the UniProt.pm file. Set it near the SwissProt.pm file.
> I hope that it hasn't bugs.
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From heikki at ebi.ac.uk  Thu Feb  5 05:34:47 2004
From: heikki at ebi.ac.uk (Heikki Lehvaslaiho)
Date: Thu Feb  5 05:41:03 2004
Subject: [Bioperl-l] Uniprot
In-Reply-To: <200402031449.03089.Sebastien.Moretti@igs.cnrs-mrs.fr>
References: <20040203130426.21627.qmail@web25209.mail.ukl.yahoo.com>
	<200402031449.03089.Sebastien.Moretti@igs.cnrs-mrs.fr>
Message-ID: <200402051034.48276.heikki@ebi.ac.uk>

For the time being the old Swiss-Prot and Uniprot are identical at data level.
Uniprot is a political development integrating PIR. 

>From README:
"The UniProt Knowledgebase has been created from Swiss-Prot, TrEMBL and 
PIR-PSD.
It consists of two parts, one containing fully manually annotated records
and another one with computationally analysed records awaiting full manual
annotation. The two sections will be referred to as the Swiss-Prot 
Knowledgebase and TrEMBL Protein Database, respectively. PIR-PSD release 48.0 
of 28-Oct-2003 has been fully integrated into these sections. This was the 
last release of PIR-PSD. "

	-Heikki

On Tuesday 03 Feb 2004 13:49, Sebastien Moretti wrote:
> Hello
> Is there a BioPerl module to send a request to UniProt db ?
> My script send a request to Swissprot :
>
> 	#!/usr/bin/perl
>
> 	use strict;
> 	use Bio::DB::SwissProt;
> 	use Bio::SeqIO;
> 	my $acc=$ARGV[0];
>
> 	my $gb = new Bio::DB::SwissProt;
> 	my $stream = $gb->get_Seq_by_acc($acc);
>
> 	my $out=Bio::SeqIO->new(-format=>'swiss');
>
> 	my $result=$out->write_seq($stream);
> 	$result =~ s/^1.*$//;
> 	print $result;
>
> 	exit;
>
> But how can I do the same with UniProt ?
> Thanks

-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho    heikki_at_ebi ac uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________

From billthebrute at yahoo.fr  Thu Feb  5 09:54:07 2004
From: billthebrute at yahoo.fr (=?iso-8859-1?q?william=20ritchie?=)
Date: Thu Feb  5 10:00:15 2004
Subject: [Bioperl-l] mouse genome
Message-ID: <20040205145407.66829.qmail@web25208.mail.ukl.yahoo.com>

Hi, I would like to BLAST against the mouse genome at NCBI
through a RemoteBlast request, but I don't know the code
for the "mouse genome database"! Could you help me out?
Cheers!


From sdavis2 at mail.nih.gov  Thu Feb  5 10:02:49 2004
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu Feb  5 10:16:07 2004
Subject: [Bioperl-l] Parsing blast hits
Message-ID: <BC47C549.4458%sdavis2@mail.nih.gov>

Like many who post here, I am new to bioperl.  I have a list of several thousand
queries (microarray oligos) and the resulting blast hits to mRNAs.  I would
like to determine which of the hits for each query are to the same "gene";
in other words, I want to find query sequences with mappings to only one
gene.  I am familiar with blasting and the technicalities of the blast
parsers, but I can't think how to tackle the bigger problem.  Do I need to
query the resulting hits and store the genes that they encode for each hit
and just make sure they are the same, or is there something more clever?
Any suggestions?

Thanks,
Sean


From reveritt at ucalgary.ca  Thu Feb  5 15:51:13 2004
From: reveritt at ucalgary.ca (reveritt@ucalgary.ca)
Date: Thu Feb  5 15:57:22 2004
Subject: [Bioperl-l] Fwd: Error running makefile
Message-ID: <200402052051.i15KpDc23210@mhost2.ucalgary.ca>

Forwarded From: reveritt@ucalgary.ca

> Hello,
> 
> I am trying to install BioPerl on a Windows NT workstation that is running 
> Perl 5.6 with all the necessary modules (ie IO::String).  When I run the 
> makefile I get the following error message:
>  
> Please inform the author.
> Could not open 'Bio/Root/Version.pm': No such file or directory at (eval 49) 
> line 6.
> 
> Do you know how to fix this problem or is there documentation I should read?
> 
> Thanks,
> Rebecca Everitt


From clangin at siu.edu  Thu Feb  5 21:40:27 2004
From: clangin at siu.edu (Chet Langin)
Date: Thu Feb  5 21:23:17 2004
Subject: [Bioperl-l] GD test 10 fails
Message-ID: <4022FE9B.5030406@siu.edu>


While installing GD, test 10 failed, thus halting
the install from CPAN.

I installed the latest zlib, libgd, PNG, JPEG,
and FreeType libraries, and it still failed.

It looked like test 10 might be converting between
JPEG and PNG formats.

The only strange output during the make was a warning
about /usr/local/include being a system directory
when it was a non-system directory and that the
search order was changed.

I went ahead and forced install.  But, I was
wondering if this might cause me further trouble
down the road.

,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
<http://mypage.siu.edu/clangin> <clangin@siu.edu>
~~~Diagonally parked in a parallel universe ~~~~~


From jason at cgt.duhs.duke.edu  Thu Feb  5 21:29:54 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Thu Feb  5 21:36:12 2004
Subject: [Bioperl-l] GD test 10 fails
In-Reply-To: <4022FE9B.5030406@siu.edu>
References: <4022FE9B.5030406@siu.edu>
Message-ID: <Pine.LNX.4.50.0402052125290.21873-100000@tenero.duhs.duke.edu>

Only if you want to use Bio::Graphics.

Are you sure the libgd version on your system matches the
requirements of the version of GD.pm you are installing?

-jason
On Thu, 5 Feb 2004, Chet Langin wrote:

>
> While installing GD, test 10 failed, thus halting
> the install from CPAN.
>
> I installed the latest zlib, libgd, PNG, JPEG,
> and FreeType libraries, and it still failed.
>
> It looked like test 10 might be converting between
> JPEG and PNG formats.
>
> The only strange output during the make was a warning
> about /usr/local/include being a system directory
> when it was a non-system directory and that the
> search order was changed.
>
> I went ahead and forced install.  But, I was
> wondering if this might cause me further trouble
> down the road.
>
> ,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
> <http://mypage.siu.edu/clangin> <clangin@siu.edu>
> ~~~Diagonally parked in a parallel universe ~~~~~
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From tdhoufek at unity.ncsu.edu  Thu Feb  5 21:49:52 2004
From: tdhoufek at unity.ncsu.edu (T.D. Houfek)
Date: Thu Feb  5 21:56:37 2004
Subject: [Bioperl-l] oligos, mRNAs, and genes
In-Reply-To: <200402051619.i15GJEHH004118@portal.open-bio.org>
References: <200402051619.i15GJEHH004118@portal.open-bio.org>
Message-ID: <1076035792.10041.72.camel@aether>

There are a lot of people around here a lot more qualified to answer,
and I hope someone will correct me if I misinform you (or
if I've misunderstood your question).  

If you're dealing with a eukaryote, I think the method you are hinting
at, effectively tallying which mRNAs were uniquely matched by your
oligos, could run into problems with alternatively spliced genes, where
there's not a 1:1 relationship between gene and mRNA product.  I'm not
sure what the incidence of such genes typically is; I think it is just a
few percent of genes.  This shouldn't prevent you from finding "query
sequences with mappings to only one gene", and it certainly won't keep
you from sampling alternatively spliced products, but there might be a
few cases where one query oligo matches more than one mRNA and those
mRNAs all turn out to be transcripts of the same gene, so hits need to
be compared by gene rather than by mRNA.

If your mRNA's correlated genes are already well characterized in one of
the major databases / formats, you should be able to use
BioPerl to explore the relations between genes and transcripts, but
is that your situation, or are these transcripts of yours somewhat less
well contextually situated?
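
To make the tallying idea concrete, here is a rough sketch in plain Perl.
The oligo, mRNA and gene names are invented, and the mRNA-to-gene lookup is
assumed to come from whatever annotation you have; nothing here is BioPerl
API:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical lookup: which gene each mRNA accession belongs to.
my %gene_of = (
    mRNA_A1 => 'geneA',    # two splice variants of the same gene
    mRNA_A2 => 'geneA',
    mRNA_B1 => 'geneB',
);

# Hypothetical BLAST results: query oligo => mRNAs it hit.
my %hits = (
    oligo1 => [ 'mRNA_A1', 'mRNA_A2' ],    # two mRNAs, one gene
    oligo2 => [ 'mRNA_A1', 'mRNA_B1' ],    # two different genes
    oligo3 => ['mRNA_B1'],
);

# Collapse each oligo's hits to the set of distinct genes.
my %genes_for;
for my $oligo ( keys %hits ) {
    my %seen;
    $seen{ $gene_of{$_} } = 1 for @{ $hits{$oligo} };
    $genes_for{$oligo} = [ sort keys %seen ];
}

# Keep only the oligos whose hits all fall in a single gene.
my @unique = sort grep { @{ $genes_for{$_} } == 1 } keys %genes_for;

print "$_ => $genes_for{$_}[0]\n" for @unique;
```

Comparing by gene rather than by mRNA is what lets oligo1 through even
though it hit two transcripts.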

TD


-- 

:.-----.----------.----------.-----.:
 T.D. Houfek
 tdhoufek-AT-unity-DOT-ncsu-DOT-edu
 Tobacco Genome Initiative
 NCSU, Raleigh, NC 27606
:.-----.----------.----------.-----.:


From tdhoufek at unity.ncsu.edu  Thu Feb  5 22:12:47 2004
From: tdhoufek at unity.ncsu.edu (T.D. Houfek)
Date: Thu Feb  5 22:19:35 2004
Subject: [Bioperl-l] Re: CVS hang problem
In-Reply-To: <1075989943.1474.7.camel@localhost.localdomain>
References: <200402030429.i134TWm2000680@uni03mr.unity.ncsu.edu>
	<1075941658.1362.18.camel@aether>
	<1075989943.1474.7.camel@localhost.localdomain>
Message-ID: <1076037156.14479.0.camel@aether>

For the (searchable) record, Scott was right.  I tried the CVS checkout
again today and got past the point where I was always stalling out
before, so it was just a matter of e-weather.

Thanks! 

TD

On Thu, 2004-02-05 at 09:05, Scott Cain wrote:
> TD,
> 
> There is one moderately big file in the dat directory (7M), so you may
> be running into bandwidth issues, either on your end or at SourceForge. 
> The anonymous cvs server there is notoriously overworked and it can be
> difficult to checkout large repositories.
> 
> Scott

-- 

:.-----.----------.----------.-----.:
 T.D. Houfek
 tdhoufek-AT-unity-DOT-ncsu-DOT-edu
 Tobacco Genome Initiative
 NCSU, Raleigh, NC 27606
:.-----.----------.----------.-----.:


From hcle028 at cse.unsw.edu.au  Thu Feb  5 22:53:43 2004
From: hcle028 at cse.unsw.edu.au (Hong Ching Lee)
Date: Thu Feb  5 22:59:58 2004
Subject: [Bioperl-l] Problems running blast
Message-ID: <Pine.LNX.4.58.0402061452001.29883@wagner.orchestra.cse.unsw.EDU.AU>

Hey everyone,

I have a question about whether I can run remote blast using just a string
or whether I have to make it into a fasta format file. Can anyone help me
with this?

Thank You,
Hong
From barry.moore at genetics.utah.edu  Fri Feb  6 01:02:52 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Fri Feb  6 01:09:05 2004
Subject: [Bioperl-l] Problems running blast
In-Reply-To: <Pine.LNX.4.58.0402061452001.29883@wagner.orchestra.cse.unsw.EDU.AU>
References: <Pine.LNX.4.58.0402061452001.29883@wagner.orchestra.cse.unsw.EDU.AU>
Message-ID: <40232E0C.50805@genetics.utah.edu>

Hong-

You don't have to make your sequence into a fasta file.  Have a look at 
the documentation for the submit_blast method of the 
Bio::Tools::Run::RemoteBlast module where it tells you that the input 
can be a sequence object, a reference to an array of sequence objects, 
or the filename of a fasta file.  If your script already has your 
sequence as any of the Bioperl sequence objects, then you are ready to 
go.  If your script has your sequence as a simple string, it is quite 
easy to convert that to a PrimarySeq object which you can then submit to 
BLAST.  The following script (adapted from the module documentation) 
suggests one way of converting a string to a PrimarySeq object and 
submitting it to BLAST.  See the example code in the Synopsis section of 
the RemoteBlast module documentation mentioned above for examples of how 
to submit a sequence object or a fasta file to BLAST.

Barry

----------------------------------------------------------------------------------------------

#!/usr/bin/perl
use strict;
use warnings;
use Bio::PrimarySeq;
use Bio::Tools::Run::RemoteBlast;

#Your sequence as a string
my $sequence_string =
  "atggagagcagaggcccactggctacctcgcgcctgctgctgttgctgctgttgctacta";

#Initialize string as new sequence
my $seq = Bio::PrimarySeq->new(-seq        => $sequence_string,
                               -display_id => "Your_favorite_gene");

#Build the BLAST factory
my $BLAST_factory = Bio::Tools::Run::RemoteBlast->new(
    '-prog'       => 'blastn',
    '-data'       => 'nr',
    '-expect'     => .001,
    '-readmethod' => 'SearchIO',
);
#Submit the sequence object to NCBI's BLAST server
my $job = $BLAST_factory->submit_blast($seq);
print STDERR "Blasting sequence ";
#Load the RIDs returned for the BLAST job submitted (in this case only one)
while ( my @rids = $BLAST_factory->each_rid ) {
  #Iterate over RIDs
  foreach my $rid ( @rids ) {
    #Hit the server for a result on RID
    my $blast_results = $BLAST_factory->retrieve_blast($rid);
    #No result object yet: the job is either still running or has failed
    if( !ref($blast_results) ) {
      #A negative return value signals an error, so remove that RID from the stack
      if ($blast_results < 0) {
        $BLAST_factory->remove_rid($rid);
      }
      print STDERR "."; #Keep staring at the dots
      sleep 5; #Plays nice with the servers
    }
    #If a result was returned and it isn't an error, then pass it to a variable...
    else {
      my $result = $blast_results->next_result();
      $BLAST_factory->remove_rid($rid); #...and remove its RID from the stack.
      #Check the result for a hit...
      my $hit = $result->next_hit;
      if (ref($hit)) {
        my $hsp = $hit->next_hsp;
        #...collect some values from the result, hit and hsp objects and do
        #something with them.
        my $q_name = $result->query_name();
        my $h_name = $hit->name;
        my $evalue = $hsp->evalue();
        print "\nQuery name:  $q_name\nHit name:  $h_name\nLowest e-value:  $evalue\n";
      }
    }
  }
}


Hong Ching Lee wrote:

>Hey everyone,
>
>I have a question about whether i can run remote blast using just a string
>or whether i have to make it into a fasta format file. Can anyone help me
>with this?
>
>Thank You,
>Hong
>_______________________________________________
>Bioperl-l mailing list
>Bioperl-l@portal.open-bio.org
>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>


From clangin at siu.edu  Fri Feb  6 04:30:09 2004
From: clangin at siu.edu (Chet Langin)
Date: Fri Feb  6 04:12:55 2004
Subject: [Bioperl-l] GD test 10 fails
References: <4022FE9B.5030406@siu.edu>
	<Pine.LNX.4.50.0402052125290.21873-100000@tenero.duhs.duke.edu>
Message-ID: <40235EA1.4040008@siu.edu>


Greetings,

I installed gd-2.0.22 and GD-2.11.

--Chet


Jason Stajich wrote:
> Only if you want to use Bio::Graphics
> 
> Are you sure the libgd version on your system matches the
> requirements of the version of GD.pm you are installing.
> 
> -jason
> On Thu, 5 Feb 2004, Chet Langin wrote:
> 
> 
>>While installing GD, test 10 failed, thus halting
>>the install from CPAN.
>>
>>I installed the latest zlib, libgd, PNG, JPEG,
>>and FreeType libraries, and it still failed.
>>
>>It looked like test 10 might be converting between
>>JPEG and PNG formats.
>>
>>The only strange output during the make was a warning
>>about /usr/local/include being a system directory
>>when it was a non-system directory and that the
>>search order was changed.
>>
>>I went ahead and forced install.  But, I was
>>wondering if this might cause me further trouble
>>down the road.
>>
>>,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
>><http://mypage.siu.edu/clangin> <clangin@siu.edu>
>>~~~Diagonally parked in a parallel universe ~~~~~
>>
>>
>>_______________________________________________
>>Bioperl-l mailing list
>>Bioperl-l@portal.open-bio.org
>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
> 
> 
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
> 


-- 
,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
<http://mypage.siu.edu/clangin> <clangin@siu.edu>
~~~Diagonally parked in a parallel universe ~~~~~


From lstein at cshl.edu  Fri Feb  6 04:13:47 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Fri Feb  6 04:23:01 2004
Subject: [Bioperl-l] GD test 10 fails
In-Reply-To: <4022FE9B.5030406@siu.edu>
References: <4022FE9B.5030406@siu.edu>
Message-ID: <200402061113.47851.lstein@cshl.edu>

Hi Chet,

I wrote and maintain the GD library.  If you can send me the 
information on what operating system you're using and the versions of 
each of the libraries you have installed I might be able to help.

Lincoln

On Friday 06 February 2004 04:40 am, Chet Langin wrote:
> While installing GD, test 10 failed, thus halting
> the install from CPAN.
>
> I installed the latest zlib, libgd, PNG, JPEG,
> and FreeType libraries, and it still failed.
>
> It looked like test 10 might be converting between
> JPEG and PNG formats.
>
> The only strange output during the make was a warning
> about /usr/local/include being a system directory
> when it was a non-system directory and that the
> search order was changed.
>
> I went ahead and forced install.  But, I was
> wondering if this might cause me further trouble
> down the road.
>
> ,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
> <http://mypage.siu.edu/clangin> <clangin@siu.edu>
> ~~~Diagonally parked in a parallel universe ~~~~~
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
From khoueiry at ibsm.cnrs-mrs.fr  Fri Feb  6 04:32:39 2004
From: khoueiry at ibsm.cnrs-mrs.fr (KHOUEIRY pierre)
Date: Fri Feb  6 04:38:54 2004
Subject: [Bioperl-l] fetching a fasta file
Message-ID: <40235F37.3020701@ibsm.cnrs-mrs.fr>

Hello,
I have to search for sequences in a local fasta file. My sequence 
IDs are in an array:
my @ID = 
('AAS_ECOLI','ABGT_ECOLI','ABRB_ECOLI','ACFD_ECOLI','ACRA_ECOLI','ACRB_ECOLI'). 

I tried to index my file but it doesn't work.
I used something like:
 $index = Bio::Index::Fasta->new("$file", 'WRITE');
 $index->make_index($file);

Sometimes I get this message:
Can't open 'DB_File' dbm file '/home/pierre/Perl/col2.fasta' : File exists

I want to fetch from col2.fasta the sequences for all the IDs in my array (@ID) above.
From the Bio::Index::Fasta doc, it looks as if it indexes files and not their 
contents, or am I wrong? I want to take this approach because I want to search 
for these IDs in a large number of fasta files...

-- 

---------------------------------
Pierre Khoueiry	
khoueiry@ibsm.cnrs-mrs.fr
LCB - CNRS
31, Chemin Joseph Aiguier,
13402 Marseille CEDEX 20, France
---------------------------------


From Richard.Adams at ed.ac.uk  Fri Feb  6 04:48:58 2004
From: Richard.Adams at ed.ac.uk (Richard Adams)
Date: Fri Feb  6 04:55:04 2004
Subject: [Bioperl-l] protein networks
Message-ID: <4023630A.2030406@ed.ac.uk>

Hi,
I was wondering if anyone has written or is writing any modules to deal 
with protein interaction networks?

E.g., to read in from a DIP flatfile or XML, or other such interaction 
information source
and to have methods such as

get_number_of_interactors()
get_interactor_ids()
clustering_coefficient()
path_length(from, to)
degree()
mean_path_length().

etc.


Richard

-- 
Dr Richard Adams,
Psychiatric Genetics Group,
Medical Genetics,
Molecular Medicine Centre,
Western General Hospital,
Crewe Rd West,
Edinburgh UK
EH4 2XU

Tel: 44 131 651 1084
richard.adams@ed.ac.uk


From Marc.Logghe at devgen.com  Fri Feb  6 04:52:20 2004
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Fri Feb  6 04:58:42 2004
Subject: [Bioperl-l] fetching a fasta file
Message-ID: <BEE28BF86078B6429D6C780635718E21904B6D@morelia.be.devgen.com>


> -----Original Message-----
> From: KHOUEIRY pierre [mailto:khoueiry@ibsm.cnrs-mrs.fr]
> Sent: Friday, February 06, 2004 10:33 AM
> To: bioperl-l@bioperl.org
> Subject: [Bioperl-l] fetching a fasta file
> 
> 
> Hello,
> I have to search for sequences from a local fasta file. my sequences 
> Id's are in a table
> my @ID = 
> ('AAS_ECOLI','ABGT_ECOLI','ABRB_ECOLI','ACFD_ECOLI','ACRA_ECOL
> I','ACRB_ECOLI'). 
> 
> I tried to index my file but it doesn't work.
> I used something like:
>  $index = Bio::Index::Fasta->new("$file", 'WRITE');
>  $index->make_index($file);
> 
> Sometimes I'm getting  message
> Can't open 'DB_File' dbm file '/home/pierre/Perl/col2.fasta' 
> : File exists

Hi,
the problem is that you use the same name for your index file as for your fasta file.
this should do it:
 $index = Bio::Index::Fasta->new("${file}.idx", 'WRITE');
 $index->make_index($file);
HTH,
Marc
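
A slightly fuller sketch of the fix above, for anyone following along. It assumes a BioPerl installation with a DBM backend such as DB_File. The scratch directory, the file names and the two toy records are made up for illustration; the `-filename` and `-write_flag` arguments, `make_index` and `fetch` are the ones described in the Bio::Index::Fasta documentation.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use Bio::Index::Fasta;

# Illustrative data only: a tiny fasta file in a scratch directory
my $dir   = tempdir( CLEANUP => 1 );
my $fasta = File::Spec->catfile( $dir, 'col2.fasta' );
open my $fh, '>', $fasta or die "Cannot write $fasta: $!";
print {$fh} ">AAS_ECOLI\nMLFSFFRNLCRVLYRVRVTGD\n>ABGT_ECOLI\nMNVSTSSSNPLLRE\n";
close $fh or die "Cannot close $fasta: $!";

# The index must live in its own file, not in the fasta file itself
my $index = Bio::Index::Fasta->new(
    -filename   => "$fasta.idx",
    -write_flag => 1,
);
$index->make_index($fasta);

# Look up each wanted ID; IDs missing from the index are skipped
my @ID = ( 'AAS_ECOLI', 'ABGT_ECOLI', 'ABRB_ECOLI' );
my %found;
foreach my $id (@ID) {
    my $seq = $index->fetch($id) or next;
    $found{$id} = $seq->seq;
    print $seq->display_id, "\t", $seq->length, "\n";
}
```

Since make_index takes a list of file names, the same index file can cover a whole set of fasta files, which seems to be what the original question was after.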

From michael.watson at bbsrc.ac.uk  Fri Feb  6 05:32:05 2004
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Fri Feb  6 05:39:38 2004
Subject: [Bioperl-l] Sub Seq Feature help
Message-ID: <20B7EB075F2D4542AFFAF813E98ACD930282263D@cl-exsrv1.irad.bbsrc.ac.uk>

Hello

I want to manipulate the start and end position of a CDS feature that looks like this:

FT	CDS	join(2307..3221,1..1623)

I have tried:

my @features = $seq->get_all_SeqFeatures;
foreach $f (@features) {
	my @subs = $f->sub_SeqFeature;
	foreach $s (@subs) {
		print $s->start, "-", $s->end, "\n";
	}
}

However, I get nothing out.  The code doesn't descend into the sub seq features as $f->sub_SeqFeature doesn't return anything.  Nor does $f->get_SeqFeatures.

Clearly I am doing something wrong, but what?  I am using Bioperl-1.2.3

Thanks
Mick
From james.wasmuth at ed.ac.uk  Fri Feb  6 05:36:42 2004
From: james.wasmuth at ed.ac.uk (James Wasmuth)
Date: Fri Feb  6 05:47:11 2004
Subject: [Bioperl-l] Problems running blast
In-Reply-To: <Pine.LNX.4.58.0402061452001.29883@wagner.orchestra.cse.unsw.EDU.AU>
References: <Pine.LNX.4.58.0402061452001.29883@wagner.orchestra.cse.unsw.EDU.AU>
Message-ID: <40236E3A.8070607@ed.ac.uk>

input can be:
    * sequence object
    * array ref of sequence objects
    * filename of file containing fasta formatted sequences


Your best bet is to create a Seq object.  From the HOWTO:

        use IO::String;
        use Bio::SeqIO;

        # get a string into $string somehow, with its format in 
        # $format, say from a web form
        my $stringfh = new IO::String($string);
        my $seqio = new Bio::SeqIO(-fh => $stringfh,
                                   -format => $format);

        while( my $seq = $seqio->next_seq ) {
             # process each seq
        }
        

hth
james


Hong Ching Lee wrote:

>Hey everyone,
>
>I have a question about whether i can run remote blast using just a string
>or whether i have to make it into a fasta format file. Can anyone help me
>with this?
>
>Thank You,
>Hong
>_______________________________________________
>Bioperl-l mailing list
>Bioperl-l@portal.open-bio.org
>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>  
>


-- 
Nematode Bioinformatics           ||
Blaxter Nematode Genomics Group	  ||
School of Biological Sciences	  ||
Ashworth Laboratories		  ||	
King's Buildings                  ||	tel: +44 131 650 7403
University of Edinburgh           ||	web: www.nematodes.org
Edinburgh			  ||
EH9 3JT	 			  ||
UK				  ||	

"I have not failed. I've just found 10,000 ways that don't work."
               --- Thomas Edison


From Marc.Logghe at devgen.com  Fri Feb  6 05:50:33 2004
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Fri Feb  6 05:56:44 2004
Subject: [Bioperl-l] Sub Seq Feature help
Message-ID: <BEE28BF86078B6429D6C780635718E210395E2@morelia.be.devgen.com>


> -----Original Message-----
> From: michael watson (IAH-C) [mailto:michael.watson@bbsrc.ac.uk]
> Sent: Friday, February 06, 2004 11:32 AM
> To: bioperl-l@bioperl.org
> Subject: [Bioperl-l] Sub Seq Feature help
> 
> 
> Hello
> 
> I want to manipulate the start and end position of a CDS 
> feature that looks like this:
> 
> FT	CDS	join(2307..3221,1..1623)
> 
> I have tried:
> 
> my @features = $seq->get_all_SeqFeatures;
> foreach $f (@features) {
> 	my @subs = $f->sub_SeqFeature;
> 	foreach $s (@subs) {
> 		print $s->start, "-", $s->end, "\n";
> 	}
> }
> 
Actually you are not dealing with sub_features here, just a plain feature. What you are really looking for is sub_locations.
When you invoke the method 
my $location = $f->location;
you will get a Location object. In the specific case you showed, you will get a Bio::Location::Split object.
There you will find the appropriate methods to achieve what you want (e.g. each_Location(), sub_Location()).

HTH,
Marc
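
The advice above could be sketched roughly as follows. This is only an illustration: the feature is built by hand to mimic join(2307..3221,1..1623) rather than parsed from an EMBL file, and segment_ranges is a hypothetical helper name, not a BioPerl method. The BioPerl calls used (location, each_Location, add_sub_Location) are the documented ones.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::Location::Simple;
use Bio::Location::Split;
use Bio::SeqFeature::Generic;

# Hand-built stand-in for: FT   CDS   join(2307..3221,1..1623)
my $split = Bio::Location::Split->new();
$split->add_sub_Location(
    Bio::Location::Simple->new( -start => 2307, -end => 3221, -strand => 1 ) );
$split->add_sub_Location(
    Bio::Location::Simple->new( -start => 1, -end => 1623, -strand => 1 ) );
my $feature = Bio::SeqFeature::Generic->new(
    -primary_tag => 'CDS',
    -location    => $split,
);

# Hypothetical helper: return [start, end] for every segment of a feature.
# each_Location also works on a simple location, which just returns itself.
sub segment_ranges {
    my ($feat) = @_;
    return map { [ $_->start, $_->end ] } $feat->location->each_Location;
}

foreach my $range ( segment_ranges($feature) ) {
    print $range->[0], "-", $range->[1], "\n";
}
```

In a real script the loop over $seq->get_all_SeqFeatures stays as it was; only the inner call changes from sub_SeqFeature to location->each_Location.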

From brian_osborne at cognia.com  Fri Feb  6 07:38:01 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Fri Feb  6 07:44:11 2004
Subject: [Bioperl-l] Sub Seq Feature help
In-Reply-To: <20B7EB075F2D4542AFFAF813E98ACD930282263D@cl-exsrv1.irad.bbsrc.ac.uk>
Message-ID: <GAEDKMGOKFBLJPKCLKCCEECKDHAA.brian_osborne@cognia.com>

Michael,

There may be useful example code for you in the Feature-Annotation HOWTO
(http://bioperl.org/HOWTOs/html/Feature-Annotation.html) or in the FAQ
(http://bioperl.org/Core/Latest/faq.html#Q5.3).

Brian O.

-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of michael watson
(IAH-C)
Sent: Friday, February 06, 2004 5:32 AM
To: bioperl-l@bioperl.org
Subject: [Bioperl-l] Sub Seq Feature help

Hello

I want to manipulate the start and end position of a CDS feature that looks
like this:

FT      CDS     join(2307..3221,1..1623)

I have tried:

my @features = $seq->get_all_SeqFeatures;
foreach $f (@features) {
        my @subs = $f->sub_SeqFeature;
        foreach $s (@subs) {
                print $s->start, "-", $s->end, "\n";
        }
}

However, I get nothing out.  The code doesn't descend into the sub seq
features as $f->sub_SeqFeature doesn't return anything.  Nor does
$f->get_SeqFeatures.

Clearly I am doing something wrong, but what?  I am using Bioperl-1.2.3

Thanks
Mick
_______________________________________________
Bioperl-l mailing list
Bioperl-l@portal.open-bio.org
http://portal.open-bio.org/mailman/listinfo/bioperl-l


From lstein at cshl.edu  Fri Feb  6 09:33:13 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Fri Feb  6 09:40:02 2004
Subject: [Bioperl-l] GD test 10 fails
In-Reply-To: <40236791.4040107@siu.edu>
References: <4022FE9B.5030406@siu.edu> <200402061113.47851.lstein@cshl.edu>
	<40236791.4040107@siu.edu>
Message-ID: <200402061633.13772.lstein@cshl.edu>

Hi,

OK, the issue is that test 10 was broken when Tom Boutell released 
gd-2.0.22.  Get GD version 2.12 and all should be well.  It'll be 
appearing on CPAN in a day or so.

Lincoln

On Friday 06 February 2004 12:08 pm, Chet Langin wrote:
> Greetings,
>
> SuSE 8.1
> gd-2.0.22
> GD-2.0.22
> freetype-2.1.5
> jpeg-6b
> libpng-1.2.5
> zlib-1.2.1
>
> I did not install a new XPM because I wasn't sure about the imake
> system and the latest version was 1998, which should have come with
> my distro.
>
> I started with a new server and installed SuSE.  I did the online
> SuSE updates.  I got CPAN going and did the "r" updates.  I
> installed some lower level modules from CPAN. I got Apache and
> MySQL running. I updated the MySQL modules from CPAN.I looked for
> earlier version files of libgd on my machine, but did not have any.
>  I installed zlib. I installed libgd.  It said that I already had
> freetype, jpeg, and png.  I started to install GD. It failed.  I
> went on the Internet and reinstalled freetype, jpeg and png.  I
> reinstalled libgd.  I tried to install GD and it failed on test 10.
>
> ===================================================================
>=== Running Mkbootstrap for GD ()
> chmod 644 GD.bs
> rm -f blib/arch/auto/GD/GD.so
> LD_RUN_PATH="/usr/local/lib:/usr/lib:/usr/X11R6/lib" cc  -shared
> GD.o -o blib/arch/auto/GD/GD.so   -L/usr/local/lib -L/usr/lib/X11
> -L/usr/X11R6/lib -L/usr/X11/lib -L/usr/local/lib -lgd -lpng -lz
> -lfreetype -ljpeg -lm -lX11 -lXpm
> chmod 755 blib/arch/auto/GD/GD.so
> cp GD.bs blib/arch/auto/GD/GD.bs
> chmod 644 blib/arch/auto/GD/GD.bs
> PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
> "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
> t/GD..........FAILED test 10
> 	Failed 1/10 tests, 90.00% okay (less 1 skipped test: 8 okay,
> 80.00%) t/Polyline....ok
> Failed Test Stat Wstat Total Fail  Failed  List of Failed
> -------------------------------------------------------------------
>------------ t/GD.t                    10    1  10.00%  10
> 1 subtest skipped.
> Failed 1/2 test scripts, 50.00% okay. 1/11 subtests failed, 90.91%
> okay. make: *** [test_dynamic] Error 29
> ===================================================================
>====
>
> --Chet
>
> Lincoln Stein wrote:
> > Hi Chet,
> >
> > I wrote and maintain the GD library.  If you can send me the
> > information on what operating system you're using and the
> > versions of each of the libraries you have installed I might be
> > able to help.
> >
> > Lincoln
> >
> > On Friday 06 February 2004 04:40 am, Chet Langin wrote:
> >>While installing GD, test 10 failed, thus halting
> >>the install from CPAN.
> >>
> >>I installed the latest zlib, libgd, PNG, JPEG,
> >>and FreeType libraries, and it still failed.
> >>
> >>It looked like test 10 might be converting between
> >>JPEG and PNG formats.
> >>
> >>The only strange output during the make was a warning
> >>about /usr/local/include being a system directory
> >>when it was a non-system directory and that the
> >>search order was changed.
> >>
> >>I went ahead and forced install.  But, I was
> >>wondering if this might cause me further trouble
> >>down the road.
> >>
> >>,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
> >><http://mypage.siu.edu/clangin> <clangin@siu.edu>
> >>~~~Diagonally parked in a parallel universe ~~~~~
> >>
> >>
> >>_______________________________________________
> >>Bioperl-l mailing list
> >>Bioperl-l@portal.open-bio.org
> >>http://portal.open-bio.org/mailman/listinfo/bioperl-l

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
From clangin at siu.edu  Fri Feb  6 10:27:33 2004
From: clangin at siu.edu (Chet Langin)
Date: Fri Feb  6 10:10:16 2004
Subject: [Bioperl-l] GD test 10 fails
References: <4022FE9B.5030406@siu.edu>
	<200402061113.47851.lstein@cshl.edu>	<40236791.4040107@siu.edu>
	<200402061633.13772.lstein@cshl.edu>
Message-ID: <4023B265.8010707@siu.edu>


Greetings,

It was GD-2.11.  Sorry for the typo.

Thanks for checking on it!  I'll keep an eye out
for GD 2.12.

--Chet


Lincoln Stein wrote:
> Hi,
> 
> OK, the issue is that test 10 was broken when Tom Boutell released 
> gd-2.0.22.  Get GD version 2.12 and all should be well.  It'll be 
> appearing on CPAN in a day or so.
> 
> Lincoln
> 
> On Friday 06 February 2004 12:08 pm, Chet Langin wrote:
> 
>>Greetings,
>>
>>SuSE 8.1
>>gd-2.0.22
>>GD-2.0.22
>>freetype-2.1.5
>>jpeg-6b
>>libpng-1.2.5
>>zlib-1.2.1
>>
>>I did not install a new XPM because I wasn't sure about the imake
>>system and the latest version was 1998, which should have come with
>>my distro.
>>
>>I started with a new server and installed SuSE.  I did the online
>>SuSE updates.  I got CPAN going and did the "r" updates.  I
>>installed some lower level modules from CPAN. I got Apache and
>>MySQL running. I updated the MySQL modules from CPAN.I looked for
>>earlier version files of libgd on my machine, but did not have any.
>> I installed zlib. I installed libgd.  It said that I already had
>>freetype, jpeg, and png.  I started to install GD. It failed.  I
>>went on the Internet and reinstalled freetype, jpeg and png.  I
>>reinstalled libgd.  I tried to install GD and it failed on test 10.
>>
>>===================================================================
>>=== Running Mkbootstrap for GD ()
>>chmod 644 GD.bs
>>rm -f blib/arch/auto/GD/GD.so
>>LD_RUN_PATH="/usr/local/lib:/usr/lib:/usr/X11R6/lib" cc  -shared
>>GD.o -o blib/arch/auto/GD/GD.so   -L/usr/local/lib -L/usr/lib/X11
>>-L/usr/X11R6/lib -L/usr/X11/lib -L/usr/local/lib -lgd -lpng -lz
>>-lfreetype -ljpeg -lm -lX11 -lXpm
>>chmod 755 blib/arch/auto/GD/GD.so
>>cp GD.bs blib/arch/auto/GD/GD.bs
>>chmod 644 blib/arch/auto/GD/GD.bs
>>PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
>>"test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
>>t/GD..........FAILED test 10
>>	Failed 1/10 tests, 90.00% okay (less 1 skipped test: 8 okay,
>>80.00%) t/Polyline....ok
>>Failed Test Stat Wstat Total Fail  Failed  List of Failed
>>-------------------------------------------------------------------
>>------------ t/GD.t                    10    1  10.00%  10
>>1 subtest skipped.
>>Failed 1/2 test scripts, 50.00% okay. 1/11 subtests failed, 90.91%
>>okay. make: *** [test_dynamic] Error 29
>>===================================================================
>>====
>>
>>--Chet
>>
>>Lincoln Stein wrote:
>>
>>>Hi Chet,
>>>
>>>I wrote and maintain the GD library.  If you can send me the
>>>information on what operating system you're using and the
>>>versions of each of the libraries you have installed I might be
>>>able to help.
>>>
>>>Lincoln
>>>
>>>On Friday 06 February 2004 04:40 am, Chet Langin wrote:
>>>
>>>>While installing GD, test 10 failed, thus halting
>>>>the install from CPAN.
>>>>
>>>>I installed the latest zlib, libgd, PNG, JPEG,
>>>>and FreeType libraries, and it still failed.
>>>>
>>>>It looked like test 10 might be converting between
>>>>JPEG and PNG formats.
>>>>
>>>>The only strange output during the make was a warning
>>>>about /usr/local/include being a system directory
>>>>when it was a non-system directory and that the
>>>>search order was changed.
>>>>
>>>>I went ahead and forced install.  But, I was
>>>>wondering if this might cause me further trouble
>>>>down the road.
>>>>
>>>>,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
>>>><http://mypage.siu.edu/clangin> <clangin@siu.edu>
>>>>~~~Diagonally parked in a parallel universe ~~~~~
>>>>
>>>>
>>>>_______________________________________________
>>>>Bioperl-l mailing list
>>>>Bioperl-l@portal.open-bio.org
>>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>
> 


-- 
,,Chet Langin,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
<http://mypage.siu.edu/clangin> <clangin@siu.edu>
~~~Diagonally parked in a parallel universe ~~~~~


From tewang at ea.nacs.uci.edu  Fri Feb  6 11:29:18 2004
From: tewang at ea.nacs.uci.edu (Eric Wang)
Date: Fri Feb  6 11:35:24 2004
Subject: [Bioperl-l] Computing Allele Frequency From Heterozygosity
In-Reply-To: <200402061510.i16FAgHH018676@portal.open-bio.org>
Message-ID: <Pine.LNX.4.44.0402060827400.19085-100000@gradea.uci.edu>

Dear All,

I am wondering if bioperl has implemented a way of converting 
Heterozygosity number (found in dbSNP) to true allele frequency.  If not, 
I'd like to make some contributions. =)

Many thanks!

Eric

From allenday at ucla.edu  Fri Feb  6 11:44:20 2004
From: allenday at ucla.edu (Allen Day)
Date: Fri Feb  6 11:50:26 2004
Subject: [Bioperl-l] protein networks
In-Reply-To: <4023630A.2030406@ed.ac.uk>
Message-ID: <Pine.LNX.4.44.0402060842440.13093-100000@sumo.genetics.ucla.edu>

Richard,

To my knowledge, nothing exists in bioperl.  If you were to implement
something, http://psidev.sf.net would be a good place to start.  The
Proteomics Standards Initiative, of which DIP is a part, is working to
develop standard formats for proteomics data.

-Allen


On Fri, 6 Feb 2004, Richard Adams wrote:

> Hi,
> I was wondering if anyone has written or is writing any modules to deal 
> with protein interaction networks?
> 
> E.g., to read in from a DIP flatfile or XML, or other such interaction 
> information source
> and to have methods such as
> 
> get_number_of_interactors()
> get_interactor_ids()
> clustering_coefficient()
> path_length(from, to)
> degree()
> mean_path_length().
> 
> etc.
> 
> 
> Richard
> 
> 


From jason at cgt.duhs.duke.edu  Fri Feb  6 13:09:52 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Fri Feb  6 13:16:05 2004
Subject: [Bioperl-l] Bio::Species / Bio::Taxonomy::Node
In-Reply-To: <GAEDKMGOKFBLJPKCLKCCKEBCDHAA.brian_osborne@cognia.com>
References: <GAEDKMGOKFBLJPKCLKCCKEBCDHAA.brian_osborne@cognia.com>
Message-ID: <Pine.LNX.4.50.0402031714300.3456-100000@tenero.duhs.duke.edu>

hmm - I was thinking that it is possible to create Taxonomy::Node which
behaves just like a Bio::Species object if we feed it all the necessary
information up front (the classification array essentially).  It is only
necessary to have a Bio::DB::Taxonomy handle if you want to do more
sophisticated things [get all the sibling nodes at this level, etc].

Basically, I would expect Taxonomy::Node to be able to do everything that
Bio::Species can do, AND also be db aware.  The only difference is a Node
pre-loaded with data versus a fully DB-dependent object.

This differs from the way I built Taxonomy::Node at first, where if you
wanted the Kingdom for a species, you had to walk up the hierarchy - now
you push that all down at object creation time via the classification
array.

So for the simple case of Genbank/Swiss/EMBL parsing, we would operate as
normal, and create Bio::Species (Bio::Taxonomy::Node really) objects
as per normal.  Only if someone wanted to do fun Bio::Taxonomy stuff they
would need to instantiate a Bio::Taxonomy::Node from a taxonomydb (it
needs to get the ncbi_taxid and a dbhandle).

-jason

On Tue, 3 Feb 2004, Brian Osborne wrote:

> Jason,
>
> So you'd automatically create the Node object without knowing if the
> underlying names and nodes files are present? I agree with you, that could
> be confusing.
>
> Test for the existence of an env that specifies the directory that contains
> these indexed files?
>
> Brian O.
>
> -----Original Message-----
> From: bioperl-l-bounces@portal.open-bio.org
> [mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Jason Stajich
> Sent: Tuesday, February 03, 2004 4:28 PM
> To: Hilmar Lapp
> Cc: Bioperl
> Subject: Re: [Bioperl-l] Bio::Species / Bio::Taxonomy::Node
>
> We can start making things create Taxonomy::Node objects - I know there
> code floating out there which does
> if( $sp->isa('Bio::Species') ) { }
>
> so presumably we could make Bio::Species interface s.t. taxonomy::Node
> isa Bio::Species...?  I don't want to confuse people either.
>
> There may still be a little more functionality that is needed in the
> Taxonomy::Node objects and in the db - specifically how to deal with
> some of the methods which are really specific to the species level of
> the taxonomy (tips) such as classification/bionomial/ etc methods.
>
> -jason
>
> On Sat, 31 Jan 2004, Hilmar Lapp wrote:
>
> > Very cool Jason!!
> >
> > Now we can start hooking this into bioperl-db.
> >
> > And what about porting the SeqIO parsers, the target being to be able
> > to deprecate Bio::Species altogether? Alternatively, change the
> > SeqI/RichSeqI implementations to silently convert a Bio::Species
> > instance on set to a Bio::Taxonomy::Node instance?
> >
> >       -hilmar
> >
> > On Friday, January 30, 2004, at 02:07  PM, Jason Stajich wrote:
> >
> > > I think I've finally committed code which will allow
> > > Bio::Taxonomy::Node
> > > to act like Bio::Species while supporting the notion of being a node
> > > in a
> > > taxonomy hierarchy.  Added tests in t/Species.t to this effect.
> > >
> > > For Bio::DB::Taxonomy::flatfile I've added indexing by parent Id so it
> > > is
> > > quite fast to grab all the children for a given node.  So you can walk
> > > up
> > > and down the classification system now.  Practically speaking
> > > this means to get all the taxon ids of species in the same genus with a
> > > few simple lines like below.
> > >
> > > Unfortunately the the NCBI taxonomy API as part of E-Utils doesn't
> > > quite
> > > provide the information we need so the whole API can't be used without
> > > downloading the taxonomy db locally.
> > >
> > > nodefile and namesfile are the files from ncbi taxdump see
> > > Bio::DB::Taxonomy::flatfile for more info.
> > >
> > > #!/usr/bin/perl
> > > use strict;
> > > use warnings;
> > >
> > > use Bio::DB::Taxonomy;
> > > my $db = Bio::DB::Taxonomy->new
> > >     (-source => 'flatfile',
> > >      -nodesfile=> '/home/jason/taxonomy/nodes.dmp',
> > >      -namesfile=> '/home/jason/taxonomy/names.dmp');
> > >
> > > my $node = $db->get_Taxonomy_Node(-name => 'Caenorhabditis elegans');
> > >
> > > my $parent = $node->get_Parent_Node();
> > > for my $n ( $parent->get_Children_Nodes() ) {
> > >     print $n->binomial, "\t", $n->ncbi_taxid,"\n";
> > > }
> > >
> > > Someday I'll get around to making a HowTO unless someone else wants to
> > > do
> > > it... =)
> > >
> > > -jason
> > > --
> > > Jason Stajich
> > > Duke University
> > > jason at cgt.mc.duke.edu
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l@portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > >
> > >
> >
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu


From mcipriano at mbl.edu  Fri Feb  6 16:43:59 2004
From: mcipriano at mbl.edu (Michael Cipriano)
Date: Fri Feb  6 16:50:01 2004
Subject: [Bioperl-l] question on simple align object
Message-ID: <000f01c3ecfa$5414e320$8fae8080@Ripley>

Is there a way I can get a Simple Alignment object that I created from a
clustalw alignment into a string of a specific format? I do not want to
deal with any files at all, I want to just be able to print the
alignment from a cgi-script on a web page.

So far I have this:

# Create alignment
my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);

my $aln = $factory->align($seq_array_ref);

I tried all sorts of things, but can't get specifically what I need
(which is simply the whole alignment file as an MSF-formatted string). I
would like to not have to deal with any temporary files, unless there is
a way I can get to the temporary file that is created from the clustalw
alignment and just stick that into a string.

Thanks for any help in advance.


From brian_osborne at cognia.com  Fri Feb  6 17:00:08 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Fri Feb  6 17:06:56 2004
Subject: [Bioperl-l] question on simple align object
In-Reply-To: <000f01c3ecfa$5414e320$8fae8080@Ripley>
Message-ID: <GAEDKMGOKFBLJPKCLKCCMEDCDHAA.brian_osborne@cognia.com>

Michael,

I just wrote this on the command-line, it is sloppy but it seems to work:

~/programming/perl/Bioperl>perl -e 'use Bio::AlignIO; $io =
Bio::AlignIO->new(-file => "aln.clustalw", -format => "clustalw" ); my $aln
= $io->next_aln;
use IO::String; $out = IO::String->new(\$str); $ioout =
Bio::AlignIO->new(-format=> "msf", -fh => $out ); $ioout->write_aln($aln);
print $str;'

I've created my SimpleAlign object from a file, you won't need to do that,
of course.

Brian O.

-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Michael Cipriano
Sent: Friday, February 06, 2004 4:44 PM
To: bioperl-l@bioperl.org
Subject: [Bioperl-l] question on simple align object

Is there a way I can get a Simple Alignment object that I created from a
clustalw alignment into a string of a specific format? I do not want to
deal with any files at all, I want to just be able to print the
alignment from a cgi-script on a web page.

So far I have this:

# Create alignment
my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);

my $aln = $factory->align($seq_array_ref);

I tried all sorts of things, but can't get specifically what I need
(which is simply the whole alignment file as a msf formated string). I
would like to not have to deal with any temporary files, unless there is
a way I can get to the temporary file that is created from the clustalw
alignment and just stick that into a string.

Thanks for any help in advance.


_______________________________________________
Bioperl-l mailing list
Bioperl-l@portal.open-bio.org
http://portal.open-bio.org/mailman/listinfo/bioperl-l


From shawnh at stanford.edu  Fri Feb  6 17:09:32 2004
From: shawnh at stanford.edu (Shawn Hoon)
Date: Fri Feb  6 17:11:01 2004
Subject: [Bioperl-l] question on simple align object
In-Reply-To: <000f01c3ecfa$5414e320$8fae8080@Ripley>
References: <000f01c3ecfa$5414e320$8fae8080@Ripley>
Message-ID: <2452422C-58F1-11D8-8EAE-000A95783436@stanford.edu>

use Bio::AlignIO;
use IO::String;

my $string;
my $stringio = IO::String->new($string);
my $aout = Bio::AlignIO->new(-fh=>$stringio,-format=>'clustalw');

> my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
>
> my $aln = $factory->align($seq_array_ref);

$aout->write_aln($aln);

print "Alignment\n".$string."\n";

something like above should work.
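
For comparison, the same capture-into-a-string idea can be sketched with core Perl alone, by opening a filehandle on a scalar reference. This is only an illustration of the mechanism: render_to_string() is a hypothetical helper, the lines written are placeholder text, and whether Bio::AlignIO's -fh accepts such a handle is an assumption that is not confirmed in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bioperl): run a writer callback against an
# in-memory filehandle and return everything it printed as one string.
sub render_to_string {
    my ($writer) = @_;    # code ref that prints to the handle it is given
    my $buffer = '';
    open(my $fh, '>', \$buffer) or die "Cannot open in-memory handle: $!";
    $writer->($fh);
    close($fh) or die "Cannot close in-memory handle: $!";
    return $buffer;
}

# Placeholder output standing in for a formatted alignment.
my $text = render_to_string(sub {
    my ($fh) = @_;
    print {$fh} "first line\n";
    print {$fh} "second line\n";
});

print $text;
```

In a CGI script the returned string could then be printed to the page directly, with no temporary file involved.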

shawn


On Friday, February 6, 2004, at 1:43PM, Michael Cipriano wrote:

> Is there a way I can get a Simple Alignment object that I created from 
> a
> clustalw alignment into a string of a specific format? I do not want to
> deal with any files at all, I want to just be able to print the
> alignment from a cgi-script on a web page.
>
> So far I have this:
>
> # Create alignment
> my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
>
> my $aln = $factory->align($seq_array_ref);
>
> I tried all sorts of things, but can't get specifically what I need
> (which is simply the whole alignment file as a msf formated string). I
> would like to not have to deal with any temporary files, unless there 
> is
> a way I can get to the temporary file that is created from the clustalw
> alignment and just stick that into a string.
>
> Thanks for any help in advance.
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l


From heikki at nildram.co.uk  Sat Feb  7 19:26:32 2004
From: heikki at nildram.co.uk (Heikki Lehvaslaiho)
Date: Sat Feb  7 19:32:38 2004
Subject: [Bioperl-l] Computing Allele Frequency From Heterozygosity
In-Reply-To: <Pine.LNX.4.44.0402060827400.19085-100000@gradea.uci.edu>
References: <Pine.LNX.4.44.0402060827400.19085-100000@gradea.uci.edu>
Message-ID: <200402080026.32523.heikki@nildram.co.uk>

Eric,

We've got nothing along those lines. Please submit your contributions using 
bugzilla.
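
For reference, a minimal sketch of the conversion being asked about, under stated assumptions: the site is biallelic and the reported heterozygosity is the expected value H = 2p(1-p), which gives p = (1 - sqrt(1 - 2H)) / 2 for the minor allele. allele_freqs_from_het() is a hypothetical name, not an existing Bioperl function, and heterozygosity alone cannot say which allele is the minor one.

```perl
use strict;
use warnings;

# Hypothetical helper, not a Bioperl function.
# Assumes a biallelic site with expected heterozygosity H = 2*p*(1-p).
# Returns the (minor, major) allele frequencies.
sub allele_freqs_from_het {
    my ($het) = @_;
    die "heterozygosity must be between 0 and 0.5 for a biallelic site\n"
        if $het < 0 || $het > 0.5;
    my $root  = sqrt(1 - 2 * $het);    # discriminant of 2p^2 - 2p + H = 0
    my $minor = (1 - $root) / 2;
    return ($minor, 1 - $minor);
}

my ($minor, $major) = allele_freqs_from_het(0.42);
printf "minor=%.3f major=%.3f\n", $minor, $major;
```

With H = 0.42 this gives frequencies of roughly 0.3 and 0.7.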

	-Heikki

On Friday 06 Feb 2004 16:29, Eric Wang wrote:
> Dear All,
>
> I am wondering if bioperl has implemented a way of converting
> Heterozygosity number (found in dbSNP) to true allele frequency.  If not,
> I'd like to make some contributions. =)
>
> Many thanks!
>
> Eric
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho    heikki_at_ebi ac uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________
From barry.moore at genetics.utah.edu  Sat Feb  7 23:24:48 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Sat Feb  7 23:30:59 2004
Subject: [Bioperl-l] Remote BLAST returning lots of 500 time out errors
In-Reply-To: <000f01c3ecfa$5414e320$8fae8080@Ripley>
References: <000f01c3ecfa$5414e320$8fae8080@Ripley>
Message-ID: <4025BA10.7000508@genetics.utah.edu>


I have a script that uses Bio::Tools::Run::RemoteBlast to BLAST the 
translations of all ORFs from a mRNA transcript against the database. It 
works fine, except that if I run several sequences at once, after about 
50 ORFs worth of BLASTing, NCBI starts to return errors (500 read 
time-out) for every job submitted. I can't figure out what's going on 
here. Any ideas?

The script is kind of long and it takes several minutes to get to the 
errors, but if anyone wants to try to recreate the error I've attached 
the code below. Some of you will probably recognize bits of your code 
that I've pilfered from various Bioperl docs. I'm running Bioperl 1.4, 
ActiveState perl 5.8.0.805 on Windows XP. I get the error by running:

perl ORF_BLAST1.pl --min_length 150 NM_001112 NM_007327 NM_015833 NM_021569

Barry Moore
Dept. Human Genetics
University of Utah

------------------------------------------------------------------------------------------------------

#!/usr/bin/perl

#ORF_BLAST1.pl
#See end of file for POD documentation

use strict;
use warnings;
use GD;
use Getopt::Long;
use Bio::SeqIO;
use Bio::PrimarySeq;
use Bio::DB::GenBank;
use Bio::Tools::Run::RemoteBlast;

#Give documentation when requested, or when missing command line arguments.
if ( ! $ARGV[0] || $ARGV[0] =~ /^-{1,2}(h|help|\?)$/i ) {
system ( "perldoc", $0 ) and die "For usage, use perldoc $0\n";
exit( 0 );
}

#Declare module level variables.
my $in_filename; #This variable holds the filename of the input file.
my $out_filename; #Surprisingly, this variable holds the filename of the output file.
my $format; #This variable defines the output format (png or jpg).
my $min_length; #This variable defines the minimum ORF length to plot.
my $require_start; #This boolean variable indicates whether an ORF must begin with a start.
my $seqio; #This variable holds the SeqIO object.

#Handle command line options.
GetOptions (
'in_file:s' => \$in_filename,
'out_file:s' => \$out_filename,
'format:s' => \$format,
'min_length:i' => \$min_length,
'start!' => \$require_start
);
my @accession = @ARGV;

#Set defaults.
$format ||= 'jpg';
$min_length ||= 150;
$require_start ||= 0;

#Define new SeqIO object. Take input from a file first if a
#filename has been specified. Otherwise take input from accession
#numbers off the command line (but don't try to do both)
if ($in_filename) {
$seqio = Bio::SeqIO->new(-format=>'fasta', -file=>$in_filename) or die 
"could not create Bio::SeqIO";
}
elsif (@accession) {
my $gb = new Bio::DB::GenBank();
$seqio = $gb->get_Stream_by_id(\@accession);
}

#Remote-BLAST factory object creation and blast-parameter initialization
my $BLAST_factory = Bio::Tools::Run::RemoteBlast->new('-prog' => 'blastp',
'-data' => 'nr',
'-expect' => '10',
'-readmethod' => 'SearchIO' );

#Main program loop to loop over all sequences input.
while (my $seq_obj = $seqio->next_seq) {
#Assign sequence-specific variables
my @starts = ();
my @stops = ();
my @orfs = ();
my $out_filename;
my $sequence = $seq_obj->seq;
my $sequence_length = length($sequence);
my $header = $seq_obj->display_name."|".$seq_obj->desc;

#Internal loop to find starts, stops, and ORFs
for my $count1 (0 .. $sequence_length) {
my $open = 0;
if ($count1 < 4) {$open = 1}
my $count2 = $count1;
my $codon = substr($sequence, $count1, 3);
my $frame = $count1 % 3; #Get the modulus of $count1/3 for ascertaining the frame
#Convert the modulus above into frame 1, 2, or 3.
if ($frame == 1) {$frame = 2}
elsif ($frame == 2) {$frame = 3}
elsif ($frame == 0) {$frame = 1}
#Add starts to stack.
if ($codon =~ /ATG/i) {
push @starts, {start => $count1,
frame => $frame};
if ($require_start == 1) {$open = 1} #Open the ORF flag.
}
#Add stops to stack.
if ($codon =~ /TGA|TAG|TAA/i) {
push @stops, {stop => $count1,
frame => $frame};
if ($require_start == 0) {$open = 1} #Open the ORF flag.
}
#Find extent of ORF if one has been opened by either of the above
#conditionals.
if ($open == 1) {
$codon = "";
my $count2 = $count1;
#Loop to step forward through ORF looking for next in frame stop.
while (($codon !~ /TGA|TAG|TAA/i) and ($count2 < $sequence_length - 4)) {
$count2 = $count2 + 3; #Keep it in frame.
$codon = substr($sequence, $count2, 3);
}
#Make sure the ORF is long enough to count...
if ($count2 - $count1 >= $min_length) {
push @orfs, {begin => $count1,
end => $count2,
frame => $frame
}; #...then push it onto the ORF stack.
}
}
}

@orfs || die "No ORFs of $min_length nucleotide in length found";

#Loop to BLAST each ORF against the database, and check for a hit.
my $BLAST_count;
for my $orf (@orfs) {
#Assign ORF specific variables.
my $begin = $$orf{begin};
my $end = $$orf{end};
my $frame = $$orf{frame};
$BLAST_count++;

#Initialize subsequence as new sequence
my $seq = new Bio::PrimarySeq
(-seq => $seq_obj->subseq($begin + 1, $end),
-display_id => "${frame}_${begin}_${end}");
#Translate sequence
my $trans = $seq->translate();

#Blast the sequence against a database:
my $job = $BLAST_factory->submit_blast($trans);
print STDERR "Blasting ORF ",$BLAST_count," of ", scalar @orfs, "...";
#Loop to load the RIDs returned for the BLAST job submitted (this probably
#doesn't need to be a loop here but I won't take it out yet)
while ( my @rids = $BLAST_factory->each_rid ) {
#Loop iterate over RIDs, and hit NCBI's BLAST server for a result
foreach my $rid ( @rids ) {
#Hit the server for a result on RID.
my $blast_results = $BLAST_factory->retrieve_blast($rid);
#Was a result returned?
if( !ref($blast_results) ) {
#If so and it returned an error remove that RID from the stack
if ($blast_results < 0) {
$BLAST_factory->remove_rid($rid);
}
print STDERR "."; #Keep the user staring at the dots.
sleep 5; #Plays nice with the servers.
}
#If a result was returned and it isn't an error, then pass it to a
#variable...
else {
my $result = $blast_results->next_result();
$BLAST_factory->remove_rid($rid); #...and remove its RID from the stack.
#Check the result for a hit...
my $hit = $result->next_hit;
if (ref($hit)) {
my $hsp = $hit->next_hsp;
#...collect its evalue from the hsp object, and add to the ORFs hash
$$orf{evalue} = $hsp->evalue();
}
#If no evalue found, default to 100 to keep undef from looking like a
#significant e-value.
else {$$orf{evalue} = 100}
print "\n";
}
}
}
}

#Main block to draw image.
my $image = new GD::Image(900, 150); #Create a new image.

#Allocate some colors.
my %color = (
white => $image->colorAllocate(255,255,255),
aqua => $image->colorAllocate(0,255,255),
black => $image->colorAllocate(0,0,0),
blue => $image->colorAllocate(0,0,255),
gray => $image->colorAllocate(128,128,128),
fuchsia => $image->colorAllocate(255,0,255),
green => $image->colorAllocate(0,255,0),
lime => $image->colorAllocate(0,255,255),
maroon => $image->colorAllocate(128,0,0),
navy => $image->colorAllocate(0,0,128),
olive => $image->colorAllocate(128,128,0),
purple => $image->colorAllocate(128,0,128),
red => $image->colorAllocate(255,0,0),
silver => $image->colorAllocate(192,192,192),
teal => $image->colorAllocate(0,128,128),
yellow1 => $image->colorAllocate(255,255,0),
yellow2 => $image->colorAllocate(200,200,0),
yellow3 => $image->colorAllocate(150,150,0)
);

#Make the background transparent and interlaced.
$image->transparent($color{white});
$image->interlaced('true');
#Put a black frame around the picture.
$image->rectangle(0,0,899,149,$color{black});
#Add the title line.
$image->string(gdGiantFont,10,10,$header,$color{black});
#Draw the lines for each frame.
$image->line(10,50,890,50,$color{black});
$image->line(10,75,890,75,$color{black});
$image->line(10,100,890,100,$color{black});
#Draw a line for the ruler.
$image->line(10,125,890,125,$color{black});

#Loop to add ruler ticks and numbers to image.
for my $tick (0 .. 10) {
#Convert sequence coordinates to image X-axis values.
$tick = $sequence_length/10*$tick;
$tick = convert($tick, $sequence_length);
#Add ruler ticks.
$image->line($tick,125,$tick,130,$color{black});
#Add numbers to ruler.
$image->string(gdSmallFont,$tick-(2*length($tick-15)),130,$tick-15,$color{black});
}

#Loop to add ORFs to image.
for my $orf (@orfs) {
my $top; #This variable sets the top of the ORF rectangle.
my $bottom; #This variable sets the bottom of the ORF rectangle.
#Assign the Y coordinates for the ORF to place them in the correct frame.
if ($$orf{frame} == 1) {$top = 40; $bottom = 60}
elsif ($$orf{frame} == 2) {$top = 65; $bottom = 85}
elsif ($$orf{frame} == 3) {$top = 90; $bottom = 110}
#Convert sequence coordinates to image X-axis values.
my $begin = convert($$orf{begin}, $sequence_length);
my $end = convert($$orf{end}, $sequence_length);
#Assign a shade of yellow to the ORF if the BLAST on that ORF returned an
#e-value.
my $orf_color = $color{black}; #Default ORF color to black.
if (defined $$orf{evalue}) {
if ($$orf{evalue} <= 10) {$orf_color = $color{yellow3}} #Dark yellow
if ($$orf{evalue} <= 1.0e-3) {$orf_color = $color{yellow2}} #Medium yellow
if ($$orf{evalue} < 1.0e-25) {$orf_color = $color{yellow1}} #Bright yellow
}
#Draw rectangles for the ORFs.
$image->filledRectangle($begin,$top,$end,$bottom,$orf_color);
#Print the e-value on the ORF if one was returned and it is below 10.
if (defined $$orf{evalue} && $$orf{evalue} < 10) {
$image->string(gdSmallFont,$begin + 3,$top + 2,$$orf{evalue},$color{black});
}
}

#Add green ticks to the image for each start.
for my $start (@starts) {
my $top; #This variable sets the top of the start line.
my $bottom; #This variable sets the bottom of the start line.
#Assign the Y coordinates for the start line to put it in the correct frame.
if ($$start{frame} == 1) {$top = 50; $bottom = 60}
elsif ($$start{frame} == 2) {$top = 75; $bottom = 85}
elsif ($$start{frame} == 3) {$top = 100; $bottom = 110}
#Convert sequence coordinates to image X-axis values.
my $location = convert($$start{start}, $sequence_length);
#Draw the start ticks.
$image->line($location,$top,$location,$bottom,$color{green});
}

#Add red ticks to the image for each stop.
for my $stop (@stops) {
my $top; #This variable sets the top of the stop line.
my $bottom; #This variable sets the bottom of the stop line.
#Assign the Y coordinates for the stop line to put it in the correct frame.
if ($$stop{frame} == 1) {$top = 40; $bottom = 60}
elsif ($$stop{frame} == 2) {$top = 65; $bottom = 85}
elsif ($$stop{frame} == 3) {$top = 90; $bottom = 110}
#Convert sequence coordinates to image X-axis values.
my $location = convert($$stop{stop}, $sequence_length);
#Draw the stop ticks.
$image->line($location,$top,$location,$bottom,$color{red});
}

#Set a default output filename if none was set on the command line.
if (! $out_filename) {
if ( $in_filename && $in_filename =~ /(.*?)\..*/) {
$out_filename = $1.".".$format;
}
elsif ( $seq_obj->primary_id !~ /unknown/) {
$out_filename = $seq_obj->display_name().".".$format;
}
else {
$out_filename = $seq_obj->primary_id().".".$format;
}
}
#Open a filehandle for output and make sure we are writing a binary stream.
open (OUT, ">$out_filename") or die "Could not open $out_filename for writing: $!";
binmode OUT;

#Write the image to a file in specified format.
if ($format =~ /jpg|jpeg/) {
print OUT $image->jpeg;
}
if ($format =~ /png/) {
print OUT $image->png;
}
close OUT;
}

#A subroutine to convert sequence coordinates to x-axis values on the image.
sub convert {
my ($value, $length) = @_;
$value = (($value/$length)*870)+15; #Convert a sequence length value to an X-axis value.
$value = sprintf("%.0f", $value); #Round $value to nearest integer.
return $value;
}

=head1 NAME

ORF_BLAST1.pl

=head1 SYNOPSIS

perl ORF_BLAST1.pl [--options] NM_007327

=head1 DESCRIPTION

This program will take a sequence file as input, and generate a graphical
output of its ORF architecture in 3 frames, plotting ORFs, start codons (ATG)
and stop codons. It will BLAST the translation of each ORF against NCBI, and
color the ORFs in shades of yellow to black, depending on the e-value returned
for that ORF.

INPUT:

Input can be a list of space separated accession numbers on the command line,
or a fasta file.

OUTPUT:

Output is a figure saved as either a png or jpg file to the current 
directory.

OPTIONS:

Several options can be specified, but all are optional.

--in_file filename
Use to set the input file name. The file that contains the input
sequences in fasta format.
--out_file filename
Use to set the output file name. Defaults to input file name, then
Bioperl's display name (usually the accession number), then Bioperl's
accession number (usually the gi number).
--min_length integer
Use to set the minimum ORF size that will be plotted in the figure.
--start
Use to require plotted ORFs to begin with a start
--format
Use to set the output format. Valid values are png or jpg (or jpeg).
Defaults to jpg.

=head1 USING

perl ORF_BLAST1.pl --min_length 300 --start --format png NM_001112 
NM_007327 NM_015833

or

perl ORF_BLAST1.pl --in_file sequence.fasta --out_file image_file 
--min_length 300 --start

=head1 REQUIRES

GD
Getopt::Long
Bio::SeqIO
Bio::PrimarySeq
Bio::DB::GenBank
Bio::Tools::Run::RemoteBlast

=head1 AUTHOR

Barry Moore
Department of Human Genetics
University of Utah
Salt Lake City, UT 84112
USA

Address bug reports and comments to: barry.moore@genetics.utah.edu

=head1 BUGS

Currently after about 50 ORFs BLASTed, NCBI starts to return time-out 
errors.

=head1 FUTURE DIRECTIONS

Add command line options for the BLAST parameters.

=head1 COPYRIGHT

Copyright 2003, Barry Moore. All rights reserved.

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.

=head1 SEE ALSO


=cut


From hcle028 at cse.unsw.edu.au  Sun Feb  8 22:45:22 2004
From: hcle028 at cse.unsw.edu.au (Hong Ching Lee)
Date: Sun Feb  8 22:51:25 2004
Subject: [Bioperl-l] Problems running remote blast
Message-ID: <Pine.LNX.4.58.0402091440090.8273@weill.orchestra.cse.unsw.EDU.AU>

Hi everyone,

I have a question about running remote blast. The scenario is that I have
a DNA sequence stored as a string in $seq. What I'd like to do is to
submit it to blastn, then retrieve the result and put it into html format
without processing it.

I've noticed the existence of modules like Bio::Tools::Blast::HTML, but
I'm not sure about how I should use them, if I should use them at all.

Thank You,
Hong

PS: Thank You for answering my previous message
From Richard.Adams at ed.ac.uk  Mon Feb  9 03:46:55 2004
From: Richard.Adams at ed.ac.uk (Richard Adams)
Date: Mon Feb  9 03:52:59 2004
Subject: [Bioperl-l] Remote BLAST returning lots of 500 time out errors
Message-ID: <402748FF.2030406@ed.ac.uk>

Hi Barry,
There's certainly nothing wrong with your code; there seems to be a 
problem with the way the RIDs are stored in temporary files. I get 
the same problems with my code as well, and will look into it.

Richard

-- 
Dr Richard Adams
Bioinformatician,
Psychiatric Genetics Group,
Medical Genetics,
Molecular Medicine Centre,
Western General Hospital,
Crewe Rd West,
Edinburgh UK
EH4 2XU

Tel: 44 131 651 1084
richard.adams@ed.ac.uk


From Richard.Adams at ed.ac.uk  Mon Feb  9 08:48:56 2004
From: Richard.Adams at ed.ac.uk (Richard Adams)
Date: Mon Feb  9 08:55:01 2004
Subject: [Bioperl-l] Remote BLAST returning lots of 500 time out errors
Message-ID: <40278FC8.3050903@ed.ac.uk>

Hi Barry,
I've changed the RemoteBlast module so that it no longer uses temporary 
files - version 1.19 in CVS.
This is just a temporary solution as I'm not sure why making temporary 
files is causing this problem just now.
But it should work OK with everything held in memory, unless you are making vast 
Blast output files.
You don't have to change your script at all. It now appears to run OK.

Just curious, why don't you just submit the RNA sequence and use  
Blastx? This translates all your sequences and means you only have to submit
1/6th as many sequences to the server.
Cheers
Richard

-- 
Dr Richard Adams
Bioinformatician,
Psychiatric Genetics Group,
Medical Genetics,
Molecular Medicine Centre,
Western General Hospital,
Crewe Rd West,
Edinburgh UK
EH4 2XU

Tel: 44 131 651 1084
richard.adams@ed.ac.uk


From barry.moore at genetics.utah.edu  Mon Feb  9 09:44:39 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Mon Feb  9 09:50:48 2004
Subject: [Bioperl-l] Remote BLAST returning lots of 500 time out errors
In-Reply-To: <40278FC8.3050903@ed.ac.uk>
References: <40278FC8.3050903@ed.ac.uk>
Message-ID: <40279CD7.3070904@genetics.utah.edu>

Richard-

Thanks much for the help, and the solution.  I use this program to help 
look for trans-frame proteins - that is proteins that require a 
frameshift for expression.  With that in mind, this program BLASTs the 
translation of every ORF, and plots that ORF in shades of yellow 
(representing its e-value) on a 3 frame plot of the transcript.  It may 
be that blastx would work, and I could just map the location of 
significant HSPs onto my plot.  When I started working on the program I 
tried translating the entire transcript (stops and all) in 3 frames, and 
BLASTing the 3 frames.  I noticed that I wouldn't get HSPs to some small 
ORFs that I could get by BLASTing those ORFs individually.  Because of 
that and because at the time it seemed simpler to keep track of and plot 
the results if each ORF was handled separately, I went that way.  In 
retrospect now that I've seen how long it can take to BLAST 26 small 
ORFs I think it would be a good idea to go back and check more carefully 
if I can achieve the same results with blastx.  It may be that by 
tweaking the parameters to BLAST, I can see hits to all the small ORFs 
on the transcript.  Thanks again for your help, and for the suggestions.
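
As a rough illustration of the per-frame ORF bookkeeping described above, here is a minimal core-Perl sketch that splits each of the three forward frames at in-frame stop codons. It is not taken from ORF_BLAST1.pl: find_orfs() is a hypothetical helper, coordinates are 0-based with an exclusive end, and the short test sequence is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from ORF_BLAST1.pl.
# Returns stop-delimited open stretches as hashes with a 0-based begin,
# an exclusive end, and a frame (1..3); the stop codon itself is excluded.
sub find_orfs {
    my ($sequence, $min_length) = @_;
    my @orfs;
    my $length = length $sequence;
    for my $offset (0 .. 2) {
        my $begin = $offset;
        for (my $pos = $offset; $pos + 3 <= $length; $pos += 3) {
            my $codon = uc substr($sequence, $pos, 3);
            next unless $codon =~ /^(?:TGA|TAG|TAA)$/;
            push @orfs, { begin => $begin, end => $pos, frame => $offset + 1 }
                if $pos - $begin >= $min_length;
            $begin = $pos + 3;    # next stretch starts after the stop codon
        }
        # Trailing stretch with no stop codon: keep whole codons only.
        my $tail_end = $length - (($length - $offset) % 3);
        push @orfs, { begin => $begin, end => $tail_end, frame => $offset + 1 }
            if $tail_end - $begin >= $min_length;
    }
    return @orfs;
}

my @orfs = find_orfs('ATGAAACCCTAGGGGTTT', 6);
printf "frame %d: %d..%d\n", @{$_}{qw(frame begin end)} for @orfs;
```

Each returned hash carries the frame alongside the coordinates, so a hit found for one ORF can be drawn on the matching track of a three-frame plot.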

Barry

Richard Adams wrote:

> Hi Barry,
> I've changed the RemoteBlast module so that it no longer uses 
> temporary files - version 1.19 in CVS.
> This is just a temporary solution as I'm not sure why making temporary 
> files is causing this problem just now.
> But it should work OK all being in memory, unless you are making vast 
> Blast outqput  files.
> You don't have to change your script at all. It now appears to be run OK.
>
> Just curious, why don't you just submit the RNA sequence and use  
> Blastx? This translates all your sequences and means you only have to 
> submit
> 1/6th as many sequences to the server..
> Cheers
> Richard
>


From Richard.Adams at ed.ac.uk  Tue Feb 10 03:31:56 2004
From: Richard.Adams at ed.ac.uk (Richard Adams)
Date: Tue Feb 10 03:37:56 2004
Subject: [Bioperl-l] Problems running remote blast
Message-ID: <402896FC.8090607@ed.ac.uk>

Hi,
First of all make a sequence object:

my $seqobj = Bio::PrimarySeq->new(-seq        => $your_Sequence_String,
                                  -display_id => 'whatever');

Then use the synopsis from Bio::Tools::Run::RemoteBlast which is a 
minimal remote blasting module (changing parameters as appropriate).

The result that is returned in $result can be written in HTML:

my $result = $rc->next_result();

my $writerhtml = new Bio::SearchIO::Writer::HTMLResultWriter();
my $outhtml = new Bio::SearchIO(-writer => $writerhtml,
                                -file   => ">searchio.html");
# get a result from Bio::SearchIO parsing or build it up in memory
$outhtml->write_result($result);

The SearchIO modules are clearly explained in a HOWTO document if you 
want to know more.

Cheers
Richard

-- 
Dr Richard Adams
Bioinformatician,
Psychiatric Genetics Group,
Medical Genetics,
Molecular Medicine Centre,
Western General Hospital,
Crewe Rd West,
Edinburgh UK
EH4 2XU

Tel: 44 131 651 1084
richard.adams@ed.ac.uk


From sdavis2 at mail.nih.gov  Tue Feb 10 12:02:45 2004
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Tue Feb 10 12:04:50 2004
Subject: [Bioperl-l] Building biosql database errors
Message-ID: <BC4E78E5.45C3%sdavis2@mail.nih.gov>

I am trying to build a biosql database on a mysql database.  I have mysql
and biosql schema running and can successfully load some data, but for a
proportion of the data when loading ontology or locuslink, I get the
following (many times).  Am I doing something wrong, or is this to be
expected?  I would just push on with --safe (as given below), but clearly
part of the data is not loaded correctly after looking at the result.  I
have the same problem when loading locuslink.  Any input is appreciated.

Thanks,
Sean


% perl ../../../bioperl-db/scripts/biosql/load_ontology.pl --safe --fmtargs
"-defs_file,GO.defs" --dbuser sdavis --dbpass mic2222 --namespace "Gene
Ontology" --format goflat component.ontology.2004-02-01
process.ontology.2004-02-01 function.ontology.2004-02-01

-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were
("GO:0042597","periplasmic space","The region between the inner
(cytoplasmic) and outer membrane (Gram-negative bacteria) or inner membrane
and cell wall (fungi).","") FKs (84)
Duplicate entry 'periplasmic space-84' for key 2
---------------------------------------------------
Could not store term relationship (periplasmic space (sensu
Fungi),IS_A,periplasmic space):

------------- EXCEPTION  -------------
MSG: create: object (Bio::Ontology::GOterm) failed to insert or to be found
by unique key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
/Library/Perl/5.8.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207
STACK Bio::DB::Persistent::PersistentObject::create
/Library/Perl/5.8.1/Bio/DB/Persistent/PersistentObject.pm:243
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
/Library/Perl/5.8.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:170
STACK Bio::DB::Persistent::PersistentObject::create
/Library/Perl/5.8.1/Bio/DB/Persistent/PersistentObject.pm:243
STACK (eval) ../../../bioperl-db/scripts/biosql/load_ontology.pl:548
STACK toplevel ../../../bioperl-db/scripts/biosql/load_ontology.pl:547

--------------------------------------

From awitney at sghms.ac.uk  Tue Feb 10 13:07:52 2004
From: awitney at sghms.ac.uk (Adam Witney)
Date: Tue Feb 10 13:15:00 2004
Subject: [Bioperl-l] Bioperl-db make test failures
Message-ID: <BC4ECE78.2D7E2%awitney@sghms.ac.uk>

Hi,

I am trying out bioperl-db and biosql. I downloaded both from CVS and
installed the biosql schema ok. However I have some test failures with
bioperl-db:

t/cluster.......ok 5/160Use of uninitialized value in join or string at
blib/lib/Bio/DB/BioSQL/BaseDriver.pm line 1835, <GEN0> line 1.

-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::BiosequenceAdaptor (driver) failed, values
were ("","0","dna","") FKs (2)
ERROR:  invalid input syntax for integer: ""

... And

t/species.......ok 68/65
------------- EXCEPTION  -------------
MSG: create: object (Bio::Species) failed to insert or to be found by unique
key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207
STACK Bio::DB::Persistent::PersistentObject::create
blib/lib/Bio/DB/Persistent/PersistentObject.pm:243
STACK toplevel t/species.t:76

Are these known problems or have I missed something?

Thanks

Adam


-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.


From andreas.bernauer at gmx.de  Tue Feb 10 15:35:09 2004
From: andreas.bernauer at gmx.de (Andreas Bernauer)
Date: Tue Feb 10 15:41:11 2004
Subject: [Bioperl-l] Building biosql database errors
In-Reply-To: <BC4E78E5.45C3%sdavis2@mail.nih.gov>
References: <BC4E78E5.45C3%sdavis2@mail.nih.gov>
Message-ID: <20040210203509.GI369@hgt.mcb.uconn.edu>

Sean Davis wrote:
> I am trying to build a biosql database on a mysql database.  I have mysql
> and biosql schema running and can successfully load some data, but for a
> proportion of the data when loading ontology or locuslink, I get the
> following (many times).  Am I doing something wrong, or is this to be
> expected?  

> -------------------- WARNING ---------------------
> MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were
> ("GO:0042597","periplasmic space","The region between the inner
> (cytoplasmic) and outer membrane (Gram-negative bacteria) or inner membrane
> and cell wall (fungi).","") FKs (84)
> Duplicate entry 'periplasmic space-84' for key 2
   ^
   |
   +-------------------------+
                             |
I don't know, but maybe this + means something?  I guess, the db
can't handle duplicate keys.

Andreas.
From sdavis2 at mail.nih.gov  Tue Feb 10 18:21:58 2004
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Tue Feb 10 18:23:49 2004
Subject: GO identifiers being mis-parsed?  Was Re: [Bioperl-l] Building
	biosql database errors
Message-ID: <BC4ED1C6.4723%sdavis2@mail.nih.gov>

I thank Andreas for pointing out the obvious and the email list points this
out as an ongoing problem, solved for the meantime by installing GO with
--nodelete.  However, there was another set of errors that remained after
fixing this and seemingly related to terms such as:

UM-BBD-pathwayID:van

When I changed the ':' to '-' globally in the GO flat files, I got rid of
the exceptions (Any ideas?).  I continued to have issues with installing,
though, in that the name column often contains the GO:xxxxxxx number rather
than the name.  I assume that the identifier column is supposed to contain
the GO:xxxxxxx numbers, instead.  It seems that any values in the flat file
that contain a ':' are being treated as GO identifiers, as when I change all
things like "metacyc:xxxx" or "EC:xxxx" to use '-' instead of ':', the
output is as expected.  I don't know enough about methods code to find where
that parsing occurs, but just wanted to bring it up as an issue for me.  Is
this a problem specific to me or have others found similar issues?
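
To make the symptom concrete, here is a minimal sketch of the stricter check I would have expected: only a token of the form GO: followed by seven digits counts as a GO identifier, so cross-references such as "EC:xxxx" are left alone. This is an illustration only. is_go_identifier() is a hypothetical helper and is not the parser's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper, not the Bioperl parser's actual code.
# Treats only "GO:" followed by exactly seven digits as a GO identifier.
sub is_go_identifier {
    my ($token) = @_;
    return $token =~ /\AGO:\d{7}\z/ ? 1 : 0;
}

for my $token ('GO:0042597', 'UM-BBD-pathwayID:van', 'EC:xxxx', 'metacyc:xxxx') {
    printf "%-22s %s\n", $token,
        is_go_identifier($token) ? 'GO identifier' : 'other';
}
```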

I am using bioperl-1.4 with bioperl-db installed from cvs this morning on
macos 10.3.2. 

Sean

On 2/10/04 3:35 PM, "Andreas Bernauer" <andreas.bernauer@gmx.de> wrote:

> Sean Davis wrote:
>> I am trying to build a biosql database on a mysql database.  I have mysql
>> and biosql schema running and can successfully load some data, but for a
>> proportion of the data when loading ontology or locuslink, I get the
>> following (many times).  Am I doing something wrong, or is this to be
>> expected?
>
>> -------------------- WARNING ---------------------
>> MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were
>> ("GO:0042597","periplasmic space","The region between the inner
>> (cytoplasmic) and outer membrane (Gram-negative bacteria) or inner membrane
>> and cell wall (fungi).","") FKs (84)
>> Duplicate entry 'periplasmic space-84' for key 2
>    ^
>    |
>    +-------------------------+
>                              |
> I don't know, but maybe this + means something?  I guess, the db
> can't handle duplicate keys.
>
> Andreas.
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>


From hlapp at gmx.net  Wed Feb 11 04:14:18 2004
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed Feb 11 04:20:19 2004
Subject: GO identifiers being mis-parsed? Was Re: [Bioperl-l] Building
	biosql database errors
In-Reply-To: <BC4ED1C6.4723%sdavis2@mail.nih.gov>
Message-ID: <ABEDD356-5C72-11D8-ACEB-000A959EB4C4@gmx.net>


On Tuesday, February 10, 2004, at 03:21  PM, Sean Davis wrote:

> UM-BBD-pathwayID:van
>
> When I changed the ':' to '-' globally in the GO flat files, I got rid 
> of
> the exceptions (Any ideas?).  I continued to have issues with 
> installing,
> though, in that the name column often contains the GO:xxxxxxx number 
> rather
> than the name.

A bug was introduced into the GO parser (dagflat in fact) that causes 
this. I fixed it in the main trunk a week or two ago, but haven't yet 
migrated the fix to the branch. Will do that too.

	-hilmar

-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From hlapp at gnf.org  Wed Feb 11 14:15:38 2004
From: hlapp at gnf.org (Hilmar Lapp)
Date: Wed Feb 11 14:21:40 2004
Subject: [Bioperl-l] Building biosql database errors
In-Reply-To: <BC4E78E5.45C3%sdavis2@mail.nih.gov>
Message-ID: <AD1DE412-5CC6-11D8-BA20-000A959EB4C4@gnf.org>

This could have multiple reasons. Generally speaking, in an ideal world 
the unique key constraint is on the tuple (name, ontology), and given 
the error message this seems to be the constraint you're violating, 
because

	- you may have loaded GO before and did not specify --lookup or 
similar options that let the script deal with pre-existing content to 
be updated according to one of different policies (check out the 
load_ontology.pl POD for a more elaborate discussion of update options)

	- the term that appears to violate the constraint exists also as an 
obsoleted term with the same name, but a different GO identifier, and 
you did not choose to ignore and delete obsoleted terms

The latter is particularly nasty and a reflection of the fact that the 
world we live in is not ideal. If you are going to always purge 
existing terms from the database and then reload GO then you can keep 
the unique key constraint the way it is, and just need to make sure 
that this strategy is reflected in the options (--noobsolete). The 
downside of doing so (and of using --delobsolete, too) is that deleting 
the terms will remove their associations to bioentries and features as 
well, i.e., if any bioentry or feature was annotated with either any GO 
term (if reload from scratch) or a GO term that is being obsoleted (if 
using --delobsolete) then obviously you lose that annotation when 
deleting the term(s). If you'll reload those associations right 
afterwards, then there's no problem with this.

Alternatively, if you want to keep GO in the database and then update 
it with a new release, then apart from choosing what to do with terms 
that are obsolete (see the load_ontology.pl POD for the choices you 
have) you need to change the unique key constraint to the tuple of 
(name, ontology, is_obsolete). This should be a commented-out option in 
the schema DDL file.
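
To see why the wider key helps, here is a small Perl sketch that only
simulates the two constraints with a hash. It does not touch the database or
the schema DDL; the rows are made up, and the second identifier is a
placeholder standing in for an obsoleted term that shares its name with a
current one.

```perl
use strict;
use warnings;

# Made-up rows: (identifier, name, ontology_id, is_obsolete).
my @terms = (
    [ 'GO:0042597', 'periplasmic space', 84, 0 ],
    [ 'GO:0000000', 'periplasmic space', 84, 1 ],   # placeholder identifier
);

# Return the identifiers that a unique key built from the given column
# indices would reject as duplicates.
sub rejected_by_key {
    my ($cols, @rows) = @_;
    my (%seen, @rejected);
    for my $row (@rows) {
        my $key = join "\t", @{$row}[@$cols];
        push @rejected, $row->[0] if $seen{$key}++;
    }
    return @rejected;
}

my @two_col   = rejected_by_key([1, 2],    @terms);   # (name, ontology)
my @three_col = rejected_by_key([1, 2, 3], @terms);   # (name, ontology, is_obsolete)

print "(name, ontology) rejects: @two_col\n";
print "(name, ontology, is_obsolete) rejects: ", scalar(@three_col), " rows\n";
```

With the two-column key the second row collides with the first, which mirrors
the "Duplicate entry 'periplasmic space-84'" error; with is_obsolete in the
key both rows can coexist.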

Hth,

	-hilmar

On Tuesday, February 10, 2004, at 09:02  AM, Sean Davis wrote:

> I am trying to build a biosql database on a mysql database.  I have 
> mysql
> and biosql schema running and can successfully load some data, but for 
> a
> proportion of the data when loading ontology or locuslink, I get the
> following (many times).  Am I doing something wrong, or is this to be
> expected?  I would just push on with --safe (as given below), but 
> clearly
> part of the data is not loaded correctly after looking at the result.  
> I
> have the same problem when loading locuslink.  Any input is 
> appreciated.
>
> Thanks,
> Sean
>
>
> % perl ../../../bioperl-db/scripts/biosql/load_ontology.pl --safe 
> --fmtargs
> "-defs_file,GO.defs" --dbuser sdavis --dbpass mic2222 --namespace "Gene
> Ontology" --format goflat component.ontology.2004-02-01
> process.ontology.2004-02-01 function.ontology.2004-02-01
>
> -------------------- WARNING ---------------------
> MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values 
> were
> ("GO:0042597","periplasmic space","The region between the inner
> (cytoplasmic) and outer membrane (Gram-negative bacteria) or inner 
> membrane
> and cell wall (fungi).","") FKs (84)
> Duplicate entry 'periplasmic space-84' for key 2
> ---------------------------------------------------
> Could not store term relationship (periplasmic space (sensu
> Fungi),IS_A,periplasmic space):
>
> ------------- EXCEPTION  -------------
> MSG: create: object (Bio::Ontology::GOterm) failed to insert or to be 
> found
> by unique key
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
> /Library/Perl/5.8.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207
> STACK Bio::DB::Persistent::PersistentObject::create
> /Library/Perl/5.8.1/Bio/DB/Persistent/PersistentObject.pm:243
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
> /Library/Perl/5.8.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:170
> STACK Bio::DB::Persistent::PersistentObject::create
> /Library/Perl/5.8.1/Bio/DB/Persistent/PersistentObject.pm:243
> STACK (eval) ../../../bioperl-db/scripts/biosql/load_ontology.pl:548
> STACK toplevel ../../../bioperl-db/scripts/biosql/load_ontology.pl:547
>
> --------------------------------------
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From hlapp at gnf.org  Wed Feb 11 14:18:15 2004
From: hlapp at gnf.org (Hilmar Lapp)
Date: Wed Feb 11 14:24:16 2004
Subject: [Bioperl-l] Bioperl-db make test failures
In-Reply-To: <BC4ECE78.2D7E2%awitney@sghms.ac.uk>
Message-ID: <0AA63EAA-5CC7-11D8-BA20-000A959EB4C4@gnf.org>

This is strange, actually both of them. Did you run the tests against a 
database with content loaded prior to the tests, or was it a freshly 
created instance of the schema?

If the latter, which RDBMS, and version of bioperl are you using?

	-hilmar

On Tuesday, February 10, 2004, at 10:07  AM, Adam Witney wrote:

> Hi,
>
> I am trying out bioperl-db and biosql. I downloaded both from CVS and
> installed the biosql schema ok. However I have some test failures with
> bioperl-db:
>
> t/cluster.......ok 5/160Use of uninitialized value in join or string at
> blib/lib/Bio/DB/BioSQL/BaseDriver.pm line 1835, <GEN0> line 1.
>
> -------------------- WARNING ---------------------
> MSG: insert in Bio::DB::BioSQL::BiosequenceAdaptor (driver) failed, 
> values
> were ("","0","dna","") FKs (2)
> ERROR:  invalid input syntax for integer: ""
>
> ... And
>
> t/species.......ok 68/65
> ------------- EXCEPTION  -------------
> MSG: create: object (Bio::Species) failed to insert or to be found by 
> unique
> key
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
> blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207
> STACK Bio::DB::Persistent::PersistentObject::create
> blib/lib/Bio/DB/Persistent/PersistentObject.pm:243
> STACK toplevel t/species.t:76
>
> Are these known problems or have I missed something?
>
> Thanks
>
> Adam
>
>
> -- 
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From awitney at sghms.ac.uk  Wed Feb 11 14:39:27 2004
From: awitney at sghms.ac.uk (Adam Witney)
Date: Wed Feb 11 14:46:24 2004
Subject: [Bioperl-l] Bioperl-db make test failures
In-Reply-To: <0AA63EAA-5CC7-11D8-BA20-000A959EB4C4@gnf.org>
Message-ID: <BC50356F.2D921%awitney@sghms.ac.uk>

On 11/2/04 7:18 pm, "Hilmar Lapp" <hlapp@gnf.org> wrote:

> This is strange, actually both of them. Did you run the tests against a
> database with content loaded prior to the tests, or was it a freshly
> created instance of the schema?
> 
> If the latter, which RDBMS, and version of bioperl are you using?

I had created the database and had only run load_ncbi_taxonomy.pl to download
the taxonomy database from NCBI.

Both biosql-schema and bioperl-db were downloaded from CVS yesterday. The
RDBMS is PostgreSQL 7.4.1.

Thanks

adam



From hlapp at gnf.org  Wed Feb 11 15:00:30 2004
From: hlapp at gnf.org (Hilmar Lapp)
Date: Wed Feb 11 15:06:31 2004
Subject: [Bioperl-l] Bioperl-db make test failures
In-Reply-To: <BC50356F.2D921%awitney@sghms.ac.uk>
Message-ID: <F1B95B68-5CCC-11D8-BA20-000A959EB4C4@gnf.org>

That explains the species test failure. It tests, among other things, 
whether it can successfully insert a species. As it is not a made up 
taxon, it'll fail if you pre-loaded the ncbi taxon database.

Generally, I recommend creating a test schema for test scripts that's 
separate from the instance you use for production or anything else that 
you don't want to throw away in an instant and not be sorry.

  	-hilmar

On Wednesday, February 11, 2004, at 11:39  AM, Adam Witney wrote:

> On 11/2/04 7:18 pm, "Hilmar Lapp" <hlapp@gnf.org> wrote:
>
>> This is strange, actually both of them. Did you run the tests against 
>> a
>> database with content loaded prior to the tests, or was it a freshly
>> created instance of the schema?
>>
>> If the latter, which RDBMS, and version of bioperl are you using?
>
> I had created the database and only run load_ncbi_taxonomy.pl to 
> download
> the taxonomy database from NCBI
>
> Both biosql-schema and bioperl-db were downloaded from CVS yesterday. 
> RDBMS
> is PostgreSQL 7.4.1
>
> Thanks
>
> adam
>
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From awitney at sghms.ac.uk  Wed Feb 11 15:07:01 2004
From: awitney at sghms.ac.uk (Adam Witney)
Date: Wed Feb 11 15:13:57 2004
Subject: [Bioperl-l] Bioperl-db make test failures
In-Reply-To: <F1B95B68-5CCC-11D8-BA20-000A959EB4C4@gnf.org>
Message-ID: <BC503BE5.2D92B%awitney@sghms.ac.uk>


I installed the bioperl-db module anyway, but when I tried to load a GenBank
file into the database, the same species failure came up... Should I not
load the ncbi taxon database?

Thanks

adam

> That explains the species test failure. It tests, among other things,
> whether it can successfully insert a species. As it is not a made up
> taxon, it'll fail if you pre-loaded the ncbi taxon database.
> 
> Generally, I recommend creating a test schema for test scripts that's
> separate from the instance you use for production or anything else that
> you don't want to throw away in an instant and not be sorry.
> 
> -hilmar
> 
> On Wednesday, February 11, 2004, at 11:39  AM, Adam Witney wrote:
> 
>> On 11/2/04 7:18 pm, "Hilmar Lapp" <hlapp@gnf.org> wrote:
>> 
>>> This is strange, actually both of them. Did you run the tests against
>>> a
>>> database with content loaded prior to the tests, or was it a freshly
>>> created instance of the schema?
>>> 
>>> If the latter, which RDBMS, and version of bioperl are you using?
>> 
>> I had created the database and only run load_ncbi_taxonomy.pl to
>> download
>> the taxonomy database from NCBI
>> 
>> Both biosql-schema and bioperl-db were downloaded from CVS yesterday.
>> RDBMS
>> is PostgreSQL 7.4.1
>> 
>> Thanks
>> 
>> adam
>> 
>> 



From Annie.Law at nrc-cnrc.gc.ca  Thu Feb 12 12:46:56 2004
From: Annie.Law at nrc-cnrc.gc.ca (Law, Annie)
Date: Thu Feb 12 12:52:57 2004
Subject: [Bioperl-l] Locuslink parser
Message-ID: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca>

Hi,

I would appreciate help with the following.  I have searched for questions
on the locuslink parser but have not found answers to mine. I am
trying to understand how to use the locuslink parser. I am most interested
in obtaining the fields locuslink id, GO id, accession number, and
unigene id. However, when I use the following code,
I am only able to get information for the fields:
ALIAS_PROT, ALIAS_SYMBOL, CDD, CHR, CURRENT_LOCUSID, ECNUM, EXTANNOT, MAP,
NC, NR, OFFICIAL_GENE_NAME, OFFICIAL_SYMBOL, PHENOTYPE,
PREFERRED_GENE_NAME, PREFERRED_PRODUCT, PREFERRED_SYMBOL, PRODUCT

Ideally, I would like to be able to obtain all of the
information available from the LL_tmpl file from LocusLink, meaning all of
the fields.  How do I access the fields I want after the parser has done its
work? 
Thanks,
Annie.

			 ACCNUM
		       ALIAS_PROT
		       ALIAS_SYMBOL
		       ASSEMBLY
		       BUTTON
		       CDD
		       CHR
		       COMP
		       CONTIG
		       CURRENT_LOCUSID
		       DB_DESCR
		       DB_LINK
		       ECNUM
		       EVID
		       EXTANNOT
		       GO
		       GRIF
		       LINK
		       LOCUSID
		       LOCUS_CONFIRMED
		       LOCUS_TYPE
		       MAP
		       MAPLINK
		       NC
		       NG
		       NM
		       NP
		       NR
		       OFFICIAL_GENE_NAME
		       OFFICIAL_SYMBOL
		       OMIM
		       ORGANISM
		       PHENOTYPE
		       PHENOTYPE_ID
		       PMID
		       PREFERRED_GENE_NAME
		       PREFERRED_PRODUCT
		       PREFERRED_SYMBOL
		       PRODUCT
		       PROT
		       RELL
		       STATUS
		       STS
		       SUMFUNC
		       SUMMARY
		       TRANSVAR
		       TYPE
		       UNIGENE
		       XG
		       XM
		       XP
		       XR


use strict;
use warnings;
use Bio::SeqIO;

my $io = Bio::SeqIO->new(-file   => '/var/lib/mysql/LL_tmpl',
                         -format => 'locuslink');

# cycle through all of the entries
while (my $seq_obj = $io->next_seq()) {
    my $anno_collection = $seq_obj->annotation;

    foreach my $key ($anno_collection->get_all_annotation_keys) {
        my @annotations = $anno_collection->get_Annotations($key);
        foreach my $value (@annotations) {
            print "tagname: ", $value->tagname, "\n";
            # $value is a Bio::AnnotationI object and has an "as_text" method
            print " annotation value: ", $value->as_text, "\n";
        }
    }
}


From dfclark at neo.tamu.edu  Thu Feb 12 13:16:45 2004
From: dfclark at neo.tamu.edu (David Clark)
Date: Thu Feb 12 13:22:16 2004
Subject: [Bioperl-l] Fasta Genome Splice
In-Reply-To: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca>
References: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca>
Message-ID: <9DCDC030-5D87-11D8-B24E-000A95E8AC0C@neo.tamu.edu>

Hello,

I'm a relative newcomer to bioperl, and would like a point in the right 
direction.  I need to separate the yeast genome into two partial 
genomes--one with all ORF's, and one with everything else.  I have a 
tab delimited list of the ORF's with the coordinates, and can probably 
parse that myself, but I wanted to see if anyone could point me to some 
example code, or give me some place to start in separating genomes 
based on the coordinates.

Thanks,

David Clark
dfclark@neo.tamu.edu

From pm66 at nyu.edu  Thu Feb 12 13:30:44 2004
From: pm66 at nyu.edu (Philip MacMenamin)
Date: Thu Feb 12 13:37:50 2004
Subject: [Bioperl-l] Strangeness in bioperl-1.4::Bio::DB::GFF::Segment...?
Message-ID: <200402121832.i1CIW42N006581@mx6.nyu.edu>

Hi, 

So I have WS118 SQLdb running here, and I run 
select * from fattribute where gname = 'AH6.5';
and I get some stuff returned.

So, if I run (via perl)
my $panelSeg = $db->segment('AH6');
I get stuff returned (ie all of the AH's, like I'd expect). 

However, if I run:
my $panelSeg = $db->segment('AH6.5');
I get nothing returned. 

This seems odd to me...

I am of course doomed to figure out why as soon as I post this though :)

Philip

From lstein at cshl.edu  Thu Feb 12 13:39:37 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Thu Feb 12 13:45:52 2004
Subject: [Bioperl-l] Strangeness in bioperl-1.4::Bio::DB::GFF::Segment...?
In-Reply-To: <200402121832.i1CIW42N006581@mx6.nyu.edu>
References: <200402121832.i1CIW42N006581@mx6.nyu.edu>
Message-ID: <200402122039.37363.lstein@cshl.edu>

The class of the gene sequence has changed.  You'll have to get it 
this way:

	$panelSeg= $db->segment(CDS => 'AH6.5')

Lincoln

On Thursday 12 February 2004 08:30 pm, Philip MacMenamin wrote:
> Hi,
>
> So I have WS118 SQLdb running here, and I run
> select * from fattribute where gname = 'AH6.5';
> and I get some stuff returned.
>
> So, if I run (via perl)
> my $panelSeg = $db->segment('AH6');
> I get stuff returned (ie all of the AH's, like I'd expect).
>
> However, if I run:
> my $panelSeg = $db->segment('AH6.5');
> I get nothing returned.
>
> This seems odd to me...
>
> I am of course doomed to figure out why as soon as I post this
> though :)
>
> Philip
>

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
From hlapp at gnf.org  Thu Feb 12 14:09:51 2004
From: hlapp at gnf.org (Hilmar Lapp)
Date: Thu Feb 12 14:15:51 2004
Subject: [Bioperl-l] Locuslink parser
In-Reply-To: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca>
Message-ID: <08A1CF46-5D8F-11D8-BD29-000A959EB4C4@gnf.org>


On Thursday, February 12, 2004, at 09:46  AM, Law, Annie wrote:

> I am most interested in obtaining the fields  locuslink id, GO id, 
> accession number, unigene id.

The locuslink ID is the $seq->accession_number. GO should be there as 
term annotations, unigene ID and other accessions should be present as 
dbxref annotations.

You can test for an annotation being a term annotation or a dbxref:

	foreach my $ann (@annotations) {
		if ($ann->isa("Bio::Ontology::TermI")) {
			# this is an ontology term as annotation
		}
		if ($ann->isa("Bio::Annotation::DBLink")) {
			# this is a dbxref annotation
		}
	}

Using the grep function you can easily filter for annotation types, for 
example:

	@term_annotations = grep { $_->isa("Bio::Ontology::TermI") }
	                    $seq->annotation->get_Annotations();

BTW if you want to get all annotations from a seq object, you can just 
say $seq->annotation->get_Annotations() and omit the key.
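
As a rough illustration of filtering a list of objects by class, the sketch
below uses two stand-in packages (My::Term and My::DBLink, both made up here)
so that it runs without bioperl installed. In Perl, grep keeps the elements
for which the block is true, while map returns the block's value for every
element; inside either block, $_ is the element currently being examined.

```perl
use strict;
use warnings;

# Stand-in classes; they only provide something for isa() to inspect.
{ package My::Term;   sub new { return bless {}, shift } }
{ package My::DBLink; sub new { return bless {}, shift } }

my @annotations = (My::Term->new, My::DBLink->new, My::Term->new);

# grep keeps the elements for which the block is true ...
my @terms = grep { $_->isa('My::Term') } @annotations;

# ... whereas map returns the block's value for every element.
my @flags = map { $_->isa('My::Term') ? 1 : 0 } @annotations;

print scalar(@terms), " term objects\n";
print "flags: @flags\n";
```

With bioperl the same pattern would apply to the list returned by
$seq->annotation->get_Annotations(), testing against
"Bio::Ontology::TermI" or "Bio::Annotation::DBLink".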

Hth,

	-hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From jason at cgt.duhs.duke.edu  Thu Feb 12 14:19:05 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Thu Feb 12 14:25:11 2004
Subject: [Bioperl-l] Fasta Genome Splice
In-Reply-To: <9DCDC030-5D87-11D8-B24E-000A95E8AC0C@neo.tamu.edu>
References: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca>
	<9DCDC030-5D87-11D8-B24E-000A95E8AC0C@neo.tamu.edu>
Message-ID: <Pine.LNX.4.50.0402121410410.18208-100000@tenero.duhs.duke.edu>

You want these as a fasta file per orf and per non-orf region or just 2
datasets with the genome masked (all N's or lowercased)?

-jason
On Thu, 12 Feb 2004, David Clark wrote:

> Hello,
>
> I'm a relative newcomer to bioperl, and would like a point in the right
> direction.  I need to separate the yeast genome into two partial
> genomes--one with all ORF's, and one with everything else.  I have a
> tab delimited list of the ORF's with the coordinates, and can probably
> parse that myself, but I wanted to see if anyone could point me to some
> example code, or give me some place to start in separating genomes
> based on the coordinates.
>
> Thanks,
>
> David Clark
> dfclark@neo.tamu.edu
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From dfclark at neo.tamu.edu  Thu Feb 12 14:59:20 2004
From: dfclark at neo.tamu.edu (David Clark)
Date: Thu Feb 12 15:05:32 2004
Subject: [Bioperl-l] Fasta Genome Splice
In-Reply-To: <Pine.LNX.4.50.0402121410410.18208-100000@tenero.duhs.duke.edu>
References: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca>
	<9DCDC030-5D87-11D8-B24E-000A95E8AC0C@neo.tamu.edu>
	<Pine.LNX.4.50.0402121410410.18208-100000@tenero.duhs.duke.edu>
Message-ID: <F28DEF29-5D95-11D8-8442-0030657E637C@neo.tamu.edu>

Good point.  What I need is two fasta files: one with the ORF regions 
masked, and one with the non-ORF regions masked.  There was another 
thing I wanted to do that I didn't mention before: how can I generate 
the reverse complement of a whole genome file?

On Feb 12, 2004, at 1:19 PM, Jason Stajich wrote:

> You want these as a fasta file per orf and per non-orf region or just 2
> datasets with the genome masked (all N's or lowercased)?
>
> -jason
> On Thu, 12 Feb 2004, David Clark wrote:
>
>> Hello,
>>
>> I'm a relative newcomer to bioperl, and would like a point in the 
>> right
>> direction.  I need to separate the yeast genome into two partial
>> genomes--one with all ORF's, and one with everything else.  I have a
>> tab delimited list of the ORF's with the coordinates, and can probably
>> parse that myself, but I wanted to see if anyone could point me to 
>> some
>> example code, or give me some place to start in separating genomes
>> based on the coordinates.
>>
>> Thanks,
>>
>> David Clark
>> dfclark@neo.tamu.edu

From ryank at drizzle.com  Thu Feb 12 15:29:03 2004
From: ryank at drizzle.com (Ryan Kuykendall)
Date: Thu Feb 12 15:34:57 2004
Subject: [Bioperl-l] Fasta Genome Splice
In-Reply-To: <F28DEF29-5D95-11D8-8442-0030657E637C@neo.tamu.edu>
Message-ID: <Pine.LNX.4.44.0402121211230.22886-100000@drizzle.com>


I'm sure there is a Perl module for generating the reverse complement of a
whole genome, but assuming you wanted to write the code from scratch:

## ...and assuming your genome file has been turned into an array of
## lowercase bases called @listOfBases;

my $baseComplimentMap = 
{
 'a' => 't',
 'c' => 'g',
 'g' => 'c',
 't' => 'a',	
};

my @baseComplimentList = ();

## walk the bases back to front so the result is reversed as well as
## complemented; anything not in the map (e.g. 'n') is kept as-is
foreach my $base ( reverse @listOfBases )
{
    my $complimentBase = $baseComplimentMap->{$base};
    $complimentBase = $base unless defined $complimentBase;
    push( @baseComplimentList, $complimentBase );
}

That would do it...

============================================================
Ryan Kuykendall
ryank@drizzle.com

http://undef.com/ryank/ryanAtBawa50percent.JPG
============================================================

On Thu, 12 Feb 2004, David Clark wrote:

> Good point.  What I need is two fasta files: one with the ORF regions 
> masked, and one with the non-ORF regions masked.  There was another 
> thing I wanted to do that I didn't mention before: how can I generate 
> the reverse complement of a whole genome file?
> 
> On Feb 12, 2004, at 1:19 PM, Jason Stajich wrote:
> 
> > You want these as a fasta file per orf and per non-orf region or just 2
> > datasets with the genome masked (all N's or lowercased)?
> >
> > -jason
> > On Thu, 12 Feb 2004, David Clark wrote:
> >
> >> Hello,
> >>
> >> I'm a relative newcomer to bioperl, and would like a point in the 
> >> right
> >> direction.  I need to separate the yeast genome into two partial
> >> genomes--one with all ORF's, and one with everything else.  I have a
> >> tab delimited list of the ORF's with the coordinates, and can probably
> >> parse that myself, but I wanted to see if anyone could point me to 
> >> some
> >> example code, or give me some place to start in separating genomes
> >> based on the coordinates.
> >>
> >> Thanks,
> >>
> >> David Clark
> >> dfclark@neo.tamu.edu
> 


From jason at cgt.duhs.duke.edu  Thu Feb 12 15:46:33 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Thu Feb 12 15:52:39 2004
Subject: [Bioperl-l] Fasta Genome Splice
In-Reply-To: <F28DEF29-5D95-11D8-8442-0030657E637C@neo.tamu.edu>
References: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca>
	<9DCDC030-5D87-11D8-B24E-000A95E8AC0C@neo.tamu.edu>
	<Pine.LNX.4.50.0402121410410.18208-100000@tenero.duhs.duke.edu>
	<F28DEF29-5D95-11D8-8442-0030657E637C@neo.tamu.edu>
Message-ID: <Pine.LNX.4.50.0402121526070.18208-100000@tenero.duhs.duke.edu>


On Thu, 12 Feb 2004, David Clark wrote:

> Good point.  What I need is two fasta files: one with the ORF regions
> masked, and one with the non-ORF regions masked.

This is a little bit of work, but pretty easy since you can fit whole
yeast chromosomes into memory.  I do it by figuring out what I want to
mask and then do (note that substr offsets are 0-based, so a 1-based
feature start needs to be passed as $start-1):
 substr($chromseq,$start,$len,'N'x$len)

So you can just write a simple parser for the chromosomal_features.tab
while(<FILE> ){
  chomp;
  my ($feature,$gene,$sgdid, ... etc ) = split(/\t/,$_);
  # do the substr replace here
}
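
A minimal sketch of the masking step on a toy sequence, in plain Perl. The
helper names mask_regions and mask_outside are made up for illustration, and
the coordinates are assumed to be 1-based and inclusive, with start greater
than end for reverse-strand rows; check both assumptions against the actual
feature table before relying on this.

```perl
use strict;
use warnings;

# Replace every listed region of $seq with N. Coordinates are assumed to be
# 1-based and inclusive, so the substr() offset is start - 1.
sub mask_regions {
    my ($seq, @regions) = @_;
    for my $r (@regions) {
        my ($start, $end) = @$r;
        ($start, $end) = ($end, $start) if $start > $end;   # reverse-strand rows
        my $len = $end - $start + 1;
        substr($seq, $start - 1, $len, 'N' x $len);
    }
    return $seq;
}

# Mask everything outside the listed regions: start from all N and copy the
# regions back in from the original sequence.
sub mask_outside {
    my ($seq, @regions) = @_;
    my $masked = 'N' x length($seq);
    for my $r (@regions) {
        my ($start, $end) = @$r;
        ($start, $end) = ($end, $start) if $start > $end;
        my $len = $end - $start + 1;
        substr($masked, $start - 1, $len, substr($seq, $start - 1, $len));
    }
    return $masked;
}

my $chrom = 'AAACCCGGGTTT';        # toy chromosome
my @orfs  = ([4, 6], [12, 10]);    # toy coordinates, second row reversed

print mask_regions($chrom, @orfs), "\n";   # ORFs masked
print mask_outside($chrom, @orfs), "\n";   # everything but the ORFs masked
```

Running each chromosome through both helpers and writing the results out
gives the two masked fasta files asked for above.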

> There was another thing I wanted to do that I didn't mention before: how
> can I generate the reverse complement of a whole genome file?

That's easy with emboss
% revseq FILE.fwd FILE.rev

With bioperl -- see the Sequence HOWTO in the howto section of the bioperl
website.  you want to use the revcom method in bioperl Bio::PrimarySeq
objects.

# change fasta to whatever format you have/want the sequences in
my $in = Bio::SeqIO->new(-file => 'filename', -format => 'fasta');
my $out = Bio::SeqIO->new(-file => '>filename.rev', -format => 'fasta');
while( my $s = $in->next_seq ) {
  $out->write_seq($s->revcom);
}


-jason
> On Feb 12, 2004, at 1:19 PM, Jason Stajich wrote:
>
> > You want these as a fasta file per orf and per non-orf region or just 2
> > datasets with the genome masked (all N's or lowercased)?
> >
> > -jason
> > On Thu, 12 Feb 2004, David Clark wrote:
> >
> >> Hello,
> >>
> >> I'm a relative newcomer to bioperl, and would like a point in the
> >> right
> >> direction.  I need to separate the yeast genome into two partial
> >> genomes--one with all ORF's, and one with everything else.  I have a
> >> tab delimited list of the ORF's with the coordinates, and can probably
> >> parse that myself, but I wanted to see if anyone could point me to
> >> some
> >> example code, or give me some place to start in separating genomes
> >> based on the coordinates.
> >>
> >> Thanks,
> >>
> >> David Clark
> >> dfclark@neo.tamu.edu
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From lstein at cshl.edu  Fri Feb 13 04:29:50 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Fri Feb 13 04:35:56 2004
Subject: [Bioperl-l] Fasta Genome Splice
In-Reply-To: <Pine.LNX.4.44.0402121211230.22886-100000@drizzle.com>
References: <Pine.LNX.4.44.0402121211230.22886-100000@drizzle.com>
Message-ID: <200402131129.50947.lstein@cshl.edu>

There is actually a one-liner for this.  You can find it in Jim 
Tisdall's "Beginning Perl for Bioinformatics" book, which I strongly 
recommend to anyone who wants to do basic bioinformatics tasks without 
learning Bioperl.
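
The one-liner itself is not quoted in this message. The usual Perl idiom,
which may or may not match the book's wording, reverses the string and then
swaps the bases with tr///; the variable names below are only for
illustration.

```perl
use strict;
use warnings;

my $seq = 'ATGCCGTTAN';

# Reverse the string, then swap each base for its complement in one pass.
# Characters not listed in tr/// (such as N) are left unchanged.
(my $revcom = reverse $seq) =~ tr/ACGTacgt/TGCAtgca/;

print "$revcom\n";
```

Because it works on a plain string, this handles a whole chromosome at once
and keeps upper and lower case as they were.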

Lincoln

On Thursday 12 February 2004 10:29 pm, Ryan Kuykendall wrote:
> I'm sure there is a Perl module for generating the reverse
> complement of a whole genome, but assuming you wanted to write the
> code from scratch:
>
> ## ...and assuming your genome file has been turned into an array
> of bases ## called @listOfBases;
>
> my $baseComplimentMap =
> {
>  'a' => 't',
>  'c' => 'g',
>  'g' => 'c',
>  't' => 'a',
> };
>
> my @baseComplimentList = ();
>
> foreach my $base ( @listOfBases )
> {
>     my $complimentBase = $baseComplimentMap->{$base};
>     push( @baseComplimentList, $complimentBase );
> }
>
> That would do it...
>
> ============================================================
> Ryan Kuykendall
> ryank@drizzle.com
>
> http://undef.com/ryank/ryanAtBawa50percent.JPG
> ============================================================
>
> On Thu, 12 Feb 2004, David Clark wrote:
> > Good point.  What I need is two fasta files: one with the ORF
> > regions masked, and one with the non-ORF regions masked.  There
> > was another thing I wanted to do that I didn't mention before:
> > how can I generate the reverse complement of a whole genome file?
> >
> > On Feb 12, 2004, at 1:19 PM, Jason Stajich wrote:
> > > You want these as a fasta file per orf and per non-orf region
> > > or just 2 datasets with the genome masked (all N's or
> > > lowercased)?
> > >
> > > -jason
> > >
> > > On Thu, 12 Feb 2004, David Clark wrote:
> > >> Hello,
> > >>
> > >> I'm a relative newcomer to bioperl, and would like a point in
> > >> the right
> > >> direction.  I need to separate the yeast genome into two
> > >> partial genomes--one with all ORF's, and one with everything
> > >> else.  I have a tab delimited list of the ORF's with the
> > >> coordinates, and can probably parse that myself, but I wanted
> > >> to see if anyone could point me to some
> > >> example code, or give me some place to start in separating
> > >> genomes based on the coordinates.
> > >>
> > >> Thanks,
> > >>
> > >> David Clark
> > >> dfclark@neo.tamu.edu
> >

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
From Annie.Law at nrc-cnrc.gc.ca  Fri Feb 13 11:53:04 2004
From: Annie.Law at nrc-cnrc.gc.ca (Law, Annie)
Date: Fri Feb 13 11:59:07 2004
Subject: [Bioperl-l] Locuslink parser
Message-ID: <10C94843061E094A98C02EB77CFC328722FE02@nrcmrdex1d.imsb.nrc.ca>

Hi Hilmar,

Thanks for your response.  

From what you're saying, I think my existing code should be able to access
the GO identifier.  If I look up the tagname for molecular function, I
get a value such as: Molecular function|ATP binding|GO:0005524.
The approach I can think of is to take this value and write some code to
parse the GO identifier out.  Is there a more direct method?
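
For reference, parsing the value shown above could look like the sketch
below. The "category|term name|identifier" layout is assumed from that single
example, and the variable names are made up. Since the GO annotations are
Bio::Ontology::TermI objects, calling the identifier() method on the
annotation is probably the more direct route, though that is not tested here.

```perl
use strict;
use warnings;

# Value copied from the message above.
my $text = 'Molecular function|ATP binding|GO:0005524';

# Split on the literal pipe character into the three assumed fields.
my ($category, $name, $identifier) = split /\|/, $text;

# Or pull out just the accession, wherever it sits in the string.
my ($go_id) = $text =~ /\b(GO:\d{7})\b/;

print "$identifier\n";
print "$go_id\n";
```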

I used the test for term annotation versus dbxref, and when it was a dbxref
I was able to get the primary id and the
database name. Thanks!  I am learning more about the objects I am using.  Do
you know if there is some documentation with figures showing the
relationships between objects and the Bio::Seq class, e.g. the relationship
between Bio::Seq and Bio::Annotation::Collection, among others?

However, I am still unable to get all of the fields, for example SUMFUNC (a
brief summary of the function of the products of this locus), ORGANISM, OMIM,
etc.  I am not sure how to access these.
It also seems that if I use
	foreach my $ann (@annotations) {
		if ($ann->isa("Bio::Ontology::TermI")) {
			# this is an ontology term as annotation
		}
		if ($ann->isa("Bio::Annotation::DBLink")) {
			# this is a dbxref annotation
		}
	}
I am filtering out some of the annotation types such as OFFICIAL_GENE_NAME,
CHR, OFFICIAL_SYMBOL, etc. I only get GO information and DBLINK
information.
If I use the following I get the largest number of annotation and
dbxref fields I have been able to extract so far. Is there another category
I am missing?  Better yet, how do I find out what the other missing
categories are, i.e. other than Bio::Ontology::TermI or Bio::Annotation::DBLink?

while (my $seq_obj = $io->next_seq()) {
  my $anno_collection = $seq_obj->annotation;

  foreach my $key ($anno_collection->get_all_annotation_keys) {
    my @annotations = $anno_collection->get_Annotations($key);
    foreach my $value (@annotations) {
      print "tagname: ", $value->tagname, "\n";
      # $value is a Bio::AnnotationI object and has an "as_text" method
      print " annotation value: ", $value->as_text, "\n";
    }
  }
}

**In the example you provided below I can see that all annotations of type
Bio::Ontology::TermI are being grouped and put in
@term_annotations, but what is the $_-> for?  And why do you need the line
$seq->get_Annotations(); below it?
@term_annotations = map { $_->isa("Bio::Ontology::TermI"); } 
$seq->get_Annotations();

Thanks very much,
Annie.


-----Original Message-----
From: Hilmar Lapp [mailto:hlapp@gnf.org] 
Sent: Thursday, February 12, 2004 2:10 PM
To: Law, Annie
Cc: 'bioperl-l@bioperl.org'
Subject: Re: [Bioperl-l] Locuslink parser


On Thursday, February 12, 2004, at 09:46  AM, Law, Annie wrote:

> I am most intereste in obtaining the fields  locuslink id, GO id,
> accession number, unigene id.

The locuslink ID is the $seq->accession_number. GO should be there as 
term annotations, unigene ID and other accessions should be present as 
dbxref annotations.

You can test for an annotation being a term annotation or a dbxref:

	foreach my $ann (@annotations) {
		if ($ann->isa("Bio::Ontology::TermI")) {
			# this is an ontology term as annotation
		}
		if ($ann->isa("Bio::Annotation::DBLink")) {
			# this is a dbxref annotation
		}
	}

Using the map function you can easily filter for annotation types, for 
example:

	@term_annotations = map { $_->isa("Bio::Ontology::TermI"); } 
$seq->get_Annotations();

BTW if you want to get all annotations from a seq object, you can just 
say $seq->get_Annotations() and omit the key.

Hth,

	-hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------

From brian_osborne at cognia.com  Fri Feb 13 12:17:25 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Fri Feb 13 12:23:59 2004
Subject: [Bioperl-l] Locuslink parser
In-Reply-To: <10C94843061E094A98C02EB77CFC328722FE02@nrcmrdex1d.imsb.nrc.ca>
Message-ID: <GAEDKMGOKFBLJPKCLKCCGEGMDHAA.brian_osborne@cognia.com>

Annie,

>Do
>you know if there is some doucmentation with Figures showing all of the
>relationship of objects with Bio::Seq class eg relationship of Bio::Seq and
>Bio::Annotation Collection among others.

There are class diagrams available, either as DIA files in the models/
directory within the package or as PDF on the Web documentation page
(http://www.bioperl.org/Core/Latest/modules.html). There are also diagrams
in the Pasteur tutorial
(http://www.pasteur.fr/recherche/unites/sis/formation/bioperl).

Brian O.


From hlapp at gnf.org  Fri Feb 13 14:28:18 2004
From: hlapp at gnf.org (Hilmar Lapp)
Date: Fri Feb 13 14:34:18 2004
Subject: [Bioperl-l] Locuslink parser
In-Reply-To: <10C94843061E094A98C02EB77CFC328722FE02@nrcmrdex1d.imsb.nrc.ca>
Message-ID: <C6F96036-5E5A-11D8-BCB3-000A959EB4C4@gnf.org>


On Friday, February 13, 2004, at 08:53  AM, Law, Annie wrote:

> I am learning more about the objects I am using.  Do
> you know if there is some doucmentation with Figures showing all of the
> relationship of objects with Bio::Seq class eg relationship of 
> Bio::Seq and
> Bio::Annotation Collection among others.
>

Brian answered that, right?


> However, I am still unable to get all of the fields for example 
> SUMFUNC( a
> brief summary of the function of the products of this locus), 
> ORGANISM, OMIM
> etc...  I am not sure how to access these.

SUMFUNC becomes an annotation of type Bio::Annotation::SimpleValue, 
with a tag name of SUMFUNC. ORGANISM is a Bio::Species object available 
through $seq->species. OMIM references should be available as dbxrefs 
(Bio::Annotation::DBLink), possibly with the database renamed to 'MIM'.

I don't think there is a good reference yet as to which tag goes where, 
but the bottom line is that almost every tag ends up as an annotation 
of some kind, with ORGANISM being a notable exception.

>
> It also seems if I use
> 	foreach my $ann (@annotations) {
> 		if ($ann->isa("Bio::Ontology::TermI")) {
> 			# this is an ontology term as annotation
> 		}
> 		if ($ann->isa("Bio::Annotation::DBLink")) {
> 			# this is a dbxref annotation
> 		}
> 	}
> I am filtering out some of the annotation types such as 
> OFFICIAL_GENE_NAME,
> CHR, OFFICIAL_SYMBOL, etc..

I'm not sure I understand what you mean. I just gave some examples for 
how to test what type an annotation is of. There are other types too 
than the two given in the example. The array you get from 
$seq->annotation->get_Annotations() does contain all and any annotation 
that has been associated with the sequence.

>  I only get GO information and DBLINK
> information.
> If I use the following I will get the maximum number of annotation and
> dbxref fields I have been able to extract so far. Is there another 
> category
> I am missing.  Better yet how do I find out what are the other missing
> categories? Ie. Other than Bio::Ontology::TermI, or 
> Bio::Annotation::DBLink
>

Check out Bio/Annotation/*.pm to see all theoretically possible types. 
The most important are DBLink, SimpleValue, OntologyTerm (which 
basically adapts a Bio::Ontology::TermI), Comment, and Reference. Note 
that Reference is not used by the locuslink parser at this point.

>
> **In the example you provided below I can see that all of the type
> Bio::Ontology::TermI annotation types being Grouped and stuck in
> @term_annotations but what is the $_-> for ? And why do you need the 
> line
> $seq->get_Annotations(); Below it?

It's perl syntax and in part obfuscated by my or your email reader 
introducing a line break after the closing curly brace. Check out

	$ perldoc -f map

for documentation on how to use the map function. Now, using the map 
function in my example was in fact wrong, and calling get_Annotations() 
on a Bio::SeqI object also won't work. Sorry about these mistakes. 
Here's the corrected version:

	@term_anns = grep { $_->isa("Bio::Ontology::TermI"); } 
		$seq->annotation->get_Annotations();

(There was no linebreak above, but adding one won't bother perl.) 
Again, you can read about grep in perl by

	$ perldoc -f grep
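
To make the map/grep difference concrete, here is a small sketch that runs
without Bioperl. The My::* classes are hypothetical stand-ins for the real
annotation classes; in real code the list would come from
$seq->annotation->get_Annotations().

```perl
use strict;
use warnings;

# Hypothetical stand-in classes so the example runs without Bioperl.
package My::TermI;        sub new { return bless {}, shift }
package My::OntologyTerm; our @ISA = ('My::TermI');
package My::DBLink;       sub new { return bless {}, shift }

package main;

my @annotations = (
    My::OntologyTerm->new,
    My::DBLink->new,
    My::OntologyTerm->new,
);

# map returns the block's value for every element, so this yields one
# true/false flag per annotation rather than the annotations themselves.
my @mapped = map { $_->isa('My::TermI') } @annotations;

# grep returns the elements for which the block is true, which is the
# filtering behaviour that was intended.
my @filtered = grep { $_->isa('My::TermI') } @annotations;

print "map returned ",  scalar(@mapped),   " flags\n";
print "grep returned ", scalar(@filtered), " objects\n";
```

In short, map transforms a list and grep selects from it.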

-hilmar

-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From dag at sonsorol.org  Fri Feb 13 14:53:51 2004
From: dag at sonsorol.org (Chris Dagdigian)
Date: Fri Feb 13 14:59:44 2004
Subject: [Bioperl-l] Re: Bundle-Bioperl installation under Activestate
In-Reply-To: <20040213190801.31707.qmail@web10003.mail.yahoo.com>
References: <20040213190801.31707.qmail@web10003.mail.yahoo.com>
Message-ID: <402D2B4F.6000104@sonsorol.org>


Hi Jennifer,

I've never used perl on Windows - only under various Unix flavors and my 
personal systems are mostly Mac OS X or Linux these days.

I'm cc'ing this reply to the bioperl discussion list where I know there 
are active windows users of bioperl. We also have some .ppd files on our 
download site http://bioperl.org/DIST/ but since I've never used the 
ActiveState stuff I have very little clue about them!

A quick google search for io::string + ppd turned up this link which may 
be helpful:

http://www.apache.org/dist/perl/win32-bin/ppms/IO-String.ppd

There seem to be several ppm archives on the net, IO-String is also 
found here apparently: 
http://www.online-mirror.org/apache/perl/win32-bin/ppms/

Good luck with the course!

Regards,
Chris


Jennifer Hsu wrote:
> Hi, Chris:
> My name is Jennifer, I am a student in BioPerl class at Foothill College, Los Altos, CA. Our class
> is trying to install bioperl. The entire class is encountering the problem of not being able to
> find IO-String. 
> 
> So I searched for IO-String and got 4 choices:
> ppm> search IO-String
> Searching in Active Repositories
> 1. IO-String <1.02> Emulate IO::File interface for in-core strings
> 2. IO-String <1.03> Emulate file interface for in-core strings
> 3. IO-String <1.04> Emulate file interface for in-core strings
> 4. IO-stringy <2.108> stringy - I/O on in-core objects like strings and ar~ 
> 
> - I tried to install 3, 2, 1 , but each time I got:
> PPD for 'IO-String.ppd' could not be found.
> It found that IO-stringy (item 4) is already installed in my system, but BioPerl is looking for
> IO-String.
> - I have ActivePerl 5.6.1.635. Is this build incompatible with Bioperl?
> - Please advise me, where can I find IO-String? What can I do to build this PPD. All the students
> in my class are stuck, Help!
> Thanks
>                      Jennifer
> 


-- 
Chris Dagdigian, <dag@sonsorol.org>
BioTeam  - Independent life science IT & informatics consulting
Office: 617-666-6454, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E iChat/AIM: bioteamdag  Web: http://bioteam.net
From barry.moore at genetics.utah.edu  Fri Feb 13 19:53:56 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Fri Feb 13 19:59:52 2004
Subject: [Bioperl-l] Re: Bundle-Bioperl installation under Activestate
In-Reply-To: <402D2B4F.6000104@sonsorol.org>
References: <20040213190801.31707.qmail@web10003.mail.yahoo.com>
	<402D2B4F.6000104@sonsorol.org>
Message-ID: <402D71A4.8050000@genetics.utah.edu>

Jennifer,

What repositories are you using with ppm? (Try typing rep at the ppm> 
prompt if you don't know.)  I just reinstalled IO-String O.K. but it may 
have installed from a non-standard ppm repository.  I don't know why 
ActiveState wouldn't have IO-String, but Randy Kobes has a ppm 
collection that does.  Set him up as a repository, and you should be 
able to install.  Again from the ppm> prompt type "rep add Kobes 
http://theoryx5.uwinnipeg.ca/ppms".  Then re-try your IO-String install.  
I'm using ActiveState Perl 5.8, but from what I hear on this list, 5.6 
should work just fine.

Good luck,

Barry Moore

Chris Dagdigian wrote:

>
> Hi Jennifer,
>
> I've never used perl on Windows - only under various Unix flavors and 
> my personal systems are mostly Mac OS X or Linux these days.
>
> I'm cc'ing this reply to the bioperl discussion list where I know 
> there are active windows users of bioperl. We also have some .ppd 
> files on our download site http://bioperl.org/DIST/ but since I've 
> never used the ActiveState stuff I have very little clue about them!
>
> A quick google search for io::string + ppd turnd up this link which 
> may be helpful:
>
> http://www.apache.org/dist/perl/win32-bin/ppms/IO-String.ppd
>
> There seem to be several ppm archives on the net, IO-String is also 
> found here apparently: 
> http://www.online-mirror.org/apache/perl/win32-bin/ppms/
>
> Good luck with the course!
>
> Regards,
> Chris
>
>
>
> Jennifer Hsu wrote:
>
>> Hi, Chris:
>> My name is Jennifer, I am a student in BioPerl class at Foothill 
>> College, Los Altos, CA. Our class
>> is trying to install bioperl. The entire class is encountering the 
>> problem of not being able to
>> find IO-String.
>> So I searched for IO-String and got 4 choices:
>> ppm> search IO-String
>> Searching in Active Repositories
>> 1. IO-String <1.02> Emulate IO::File interface for in-core strings
>> 2. IO-String <1.03> Emulate file interface for in-core strings
>> 3. IO-String <1.04> Emulate file interface for in-core strings
>> 4. IO-stringy <2.108> stringy - I/O on in-core objects like strings 
>> and ar~
>> - I tried to install 3, 2, 1 , but each time I got:
>> PPD for 'IO-String.ppd' could not be found.
>> It found that IO-stringy (item 4) is already installed in my system, 
>> but BioPerl is looking for
>> IO-String.
>> - I have ActivePerl 5.6.1.635. Is this build incompatible with Bioperl?
>> - Please advise me, where can I find IO-String? What can I do to 
>> build this PPD. All the students
>> in my class are stuck, Help!
>> Thanks
>>                      Jennifer
>>
>
>
>


From dfclark at neo.tamu.edu  Fri Feb 13 20:39:22 2004
From: dfclark at neo.tamu.edu (David Clark)
Date: Fri Feb 13 20:45:16 2004
Subject: [Bioperl-l] Fasta Genome Splice
In-Reply-To: <Pine.LNX.4.50.0402121526070.18208-100000@tenero.duhs.duke.edu>
References: <10C94843061E094A98C02EB77CFC328722FE01@nrcmrdex1d.imsb.nrc.ca>
	<9DCDC030-5D87-11D8-B24E-000A95E8AC0C@neo.tamu.edu>
	<Pine.LNX.4.50.0402121410410.18208-100000@tenero.duhs.duke.edu>
	<F28DEF29-5D95-11D8-8442-0030657E637C@neo.tamu.edu>
	<Pine.LNX.4.50.0402121526070.18208-100000@tenero.duhs.duke.edu>
Message-ID: <9D52E948-5E8E-11D8-859C-0030657E637C@neo.tamu.edu>

Thanks Jason, this is exactly what I needed.  I just took a peek in 
Seq.pm to see how the sequence objects are implemented, used your 
example, and I'm ready to go.

David

On Feb 12, 2004, at 2:46 PM, Jason Stajich wrote:

> On Thu, 12 Feb 2004, David Clark wrote:
>
>> Good point.  What I need is two fasta files: one with the ofr regions
>> masked, and one with the non-ofr regions masked.
>
> This is a little bit of work, but pretty easy since you can fit whole
> yeast chromosomes into memory.  I do it by figuring out what I want to
> mask and then do:
>  substr($chromseq,$start,$len,'N'x$len)
>
> So you can just write a simple parser for the chromsomal_features.tab
> while(<FILE> ){
>   my ($feature,$gene,$sgdid, ... etc ) = split(/\t/,$_);
>   # do the substr replace here
> }
>
>> There was another thing I wanted to do that I didn't mention before: 
>> how
>> can I generate the reverse compliment of a whole genome file?
>
> That's easy with emboss
> % revseq FILE.fwd FILE.rev
>
> With bioperl -- see the Sequence HOWTO in the howto section of the 
> bioperl
> website.  you want to use the revcom method in bioperl Bio::PrimarySeq
> objects.
>
> # change fasta to whatever format you have/want the sequences in
> my $in = Bio::SeqIO->new(-file => 'filename', -format => 'fasta');
> my $out = Bio::SeqIO->new(-file => '>filename.rev', -format => 
> 'fasta');
> while( my $s = $in->next_seq ) {
>   $out->write_seq($s->revcom);
> }
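
The substr-based masking quoted above can be sketched in plain Perl roughly
as follows. Several things here are assumptions rather than anything stated
in the thread: mask_regions and mask_outside are hypothetical helper names,
coordinates are taken to be 1-based and inclusive (so they are shifted by
one for substr, which is 0-based), and a feature whose start is greater than
its end is treated as lying on the reverse strand.

```perl
use strict;
use warnings;

# Put a region into ascending order and return (0-based offset, length).
# Assumption: input coordinates are 1-based and inclusive.
sub _offset_and_length {
    my ($start, $end) = @_;
    ($start, $end) = ($end, $start) if $start > $end;
    return ($start - 1, $end - $start + 1);
}

# Replace every listed region with N's (e.g. mask the ORFs).
sub mask_regions {
    my ($seq, @regions) = @_;
    for my $region (@regions) {
        my ($offset, $len) = _offset_and_length(@$region);
        substr($seq, $offset, $len, 'N' x $len);
    }
    return $seq;
}

# Replace everything except the listed regions with N's.
sub mask_outside {
    my ($seq, @regions) = @_;
    my $masked = 'N' x length $seq;
    for my $region (@regions) {
        my ($offset, $len) = _offset_and_length(@$region);
        substr($masked, $offset, $len, substr($seq, $offset, $len));
    }
    return $masked;
}

my $chrom = 'ACGTACGTACGTACGTACGT';    # toy 20 bp "chromosome"
my @orfs  = ([3, 6], [15, 12]);        # second feature given end-first

print mask_regions($chrom, @orfs), "\n";   # ORFs masked
print mask_outside($chrom, @orfs), "\n";   # non-ORF sequence masked
```

Each result can then be written out as its own fasta record, giving the two
complementary files.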

From lstein at cshl.edu  Sun Feb 15 13:13:53 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Sun Feb 15 13:19:58 2004
Subject: [Bioperl-l] Creating imagemaps from Bio::Graphics (Was Re:
	Bio::Graphics questions)
In-Reply-To: <200402131128.00665.lstein@cshl.edu>
References: <89AA811FD79DC94788093B23DA79E71FD9A83F@edunivmail02.ad.umassmed.edu>
	<200402131128.00665.lstein@cshl.edu>
Message-ID: <200402152013.53586.lstein@cshl.edu>

Hi Nathan,

I've just committed a chunk of code to Bio::Graphics::Panel that 
should drastically simplify the task of generating a clickable 
imagemap from within a CGI script.  It should be pretty obvious what 
to do from the POD documentation.

Best,

Lincoln


On Friday 13 February 2004 11:28 am, Lincoln Stein wrote:
> Oh gee, I just answered that question on the bioperl mailing list
> and now I can't find it.  Maybe you can find it in the archive?
>
> In any case, since this is such a general and useful question, and
> the answer requires about a page of typing, I'm going to
> incorporate the answer into the next revision of the tutorial.  I
> think it needs some example code to go along with it.
>
> Regards,
>
> Lincoln
>
> On Thursday 12 February 2004 10:41 pm, Agrin, Nathan wrote:
> > Hey Lincoln,
> >
> > I talked to you a while back about some Bio::Graphics questions
> > and was hoping you could help me with one (or two) more.  I need
> > to generate HTML image maps using a dynamically created
> > Bio::Graphics image based off of blast reports.  I am still
> > somewhat new to CGI, so bare with me. Basically, I think my main
> > problems are first off, generating the image, and having the
> > browser display it.  I know you need to put the correct image/png
> > header in the script, but where will the image reside once it's
> > created, and how can direct the browser to that image?
> >
> > Also, I tried looking at the Generic Genome Browser for info on
> > creating HTML image maps, and could find none.  Can you point me
> > in the right direction, or send me an example script?
> >
> > Thanks in advance,
> > Nate Agrin
> >
> > Nathan Agrin
> > Research Associate
> > UMass Medical Center
> > 55 Lake Ave. N.
> > Worcester MA, 01655
> > (508)-856-6018
> > nathan.agrin@umassmed.edu

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
From darndt at treasurehouseimports.com  Sun Feb 15 14:24:55 2004
From: darndt at treasurehouseimports.com (David Arndt)
Date: Sun Feb 15 14:33:05 2004
Subject: [Bioperl-l] Does Bio::Tools::Glimmer only parse GlimmerM?
Message-ID: <A27EBEBA-5FEC-11D8-81B2-0003937E1360@treasurehouseimports.com>

Does anyone know whether Bio::Tools::Glimmer will parse results from 
the regular Glimmer (not GlimmerM) correctly?

Thanks

From pvh at egenetics.com  Sun Feb 15 15:29:04 2004
From: pvh at egenetics.com (Peter van Heusden)
Date: Sun Feb 15 15:34:58 2004
Subject: [Bioperl-l] Validating Bioperl
Message-ID: <402FD690.2030407@egenetics.com>

Hi BioPerl people

I have been hired by Electric Genetics to spend no less than 50% of my time "validating" Bioperl. What this means is that I'm slowly going through BioPerl, reviewing the code and documentation, and trying to ensure three related things:


1) The documentation clearly specifies input, output and exception 
conditions for the code.
2) The code complies with the documentation and behaves as expected.
3) The test suite exhaustively tests the code to ensure that 2 is true.

The goal of this 'validation' is to be able to offer some kind of 
assurance to our customers (who include some big names in the 
pharmaceuticals field) that Bioperl is robust enough to be included 
without worry in their development process. Their fear surrounding open 
source tools is based on past experiences, particularly upgrading across 
various versions of operating systems and tools, and the slow tightening 
of FDA requirements for software included in any clinical development 
process.

The tangible output of the validation work will be:

  - improved code that is submitted back to the Bioperl CVS
  - new features, as requested by our pharma clients, that are implemented by EG and submitted to the Bioperl CVS
  - professional-grade documentation, which is provided to EG's customers as part of the Bioperl validation and support product on offer

Finally to give a bit of background: Electric Genetics is a 
bioinformatics software company based in South Africa and the USA. The 
name should be familiar to a number of BioPerl hackers - we've been 
around for some years and sponsored the first 'BioHackathon' (in Cape 
Town in 2002). We've been open source enthusiasts for years, and with 
this product can finally bridge the gap between our commercial reality 
and our open source aspirations.

Looking forward to lots of BioPerl hacking,
Peter


From wes.barris at csiro.au  Mon Feb 16 01:33:02 2004
From: wes.barris at csiro.au (Wes Barris)
Date: Mon Feb 16 01:39:02 2004
Subject: [Bioperl-l] Bioperl and ACE files
Message-ID: <4030641E.2000403@csiro.au>

Hi,

I have an ACE file that I am trying to process with bioperl.  A portion
of the ACE file looks like this:

AF CB429506 U 2
AF CB428704 U 6
AF CB430643 U 1
AF CB431187 U 0
AF CB430639 U -7
AF CB430480 C 24
AF CB430055 U 10

Notice the line in the middle that shows a starting position of '0'
(zero)?  When bioperl tries to process this sequence, an error is
thrown.  I have found the part of the bioperl code that throws the
error:
Bio/LocatableSeq.pm:
sub get_nse{
    my ($self,$char1,$char2) = @_;

    $char1 ||= "/";
    $char2 ||= "-";

    $self->throw("Attribute id not set") unless $self->id();
    $self->throw("Attribute start not set") unless $self->start();<-----
    $self->throw("Attribute end not set") unless $self->end();

    return $self->id() . $char1 . $self->start . $char2 . $self->end ;

}

Notice how "$self->start()" is tested.  When it encounters a sequence
whose start is set to zero, an error is thrown.
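
The behaviour can be reproduced outside Bioperl. The sketch below uses two
hypothetical helper subs (they are not Bioperl code) to contrast a
truthiness test, as used in get_nse, with a defined() test. Whether a start
of 0 should be accepted at all is a separate question that the sketch does
not answer.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the two styles of "is this attribute set?" check.
sub is_set_truthy  { my ($v) = @_; return $v ? 1 : 0 }           # like "unless $self->start()"
sub is_set_defined { my ($v) = @_; return defined($v) ? 1 : 0 }  # distinguishes 0 from unset

for my $start (2, 0, -7, undef) {
    my $label = defined($start) ? $start : 'undef';
    printf "start=%-5s truthiness check: %d  defined check: %d\n",
        $label, is_set_truthy($start), is_set_defined($start);
}
```

A value of 0 fails the truthiness check but passes the defined check, while
a negative value such as -7 passes both.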

I don't know much about the ACE file format.  Do I have a questionable
ACE file or is this test incomplete?
-- 
Wes Barris
E-Mail: Wes.Barris@csiro.au


From lstein at cshl.edu  Mon Feb 16 04:05:43 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Mon Feb 16 04:11:41 2004
Subject: [Bioperl-l] Re: Bundle-Bioperl installation under Activestate
In-Reply-To: <402D2B4F.6000104@sonsorol.org>
References: <20040213190801.31707.qmail@web10003.mail.yahoo.com>
	<402D2B4F.6000104@sonsorol.org>
Message-ID: <200402161105.43148.lstein@cshl.edu>

Hi Jen,

Don't forget that you can also use CPAN for installing Perl modules on 
Windows:

	C:\> perl -MCPAN -e shell
	cpan> install IO::String

This will work as long as you are installing a "pure perl" module that 
doesn't need compilation.  Most of the CPAN modules fall into this 
category.
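
After installing, by either route, a short script can confirm that perl
actually finds the module. This is a minimal sketch: module_available is a
hypothetical helper name, and File::Spec is included only as a module that
ships with perl, for comparison.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded from @INC.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # IO::String -> IO/String.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(IO::String File::Spec)) {
    printf "%-12s %s\n", $module,
        module_available($module) ? 'installed' : 'NOT installed';
}
```

If IO::String is reported as not installed, the install did not land
anywhere in that perl's @INC.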

Lincoln

On Friday 13 February 2004 09:53 pm, Chris Dagdigian wrote:
> Hi Jennifer,
>
> I've never used perl on Windows - only under various Unix flavors
> and my personal systems are mostly Mac OS X or Linux these days.
>
> I'm cc'ing this reply to the bioperl discussion list where I know
> there are active windows users of bioperl. We also have some .ppd
> files on our download site http://bioperl.org/DIST/ but since I've
> never used the ActiveState stuff I have very little clue about
> them!
>
> A quick google search for io::string + ppd turnd up this link which
> may be helpful:
>
> http://www.apache.org/dist/perl/win32-bin/ppms/IO-String.ppd
>
> There seem to be several ppm archives on the net, IO-String is also
> found here apparently:
> http://www.online-mirror.org/apache/perl/win32-bin/ppms/
>
> Good luck with the course!
>
> Regards,
> Chris
>
> Jennifer Hsu wrote:
> > Hi, Chris:
> > My name is Jennifer, I am a student in BioPerl class at Foothill
> > College, Los Altos, CA. Our class is trying to install bioperl.
> > The entire class is encountering the problem of not being able to
> > find IO-String.
> >
> > So I searched for IO-String and got 4 choices:
> > ppm> search IO-String
> > Searching in Active Repositories
> > 1. IO-String <1.02> Emulate IO::File interface for in-core
> > strings 2. IO-String <1.03> Emulate file interface for in-core
> > strings 3. IO-String <1.04> Emulate file interface for in-core
> > strings 4. IO-stringy <2.108> stringy - I/O on in-core objects
> > like strings and ar~
> >
> > - I tried to install 3, 2, 1 , but each time I got:
> > PPD for 'IO-String.ppd' could not be found.
> > It found that IO-stringy (item 4) is already installed in my
> > system, but BioPerl is looking for IO-String.
> > - I have ActivePerl 5.6.1.635. Is this build incompatible with
> > Bioperl? - Please advise me, where can I find IO-String? What can
> > I do to build this PPD. All the students in my class are stuck,
> > Help!
> > Thanks
> >                      Jennifer

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
From jason at cgt.duhs.duke.edu  Mon Feb 16 07:46:41 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Mon Feb 16 07:52:35 2004
Subject: [Bioperl-l] Bioperl and ACE files
In-Reply-To: <4030641E.2000403@csiro.au>
References: <4030641E.2000403@csiro.au>
Message-ID: <Pine.LNX.4.50.0402160735290.23581-100000@tenero.duhs.duke.edu>

As always, more code and information as to how you got here makes it
easier for someone to answer.

Not really sure how you are getting to the point where you have created
Bio::LocatableSeq objects - presumably you are trying to do an assembly
so I'll guess you got there from Bio::Assembly::IO.

You may need to get help from Robson about what the format is supposed to
support.  A start of 0 is not really proper in Bioperl -
sequences/features start at 1 in our system, so the assembly code needs to
adjust for that.  Presumably those numbers are offsets, not actual start
positions, so the parsing code may need some looking at.

-jason


On Mon, 16 Feb 2004, Wes Barris wrote:

> Hi,
>
> I have an ACE file that I am trying to process with bioperl.  A portion
> of the ACE file looks like this:
>
> AF CB429506 U 2
> AF CB428704 U 6
> AF CB430643 U 1
> AF CB431187 U 0
> AF CB430639 U -7
> AF CB430480 C 24
> AF CB430055 U 10
>
> Notice the line in the middle that shows a starting position of '0'
> (zero)?  When bioperl tries to process this sequence, an error is
> thrown.  I have found the part of the bioperl code that throws the
> error:
> Bio/LocatableSeq.pm:
> sub get_nse{
>     my ($self,$char1,$char2) = @_;
>
>     $char1 ||= "/";
>     $char2 ||= "-";
>
>     $self->throw("Attribute id not set") unless $self->id();
>     $self->throw("Attribute start not set") unless $self->start();<-----
>     $self->throw("Attribute end not set") unless $self->end();
>
>     return $self->id() . $char1 . $self->start . $char2 . $self->end ;
>
> }
>
> Notice how "$self->start()" is tested.  When it encounters a sequence
> whose start is set to zero, an error is thrown.
>
> I don't know much about the ACE file format.  Do I have a questionable
> ACE file or is this test incomplete?
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From billthebrute at yahoo.fr  Mon Feb 16 10:42:38 2004
From: billthebrute at yahoo.fr (=?iso-8859-1?q?william=20ritchie?=)
Date: Mon Feb 16 10:48:26 2004
Subject: [Bioperl-l] get sequence failure
Message-ID: <20040216154238.4435.qmail@web25209.mail.ukl.yahoo.com>

Hi,

I'm trying to retrieve a sequence with the following
code:

use Bio::SearchIO;
use Bio::Perl;
use Bio::SeqIO;

my $seq_object = get_sequence("nr","NT_039208.2");

and I'm getting this error:

MSG: id does not exist
STACK Bio::DB::WebDBSeqI::get_Seq_by_id
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:155
STACK Bio::Perl::get_sequence
/usr/lib/perl5/site_perl/5.8.0/Bio/Perl.pm:513
STACK toplevel getseq.pl:9

This ID does exist; using the NCBI interface I can
retrieve it!
Help!!!!Please!!


From jason at cgt.duhs.duke.edu  Mon Feb 16 13:09:33 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Mon Feb 16 13:15:23 2004
Subject: [Bioperl-l] get sequence failure
In-Reply-To: <20040216154238.4435.qmail@web25209.mail.ukl.yahoo.com>
References: <20040216154238.4435.qmail@web25209.mail.ukl.yahoo.com>
Message-ID: <Pine.LNX.4.50.0402161306530.25293-100000@tenero.duhs.duke.edu>

You need to use 'refseq' as the db type for refseq sequences ("NT", "NM")

'nr' is not a recognized type of seq db in the first place anyways...

I'm noticing from the documentation and code that the types aren't really
documented very well for get_sequence -- need someone to fix this, please.

-jason

On Mon, 16 Feb 2004, [iso-8859-1] william ritchie wrote:

> Hi,
>
> I m trying to retrieve a sequence with the following
> code:
>
> use Bio::SearchIO;
> use Bio::Perl;
> use Bio::SeqIO;
>
> my $seq_object = get_sequence("nr","NT_039208.2");
>
> and I m getting this error:
>
> MSG: id does not exist
> STACK Bio::DB::WebDBSeqI::get_Seq_by_id
> /usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:155
> STACK Bio::Perl::get_sequence
> /usr/lib/perl5/site_perl/5.8.0/Bio/Perl.pm:513
> STACK toplevel getseq.pl:9
>
> this ID does exist, using the ncbi interface I can
> retrieve it!
> Help!!!!Please!!

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From tex at biosysadmin.com  Mon Feb 16 17:49:04 2004
From: tex at biosysadmin.com (Tex Thompson)
Date: Mon Feb 16 17:44:31 2004
Subject: [Bioperl-l] Bug in GCG SeqIO Formatting?
Message-ID: <Pine.LNX.4.44.0402161732190.12085-100000@biosysadmin.net>

Hello Mailing List,

I have a user complaining that the following code isn't working on his
GCG-formatted sequence files:

#!/usr/bin/perl

use strict;

use Bio::SeqIO; 
my $io  = Bio::SeqIO->new( -file => "af317472.gbpln3", -format => "gcg");
my $out = Bio::SeqIO->new( -fh => \*STDOUT, -format => "fasta" );

while ( my $seq = $io->next_seq ) {
   $out->write_seq( $seq );
}

Here's an example sequence file:

!!NA_SEQUENCE 1.0
LOCUS       AF317472                2679 bp    DNA     linear   PLN 07-DEC-2000
DEFINITION  Candida albicans cAMP-dependent protein kinase regulatory subunit
            (PKA-R) gene, complete cds.
ACCESSION   AF317472
VERSION     AF317472.1  GI:11596392
KEYWORDS    .
SOURCE      Candida albicans
  ORGANISM  Candida albicans
            Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;
            Saccharomycetales; mitosporic Saccharomycetales; Candida.
REFERENCE   1  (bases 1 to 2679)
  AUTHORS   Giasson,L. and Parrot,M.
  TITLE     Sequence of the Candida albicans cAMP-dependent protein kinase
            regulatory subunit
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 2679)
  AUTHORS   Giasson,L. and Parrot,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-OCT-2000) School of Dentistry, Laval University,
            GREB, Ste-Foy, Quebec G1K 7P4, Canada
FEATURES             Location/Qualifiers
     source          1. .2679
                     /organism="Candida albicans"
                     /mol_type="genomic DNA"
                     /strain="CAI4"
                     /db_xref="taxon:5476"
     gene            <977. .>2356
                     /gene="PKA-R"
     mRNA            <977. .>2356
                     /gene="PKA-R"
                     /product="cAMP-dependent protein kinase regulatory
                     subunit"
     CDS             977. .2356
                     /gene="PKA-R"
                     /codon_start=1
                     /transl_table=12
                     /product="cAMP-dependent protein kinase regulatory
                     subunit"
                     /protein_id="AAG38599.1"
                     /db_xref="GI:11596393"
                     /translation="MSNPQQQFISDELSQLQKEIISKNPQDVLQFCANYFNTKLQAQR
                     SELWSQQAKAEAAGIDLFPSVDHVNVNSSGVSIVNDRQPSFKSPFGVNDPHSNHDEDP
                     HAKDTKTDTAAAAVGGGIFKSNFDVKKSASNPPTKEVDPDDPSKPSSSSQPNQQSASA
                     SSKTPSSKIPVAFNANRRTSVSAEALNPAKLKLDSWKPPVNNLSITEEETLANNLKNN
                     FLFKQLDANSKKTVIAALQQKSFAKDTVIIQQGDEGDFFYIIETGTVDFYVNDAKVSS
                     SSEGSSFGELALMYNSPRAATAVAATDVVCWALDRLTFRRILLEGTFNKRLMYEDFLK
                     DIEVLKSLSDHARSKLADALSTEMYHKGDKIVTEGEQGENFYLIESGNCQVYNEKLGN
                     IKQLTKGDYFGELALIKDLPRQATVEALDNVIVATLGKSGFQRLLGPVVEVLKEQDPT
                     KSQDPTAGH"
ORIGIN

AF317472  Length: 2679  February 16, 2004 17:02  Type: N  Check: 9369  ..

       1  GAATTCAAAA AATCAAAAAA ATCAAAAAAA AACCGTGGAA GGTAAGTTGT 

      51  ATATTTATAA ATCAACGTGA ATAATTTTCA ACACTGTGTC AACATCTGTG 

     101  AAAAAAACCT GTGTGTACTG CATATAGGAC CTCACCTATT ACGTAGAATA 

     151  TACTAGAAAT AGTTACAACC ATAAAAAGAT TAATTGTGCT TACGTGGCAA 

     201  CTTTGAGATT TTTCTTTTTT CTGTTTCTTT CTTTCTTTTT TTGGCTTAAA 

     251  CAACAAATGT CGCAAATTAT ACAAACGACA TTTGCTGCCC ATGTCATTTT 

     301  GTCGTTATCA CGTGAAGTGT CGCAGATTTA TGTATTCTCA CTTCATTTCT 

     351  ATGGTCATCA ATTGTTCATT CATTCTCTAT CTTCAAAAAT CTGTGATTTG 

     401  ATGATTTTGA TTAAAAGAAA GCAAAGAGAA TACTGAAAAA AAGCAAAGAG 

     451  AATATAGAAA AGAAACAATA AAAGAATAGT TTCTAAGTTA CTTTGGAGTC 

     501  TGCTATTACC ATGTATCTAT GTGATTGCCC TATCAAATTG GACAATACGG 

     551  GTTTTTGTTT AGTCACGATA ATCACAAACT TCCCCCAGCA ATGACATACG 

     601  TAGCAAGTAA TATTTATATC TCTTCTATTT TTTTGATCTT ACATAATCTG 

     651  TCGTGTTTTT TTAAGTTGTT GTTATGAAGA AGTAATTTCA TAATGATCAA 

     701  GTGTGTAACT GAAATTTCAT CGCAATTTTA AACAAACAAG CTAATAATTA 

     751  TTATTATTAA TAGTTAATTT GCTAAGTTGA GTAAAATTTG CTTTTCTTGA 

     801  GAAAAAGGAG AAATTACTTT GGGAGTGAGT TTGAAGAGAG AAACTAAAGT 

     851  AAGTAAATGA GTGAGAGGGA GAGACAGAGA GCGAGAGGGG GAGTAAAAAA 

     901  AAAAGTTGCC CACAAACAAA TTGTGATACC GGTCTTTTAG CATATATCTT 

     951  CTACTCTTCA ATCAACATCT TTACCAATGT CTAATCCTCA ACAACAATTC 

    1001  ATATCTGATG AATTGTCGCA GTTACAGAAA GAAATAATTT CCAAAAACCC 

    1051  GCAAGATGTC TTACAGTTTT GCGCCAACTA TTTCAACACC AAGTTACAAG 

    1101  CTCAAAGAAG TGAGTTATGG TCGCAACAAG CTAAAGCAGA AGCCGCAGGC 

    1151  ATCGACTTAT TCCCATCTGT TGATCATGTG AATGTTAATT CTAGTGGTGT 

    1201  GAGCATTGTG AATGATAGAC AACCAAGTTT TAAATCACCT TTTGGTGTTA 

    1251  ATGATCCACA TCTGAATCAC GACGAAGATC CCCATGCCAA AGATACCAAA 

    1301  ACAGATACTG CTGCTGCTGC TGTTGGTGGG GGTATTTTCA AATCAAATTT 

    1351  TGATGTTAAA AAGAGTGCTT CTAATCCTCC AACCAAGGAA GTAGATCCAG 

    1401  ATGACCCATC AAAACCATCG TCATCGAGCC AACCAAATCA ACAATCAGCA 

    1451  TCAGCATCAT CAAAAACGCC ATCATCAAAG ATCCCAGTTG CTTTCAACGC 

    1501  TAATAGAAGA ACATCTGTAT CTGCTGAAGC CTTGAATCCA GCAAAATTGA 

    1551  AATTAGATAG TTGGAAACCT CCAGTTAATA ATTTGAGCAT TACCGAAGAA 

    1601  GAAACATTAG CCAACAATTT AAAGAACAAT TTCCTTTTCA AACAATTGGA 

    1651  CGCAAACTCT AAGAAAACTG TGATTGCTGC TTTACAACAA AAATCATTTG 

    1701  CTAAAGATAC AGTAATTATC CAACAAGGTG ATGAAGGGGA CTTTTTTTAC 

    1751  ATTATTGAAA CTGGTACAGT TGATTTCTAT GTTAATGATG CTAAAGTAAG 

    1801  TTCCAGTAGC GAAGGGTCAT CTTTTGGGGA ATTGGCTTTG ATGTATAATT 

    1851  CACCAAGAGC TGCTACGGCA GTTGCTGCCA CCGATGTTGT CTGTTGGGCA 

    1901  TTGGACCGTT TGACATTCCG TCGAATTCTT TTGGAAGGTA CTTTTAACAA 

    1951  GAGATTGATG TACGAGGATT TCTTAAAAGA TATTGAGGTT TTGAAATCTC 

    2001  TTTCGGATCA TGCACGTTCA AAATTGGCAG ATGCATTGAG CACAGAAATG 

    2051  TATCACAAGG GTGATAAAAT AGTCACTGAA GGTGAACAAG GAGAGAACTT 

    2101  TTATTTAATA GAAAGTGGAA ACTGTCAAGT TTACAATGAA AAGTTGGGCA 

    2151  ATATCAAACA ATTAACAAAA GGTGATTATT TTGGTGAGCT TGCATTAATA 

    2201  AAAGACTTAC CAAGACAAGC TACTGTGGAA GCATTGGATA ATGTAATCGT 

    2251  TGCCACATTA GGTAAATCCG GGTTCCAAAG ATTATTGGGT CCTGTTGTGG 

    2301  AGGTATTGAA AGAACAAGAC CCTACAAAGA GTCAAGACCC AACTGCTGGT 

    2351  CATTAAGTGT ACAATAAGTA GTTGTTTATT ATCTTATATT GTTTTATGTT 

    2401  AGTATATTCT ATCTTTTTTT TTTTGGCTTA CTCACCTTCT GGTGTTTTCG 

    2451  TTGCGATTTT GATAATGGAT GGTTGGTGCA AAAGTTCAAC TACATTTCTT 

    2501  GTTGTCAGGT ATATACGAGA TGGCAGCATG AACGAGCTCA CCATGGGTTG 

    2551  AACATTATTG AAGTTATCCG GCCGTGCCTT TTGCGAAACA TGGTAACTAA 

    2601  TATATTGCAA ACTTGGCTTC TACAGAAAAT ATACAATCTA ATACCTTGAG 

    2651  GAATTTCCTC TATATATAAT AGAGAATTC

I'm not a GCG expert, but is this a correctly formatted GCG file in the first
place? If not, is this an error in the SeqIO parser?  I've found this behavior
to be the same on Solaris 8 and on Linux, both running BioPerl 1.4 and Perl
5.8.1.

Thanks a bunch,

Tex Thompson
RIT Bioinformatics

From wes.barris at csiro.au  Mon Feb 16 18:19:02 2004
From: wes.barris at csiro.au (Wes Barris)
Date: Mon Feb 16 18:28:10 2004
Subject: [Bioperl-l] ace.pm
Message-ID: <40314FE6.60906@csiro.au>

Hi,

ACE files generated by an application called tgicl have "CO"
lines of the form:

CO CL15Contig2 794 4 0 U

This line is not parsed properly by the ace.pm bioperl module.
Notice this line from Bio/Assembly/IO/ace.pm .

         (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found!

Bioperl expects the second "word" in the line to be "Contig\d+" where
the number is used as the "contigID".  Is there a reason why
"contigID" must be a number?  Why can't it be the whole second
"word" of the "CO" line?
-- 
Wes Barris
E-Mail: Wes.Barris@csiro.au


From jason at cgt.duhs.duke.edu  Mon Feb 16 21:14:28 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Mon Feb 16 21:20:20 2004
Subject: [Bioperl-l] ace.pm
In-Reply-To: <40314FE6.60906@csiro.au>
References: <40314FE6.60906@csiro.au>
Message-ID: <Pine.LNX.4.50.0402162110150.28176-100000@tenero.duhs.duke.edu>


People write code and modules to support the work they are doing,
sometimes for a specific data set - so I suspect Robson wrote this to
support the phrap ace format, which has a convention of naming contigs
ContigXX.

You are welcome to make changes to the code on your local system to get it
working and then post the diffs so they can be incorporated back in.  Why
not try changing the code as you have suggested and see if it works?  It
is a collaborative project and these modules are newish, so give it a try:
fix things and then get feedback on your fixes.

-jason

On Tue, 17 Feb 2004, Wes Barris wrote:

> Hi,
>
> ACE files generated by an application called tgicl have "CO"
> lines of the form:
>
> CO CL15Contig2 794 4 0 U
>
> This line is not parsed properly by the ace.pm bioperl module.
> Notice this line from Bio/Assembly/IO/ace.pm .
>
>          (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found!
>
> Bioperl expects the second "word" in the line to be "Contig\d+" where
> the number is used as the "contigID".  Is there a reason why
> "contigID" must be a number?  Why can't it be the whole second
> "word" of the "CO" line?
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From john.herbert at clinical-pharmacology.oxford.ac.uk  Thu Feb 12 06:50:45 2004
From: john.herbert at clinical-pharmacology.oxford.ac.uk (john herbert)
Date: Mon Feb 16 21:22:34 2004
Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM
Message-ID: <s02b68a5.072@gwmail.jr2.ox.ac.uk>

Hello BioPerl.
I was thinking of using BioPerl to calculate the Tm of my primers. I
looked in the documentation and found the method for getting a
primer's Tm: my $tm = $primer->Tm;

In the notes for this method, the author is confused by the fact that a
BioPerl-calculated Tm never matches Primer3's prediction. This
is because Primer3 uses a different method to calculate a primer's Tm
than the one used in BioPerl.
 
BioPerl $primer->Tm is calculated using:
Tm = 81.5 + 16.6(log10([Na+])) + .41*(%GC) - 600/length
(Sambrook, Fritsch and Maniatis, Molecular Cloning, p 11.46 (1989, CSHL
Press)).
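
As a rough worked example of the Sambrook-style formula quoted above,
here is a small sketch in plain Perl. The sub name, the example 20-mer
and the 0.05 M Na+ concentration are all assumptions made for
illustration; this is not the code inside Bio::SeqFeature::Primer, and
it does not attempt the thermodynamic method that Primer3 uses.

```perl
use strict;
use warnings;

# Sambrook-style estimate as quoted above:
#   Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 600/length
# sambrook_tm is a hypothetical helper, not BioPerl code.
sub sambrook_tm {
    my ($seq, $na_molar) = @_;
    my $length = length $seq;
    die "empty sequence\n" unless $length;
    die "salt concentration must be positive\n" unless $na_molar > 0;

    my $gc_count = ($seq =~ tr/GCgc//);       # count G and C bases
    my $pct_gc   = 100 * $gc_count / $length; # percentage, not fraction
    my $log10_na = log($na_molar) / log(10);  # Perl's log() is natural log

    return 81.5 + 16.6 * $log10_na + 0.41 * $pct_gc - 600 / $length;
}

# Made-up 20-mer with 50% GC at an assumed 0.05 M Na+.
printf "Tm = %.1f\n", sambrook_tm('ATGCATGCATGCATGCATGC', 0.05);
```

Note that %GC enters as a percentage (0 to 100) and [Na+] in mol/l, so
the salt term is negative for any concentration below 1 M.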
 
Primer3 primer Tm: Primer3 uses the oligo melting temperature formula
given in Rychlik, Spencer and Rhoads, Nucleic Acids Research, vol 18,
num 12, pp 6409-6412 and Breslauer, Frank, Bloeker and Marky, Proc.
Natl. Acad. Sci. USA, vol 83, pp 3746-3750. This method uses
thermodynamics to calculate the Tm.
 
Primer3 does use the Sambrook method to predict the Tm of the PCR
products, but not the primer Tm.
 
Would it be possible to update BioPerl to match how Primer3
calculates the primer Tm?
 
Kind regards,
 
John Herbert
 
Cancer Research UK.

From walsh at cenix-bioscience.com  Mon Feb 16 11:16:14 2004
From: walsh at cenix-bioscience.com (Andrew Walsh)
Date: Mon Feb 16 21:26:43 2004
Subject: [Bioperl-l] get sequence failure
In-Reply-To: <20040216154238.4435.qmail@web25209.mail.ukl.yahoo.com>
References: <20040216154238.4435.qmail@web25209.mail.ukl.yahoo.com>
Message-ID: <4030ECCE.8090608@cenix-bioscience.com>

Hi,

I think the problem could be that you are trying to retrieve a contig 
(NT number). I remember Bio::DB::GenBank used to have problems with 
those, but I'm not sure about now.

And from reading the Bio::Perl::get_sequence POD, I see:

         get_sequence

         Title   : get_sequence
         Usage   : $seq_object = get_sequence('swiss',"ROA1_HUMAN");
...

         Args    : database type - one of swiss, embl, genbank or refseq
                   identifier or accession number


Are you sure 'nr' is a valid database type?

Cheers,

Andrew


william ritchie wrote:
> Hi,
> 
> I m trying to retrieve a sequence with the following
> code:
> 
> use Bio::SearchIO;
> use Bio::Perl;
> use Bio::SeqIO;
> 
> my $seq_object = get_sequence("nr","NT_039208.2");
> 
> and I m getting this error:
> 
> MSG: id does not exist
> STACK Bio::DB::WebDBSeqI::get_Seq_by_id
> /usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:155
> STACK Bio::Perl::get_sequence
> /usr/lib/perl5/site_perl/5.8.0/Bio/Perl.pm:513
> STACK toplevel getseq.pl:9
> 
> this ID does exist, using the ncbi interface I can
> retrieve it!
> Help!!!!Please!!


-- 
------------------------------------------------------------------
Andrew Walsh, M.Sc.
Bioinformatics Software Engineer
IT Unit
Cenix BioScience GmbH
Pfotenhauerstr. 108
01307 Dresden, Germany
Tel. +49(351)210-2699
Fax  +49(351)210-1309

public key: http://www.cenix-bioscience.com/public_keys/walsh.gpg
------------------------------------------------------------------


From wes.barris at csiro.au  Mon Feb 16 17:42:54 2004
From: wes.barris at csiro.au (Wes Barris)
Date: Mon Feb 16 21:26:44 2004
Subject: [Bioperl-l] Bioperl and ACE files
In-Reply-To: <Pine.LNX.4.50.0402160735290.23581-100000@tenero.duhs.duke.edu>
References: <Pine.LNX.4.50.0402160735290.23581-100000@tenero.duhs.duke.edu>
Message-ID: <4031476E.9020900@csiro.au>

Jason Stajich wrote:

> As always, more code and information as to how you got here makes it
> easier for someone to answer.

I have attached the perl script that I am using.
The sample ACE file is available here:

	http://www.livestockgenomics.csiro.au/junk.ace

You might run this script like this:

acetest.pl junk.ace outdir

> 
> Not really sure how you are getting to the point where you have created
> Bio::LocatableSeq objects - presumably you are trying to do an assembly
> so I'll guess you got there from Bio::Assembly::IO.
> 
> You may need to get help from Robson about what the format is supposed to
> support.  A start of 0 is not really proper in Bioperl -
> sequences/features start at 1 in our system, so the assembly code needs to
> adjust for that.  presumably those numbers are offsets not actual start
> positions so the parsing code may need some looking at.
> 
> -jason
> 
> 
> On Mon, 16 Feb 2004, Wes Barris wrote:
> 
>  > Hi,
>  >
>  > I have an ACE file that I am trying to process with bioperl.  A portion
>  > of the ACE file looks like this:
>  >
>  > AF CB429506 U 2
>  > AF CB428704 U 6
>  > AF CB430643 U 1
>  > AF CB431187 U 0
>  > AF CB430639 U -7
>  > AF CB430480 C 24
>  > AF CB430055 U 10
>  >
>  > Notice the line in the middle that shows a starting position of '0'
>  > (zero)?  When bioperl tries to process this sequence, an error is
>  > thrown.  I have found the part of the bioperl code that throws the
>  > error:
>  > Bio/LocatableSeq.pm:
>  > sub get_nse{
>  >     my ($self,$char1,$char2) = @_;
>  >
>  >     $char1 ||= "/";
>  >     $char2 ||= "-";
>  >
>  >     $self->throw("Attribute id not set") unless $self->id();
>  >     $self->throw("Attribute start not set") unless $self->start();<-----
>  >     $self->throw("Attribute end not set") unless $self->end();
>  >
>  >     return $self->id() . $char1 . $self->start . $char2 . $self->end ;
>  >
>  > }
>  >
>  > Notice how "$self->start()" is tested.  When it encounters a sequence
>  > whose start is set to zero, an error is thrown.
>  >
>  > I don't know much about the ACE file format.  Do I have a questionable
>  > ACE file or is this test incomplete?
>  >
> 
> -- 
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
> 


-- 
Wes Barris
E-Mail: Wes.Barris@csiro.au
-------------- next part --------------
#!/usr/local/bin/perl -w
#
#
use strict;
use Bio::Assembly::IO;
use Bio::AlignIO;
use Bio::SeqIO;
use Bio::Seq;
use Bio::SimpleAlign;
use Bio::LocatableSeq;
#
my $usage = "Usage: $0 <infile.ace> <outdir>\n";
my $infile = shift or die $usage;
my $outdir = shift or die $usage;
my $prefix = 'cn';
my $ext = 'msf';
mkdir $outdir, 0755 if (! -d $outdir);
my $io = new Bio::Assembly::IO(-file=>$infile, -format=>'ace');
my $assembly = $io->next_assembly;	# Bio::Assembly::ScaffoldI

foreach my $contig ($assembly->all_contigs()) {	# Bio::Assembly::Contig
   my $contigName = $prefix.($contig->id);
#
# Write the consensus to a file.
#
   my $consensusSeq = new Bio::Seq(
			-seq=>$contig->get_consensus_sequence->seq,
			-id=>$contigName);
   my $seqout = new Bio::SeqIO(-file=>">$outdir/$contigName.fa", -format=>'fasta');
   $seqout->write_seq($consensusSeq);
#
# Make the consensus the first sequence of the simple align object.
#
   my $aln = new Bio::SimpleAlign();
   $aln->id('alignment.msf');
   $contig->get_consensus_sequence->id($contigName);
   $aln->add_seq($contig->get_consensus_sequence);
#
# Loop through each sequence in the contig adding it to the new alignment.
#
   foreach my $seq ($contig->each_seq) {
      my $id;
      if ($seq->display_id =~ /\|/) {
         my @junk = split(/[\|\.]/, $seq->display_id);
         $id = $junk[3];
         }
      else {
         $id = $seq->display_id;
         }
      my $lseq = new Bio::LocatableSeq(
				-seq=>$seq->seq,
				-id=>$id,
				-start=>$contig->get_seq_coord($seq)->start,
				-end=>$contig->get_seq_coord($seq)->end,
				);
      &alignSeq($lseq,$contig->get_consensus_sequence->length);
      $aln->add_seq($lseq);
      }
   $aln->set_displayname_flat;
   my $outstream = new Bio::AlignIO(-format=>'msf', -file=>">$outdir/$contigName.$ext");
   $outstream->write_aln($aln);
   undef $outstream;
   }


exit;

sub alignSeq {
   my ($lseq, $cnlength) = @_;
#
# Clip any sequence that begins before the consensus.
#
   if ($lseq->start <= 0) {
      my $offset = -$lseq->start + 2;
      $lseq->seq($lseq->subseq($offset,$lseq->length));
      print($lseq->display_id," was clipped at the beginning by $offset\n");
      }
#
# Pad each sequence so it aligns with the consensus.
#
   my $before = $lseq->start - 1;
   my $after = $cnlength - $lseq->end;
   my $alignedSequence = '-' x $before . $lseq->seq . '-' x $after;
#
# Trim any sequence that extends beyond the consensus.
#
   if (length($alignedSequence) > $cnlength) {
      $alignedSequence = substr($alignedSequence, 0 ,$cnlength);
      print($lseq->display_id," was clipped at the end\n");
      }
   $lseq->seq($alignedSequence);

   return;
   }
From lhaifeng at dso.org.sg  Mon Feb 16 22:14:19 2004
From: lhaifeng at dso.org.sg (Liu Haifeng)
Date: Mon Feb 16 22:31:03 2004
Subject: [Bioperl-l] align and profile_align
Message-ID: <001a01c3f504$225cd790$706712ac@GENETHON>

Hi,

Can anybody help me understand the difference between the methods "align"
and "profile_align" when using Bio::Tools::Run::Alignment::Clustalw to align
multiple protein sequences?

If I have 5 protein sequences, would the 2 options below be the same?
1. align the 5 sequences together
2. align the first 4 sequences, then profile_align the resulting alignment
with the last sequence

I have tested the two options, and the consensus_strings obtained are
different. So can anybody tell me how profile_align can be useful?  It seems
that profile_align may save some computation.

Thanks!

Haifeng Liu

From wes.barris at csiro.au  Mon Feb 16 22:35:49 2004
From: wes.barris at csiro.au (Wes Barris)
Date: Mon Feb 16 22:41:46 2004
Subject: [Bioperl-l] ace.pm
In-Reply-To: <Pine.LNX.4.50.0402162110150.28176-100000@tenero.duhs.duke.edu>
References: <Pine.LNX.4.50.0402162110150.28176-100000@tenero.duhs.duke.edu>
Message-ID: <40318C15.8020903@csiro.au>

Jason Stajich wrote:

> 
> People write code and modules to support the work they are doing,
> sometimes for a specific data set - so I suspect Robson wrote this to
> support phrap ace format which has a convention of them being ContigXX.
> 
> You are welcome to make changes to code on your local system to get it
> working and then post the diffs so they can be incorporated back in.  Why
> not try changing the code as you have noticed and seeing if it works.  It
> is a collaborative project and these modules are newish, so give a try
> fixing things and then getting feedback on your fixes.

I have modified one line in Bio/Assembly/IO/ace.pm as shown below:

         # Loading contig sequence (COntig sequence field)
#       (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found!
         (/^CO (\w+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found!

The change will cause the contigID to be whatever the second field of
this line is (CO CL15Contig1 794 4 0 U).  In this case, it would be
set to "CL15Contig1".
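
To make the effect of the change concrete, here is a small standalone
sketch that runs the old and the modified pattern over a phrap-style
and a tgicl-style CO line. The two wrapper subs are hypothetical
helpers written for this illustration; only the regular expressions
come from the lines shown above.

```perl
use strict;
use warnings;

# Original pattern: captures only the digits after a literal "Contig".
sub old_contig_id {
    my ($line) = @_;
    my ($id) = $line =~ /^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/;
    return $id;    # undef when the line does not match
}

# Modified pattern: captures the whole second field.
sub new_contig_id {
    my ($line) = @_;
    my ($id) = $line =~ /^CO (\w+) (\d+) (\d+) (\d+) (\w+)/;
    return $id;    # undef when the line does not match
}

for my $line ('CO Contig2 794 4 0 U', 'CO CL15Contig2 794 4 0 U') {
    my $old = old_contig_id($line);
    my $new = new_contig_id($line);
    printf "%-26s old=%s new=%s\n", $line,
        defined $old ? $old : '(no match)',
        defined $new ? $new : '(no match)';
}
```

One side effect worth keeping in mind: for phrap-style lines the ID
becomes "Contig2" rather than "2", so any code that assumes a purely
numeric contigID may need checking.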


> 
> -jason
> 
> On Tue, 17 Feb 2004, Wes Barris wrote:
> 
>  > Hi,
>  >
>  > ACE files generated by an application called tgicl have "CO"
>  > lines of the form:
>  >
>  > CO CL15Contig2 794 4 0 U
>  >
>  > This line is not parsed properly by the ace.pm bioperl module.
>  > Notice this line from Bio/Assembly/IO/ace.pm .
>  >
>  >          (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found!
>  >
>  > Bioperl expects the second "word" in the line to be "Contig\d+" where
>  > the number is used as the "contigID".  Is there a reason why
>  > "contigID" must be a number?  Why can't it be the whole second
>  > "word" of the "CO" line?
>  >
> 
> -- 
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
> 


-- 
Wes Barris
E-Mail: Wes.Barris@csiro.au

From redwards at utmem.edu  Mon Feb 16 22:40:40 2004
From: redwards at utmem.edu (Rob Edwards)
Date: Mon Feb 16 22:46:24 2004
Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM
In-Reply-To: <s02b68a5.072@gwmail.jr2.ox.ac.uk>
References: <s02b68a5.072@gwmail.jr2.ox.ac.uk>
Message-ID: <0E84084E-60FB-11D8-B87B-000A959E1622@utmem.edu>

I wrote this implementation of the Bio::SeqFeature::Primer module, and 
I am guilty as charged. I was confused by the primer3 documentation, 
but hey, at least I was honest in the docs - I said that I couldn't 
figure out what was going on.

I don't know what the formula is that Primer3 used. If you have some 
code for this I can update the module.

Rob


On Feb 12, 2004, at 5:50 AM, john herbert wrote:

> Hello BioPerl.
> I was thinking of using BioPerl to calculate the Tm of my primers. I
> looked from the documentation and found the method for getting a
> primer's Tm. my $tm = $primer->Tm;
>
> In the notes of this method, the author is confused by the fact that a
> BioPerl calculated Tm never matches that of Primer3's prediction. This
> is because Primer3 uses a different method to calculate a Primers TM
> than the method used in the BioPerl.
>
>  BioPerl $primer->Tm = Calculated using: Tm = 81.5 + 16.6(log10([Na+]))
> + .41*(%GC) - 600/length
> (Sambrook, Fritsch and Maniatis, Molecular Cloning, p 11.46 (1989, CSHL
> Press).
>
> Primer3 primer TM = Primer3 uses the oligo melting temperature formula
> given in Rychlik, Spencer and Rhoads, Nucleic Acids Research, vol 18,
> num 12, pp 6409-6412 and Breslauer, Frank, Bloeker and Marky, Proc.
> Natl. Acad. Sci. USA, vol 83, pp 3746-3750. This method uses
> thermodynamics to calc TM.
>
> Primer3 does use the Sambrook method to predict the TM of the PCR
> products but not the Primer TM.
>
> Would it be possible to update the BioPerl to match how Primer3
> calculates the Primer TM?
>
> Kind regards,
>
> John Herbert
>
> Cancer Research UK.

From hlapp at gmx.net  Tue Feb 17 00:03:45 2004
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue Feb 17 00:09:33 2004
Subject: [Bioperl-l] Bug in GCG SeqIO Formatting?
In-Reply-To: <Pine.LNX.4.44.0402161732190.12085-100000@biosysadmin.net>
Message-ID: <AA27C97E-6106-11D8-846B-000A959EB4C4@gmx.net>

Rule #1: If your code doesn't work the way you think it should, or 
fails with an exception, and you do want help from the mailing list, 
then be sure to send along the *complete* output, in particular the 
stack trace if there was any.

Rule #2: Double check that you followed rule #1.

Rule #3: Check again that you followed rule #1.

There really aren't any other rules here. If you choose not to follow 
rule #1 you indicate that you're not actually interested in getting 
help.

	-hilmar

On Monday, February 16, 2004, at 02:49  PM, Tex Thompson wrote:

> Hello Mailing List,
>
> I have a user complaining that the following code isn't working on his
> GCG-formatted sequence files:
>
> #!/usr/bin/perl
>
> use strict;
>
> use Bio::SeqIO;
> my $io  = Bio::SeqIO->new( -file => "af317472.gbpln3", -format => "gcg");
> my $out = Bio::SeqIO->new( -fh => \*STDOUT, -format => "fasta" );
>
> while ( my $seq = $io->next_seq ) {
>    $out->write_seq( $seq );
> }
>
> Here's an example sequence file:
>
> !!NA_SEQUENCE 1.0
> LOCUS       AF317472                2679 bp    DNA     linear   PLN 07-DEC-2000
> DEFINITION  Candida albicans cAMP-dependent protein kinase regulatory subunit
>             (PKA-R) gene, complete cds.
> ACCESSION   AF317472
> VERSION     AF317472.1  GI:11596392
> KEYWORDS    .
> SOURCE      Candida albicans
>   ORGANISM  Candida albicans
>             Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;
>             Saccharomycetales; mitosporic Saccharomycetales; Candida.
> REFERENCE   1  (bases 1 to 2679)
>   AUTHORS   Giasson,L. and Parrot,M.
>   TITLE     Sequence of the Candida albicans cAMP-dependent protein kinase
>             regulatory subunit
>   JOURNAL   Unpublished
> REFERENCE   2  (bases 1 to 2679)
>   AUTHORS   Giasson,L. and Parrot,M.
>   TITLE     Direct Submission
>   JOURNAL   Submitted (27-OCT-2000) School of Dentistry, Laval University,
>             GREB, Ste-Foy, Quebec G1K 7P4, Canada
> FEATURES             Location/Qualifiers
>      source          1. .2679
>                      /organism="Candida albicans"
>                      /mol_type="genomic DNA"
>                      /strain="CAI4"
>                      /db_xref="taxon:5476"
>      gene            <977. .>2356
>                      /gene="PKA-R"
>      mRNA            <977. .>2356
>                      /gene="PKA-R"
>                      /product="cAMP-dependent protein kinase regulatory
>                      subunit"
>      CDS             977. .2356
>                      /gene="PKA-R"
>                      /codon_start=1
>                      /transl_table=12
>                      /product="cAMP-dependent protein kinase regulatory
>                      subunit"
>                      /protein_id="AAG38599.1"
>                      /db_xref="GI:11596393"
>                      /translation="MSNPQQQFISDELSQLQKEIISKNPQDVLQFCANYFNTKLQAQR
>                      SELWSQQAKAEAAGIDLFPSVDHVNVNSSGVSIVNDRQPSFKSPFGVNDPHSNHDEDP
>                      HAKDTKTDTAAAAVGGGIFKSNFDVKKSASNPPTKEVDPDDPSKPSSSSQPNQQSASA
>                      SSKTPSSKIPVAFNANRRTSVSAEALNPAKLKLDSWKPPVNNLSITEEETLANNLKNN
>                      FLFKQLDANSKKTVIAALQQKSFAKDTVIIQQGDEGDFFYIIETGTVDFYVNDAKVSS
>                      SSEGSSFGELALMYNSPRAATAVAATDVVCWALDRLTFRRILLEGTFNKRLMYEDFLK
>                      DIEVLKSLSDHARSKLADALSTEMYHKGDKIVTEGEQGENFYLIESGNCQVYNEKLGN
>                      IKQLTKGDYFGELALIKDLPRQATVEALDNVIVATLGKSGFQRLLGPVVEVLKEQDPT
>                      KSQDPTAGH"
> ORIGIN
>
> AF317472  Length: 2679  February 16, 2004 17:02  Type: N  Check: 9369  ..
>
>        1  GAATTCAAAA AATCAAAAAA ATCAAAAAAA AACCGTGGAA GGTAAGTTGT
>
>       51  ATATTTATAA ATCAACGTGA ATAATTTTCA ACACTGTGTC AACATCTGTG
>
>      101  AAAAAAACCT GTGTGTACTG CATATAGGAC CTCACCTATT ACGTAGAATA
>
>      151  TACTAGAAAT AGTTACAACC ATAAAAAGAT TAATTGTGCT TACGTGGCAA
>
>      201  CTTTGAGATT TTTCTTTTTT CTGTTTCTTT CTTTCTTTTT TTGGCTTAAA
>
>      251  CAACAAATGT CGCAAATTAT ACAAACGACA TTTGCTGCCC ATGTCATTTT
>
>      301  GTCGTTATCA CGTGAAGTGT CGCAGATTTA TGTATTCTCA CTTCATTTCT
>
>      351  ATGGTCATCA ATTGTTCATT CATTCTCTAT CTTCAAAAAT CTGTGATTTG
>
>      401  ATGATTTTGA TTAAAAGAAA GCAAAGAGAA TACTGAAAAA AAGCAAAGAG
>
>      451  AATATAGAAA AGAAACAATA AAAGAATAGT TTCTAAGTTA CTTTGGAGTC
>
>      501  TGCTATTACC ATGTATCTAT GTGATTGCCC TATCAAATTG GACAATACGG
>
>      551  GTTTTTGTTT AGTCACGATA ATCACAAACT TCCCCCAGCA ATGACATACG
>
>      601  TAGCAAGTAA TATTTATATC TCTTCTATTT TTTTGATCTT ACATAATCTG
>
>      651  TCGTGTTTTT TTAAGTTGTT GTTATGAAGA AGTAATTTCA TAATGATCAA
>
>      701  GTGTGTAACT GAAATTTCAT CGCAATTTTA AACAAACAAG CTAATAATTA
>
>      751  TTATTATTAA TAGTTAATTT GCTAAGTTGA GTAAAATTTG CTTTTCTTGA
>
>      801  GAAAAAGGAG AAATTACTTT GGGAGTGAGT TTGAAGAGAG AAACTAAAGT
>
>      851  AAGTAAATGA GTGAGAGGGA GAGACAGAGA GCGAGAGGGG GAGTAAAAAA
>
>      901  AAAAGTTGCC CACAAACAAA TTGTGATACC GGTCTTTTAG CATATATCTT
>
>      951  CTACTCTTCA ATCAACATCT TTACCAATGT CTAATCCTCA ACAACAATTC
>
>     1001  ATATCTGATG AATTGTCGCA GTTACAGAAA GAAATAATTT CCAAAAACCC
>
>     1051  GCAAGATGTC TTACAGTTTT GCGCCAACTA TTTCAACACC AAGTTACAAG
>
>     1101  CTCAAAGAAG TGAGTTATGG TCGCAACAAG CTAAAGCAGA AGCCGCAGGC
>
>     1151  ATCGACTTAT TCCCATCTGT TGATCATGTG AATGTTAATT CTAGTGGTGT
>
>     1201  GAGCATTGTG AATGATAGAC AACCAAGTTT TAAATCACCT TTTGGTGTTA
>
>     1251  ATGATCCACA TCTGAATCAC GACGAAGATC CCCATGCCAA AGATACCAAA
>
>     1301  ACAGATACTG CTGCTGCTGC TGTTGGTGGG GGTATTTTCA AATCAAATTT
>
>     1351  TGATGTTAAA AAGAGTGCTT CTAATCCTCC AACCAAGGAA GTAGATCCAG
>
>     1401  ATGACCCATC AAAACCATCG TCATCGAGCC AACCAAATCA ACAATCAGCA
>
>     1451  TCAGCATCAT CAAAAACGCC ATCATCAAAG ATCCCAGTTG CTTTCAACGC
>
>     1501  TAATAGAAGA ACATCTGTAT CTGCTGAAGC CTTGAATCCA GCAAAATTGA
>
>     1551  AATTAGATAG TTGGAAACCT CCAGTTAATA ATTTGAGCAT TACCGAAGAA
>
>     1601  GAAACATTAG CCAACAATTT AAAGAACAAT TTCCTTTTCA AACAATTGGA
>
>     1651  CGCAAACTCT AAGAAAACTG TGATTGCTGC TTTACAACAA AAATCATTTG
>
>     1701  CTAAAGATAC AGTAATTATC CAACAAGGTG ATGAAGGGGA CTTTTTTTAC
>
>     1751  ATTATTGAAA CTGGTACAGT TGATTTCTAT GTTAATGATG CTAAAGTAAG
>
>     1801  TTCCAGTAGC GAAGGGTCAT CTTTTGGGGA ATTGGCTTTG ATGTATAATT
>
>     1851  CACCAAGAGC TGCTACGGCA GTTGCTGCCA CCGATGTTGT CTGTTGGGCA
>
>     1901  TTGGACCGTT TGACATTCCG TCGAATTCTT TTGGAAGGTA CTTTTAACAA
>
>     1951  GAGATTGATG TACGAGGATT TCTTAAAAGA TATTGAGGTT TTGAAATCTC
>
>     2001  TTTCGGATCA TGCACGTTCA AAATTGGCAG ATGCATTGAG CACAGAAATG
>
>     2051  TATCACAAGG GTGATAAAAT AGTCACTGAA GGTGAACAAG GAGAGAACTT
>
>     2101  TTATTTAATA GAAAGTGGAA ACTGTCAAGT TTACAATGAA AAGTTGGGCA
>
>     2151  ATATCAAACA ATTAACAAAA GGTGATTATT TTGGTGAGCT TGCATTAATA
>
>     2201  AAAGACTTAC CAAGACAAGC TACTGTGGAA GCATTGGATA ATGTAATCGT
>
>     2251  TGCCACATTA GGTAAATCCG GGTTCCAAAG ATTATTGGGT CCTGTTGTGG
>
>     2301  AGGTATTGAA AGAACAAGAC CCTACAAAGA GTCAAGACCC AACTGCTGGT
>
>     2351  CATTAAGTGT ACAATAAGTA GTTGTTTATT ATCTTATATT GTTTTATGTT
>
>     2401  AGTATATTCT ATCTTTTTTT TTTTGGCTTA CTCACCTTCT GGTGTTTTCG
>
>     2451  TTGCGATTTT GATAATGGAT GGTTGGTGCA AAAGTTCAAC TACATTTCTT
>
>     2501  GTTGTCAGGT ATATACGAGA TGGCAGCATG AACGAGCTCA CCATGGGTTG
>
>     2551  AACATTATTG AAGTTATCCG GCCGTGCCTT TTGCGAAACA TGGTAACTAA
>
>     2601  TATATTGCAA ACTTGGCTTC TACAGAAAAT ATACAATCTA ATACCTTGAG
>
>     2651  GAATTTCCTC TATATATAAT AGAGAATTC
>
> I'm not a GCG expert, but is this a correctly formatted GCG file in the
> first place? If not, is this an error in the SeqIO parser?  I've found
> this behavior to be the same on Solaris 8 and on Linux, both running
> BioPerl 1.4 and Perl 5.8.1.
>
> Thanks a bunch,
>
> Tex Thompson
> RIT Bioinformatics
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From tex at biosysadmin.com  Tue Feb 17 01:21:51 2004
From: tex at biosysadmin.com (Tex Thompson)
Date: Tue Feb 17 01:17:14 2004
Subject: [Bioperl-l] Bug in GCG SeqIO Formatting?
In-Reply-To: <AA27C97E-6106-11D8-846B-000A959EB4C4@gmx.net>
Message-ID: <Pine.LNX.4.44.0402170115070.6653-100000@biosysadmin.net>

Hilmar,

Thanks for the tip. There are no stack errors, but here is the output from the
test program shown below:


>AF317472 !!NA_SEQUENCE 1.0LOCUS       AF317472                2679 bp    DNA     linear   PLN 07-DEC-2000DEFINITION  Candida albicans cAMP-dependent protein kinase regulatory subunit            (PKA-R) gene, complete cds.ACCESSION   AF317472VERSION     AF317472.1  GI:11596392KEYWORDS    .SOURCE      Candida albicans  ORGANISM  Candida albicans            Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;            Saccharomycetales; mitosporic Saccharomycetales; Candida.REFERENCE   1  (bases 1 to 2679)  AUTHORS   Giasson,L. and Parrot,M.  TITLE     Sequence of the Candida albicans cAMP-dependent protein kinase            regulatory subunit  JOURNAL   UnpublishedREFERENCE   2  (bases 1 to 2679)  AUTHORS   Giasson,L. and Parrot,M.  TITLE     Direct Submission  JOURNAL   Submitted (27-OCT-2000) School of Dentistry, Laval University,            GREB, Ste-Foy, Quebec G1K 7P4, CanadaFEATURES             Location/Qualifiers     source          1. .2679               
       /organism="Candida albicans"                     /mol_type="genomic DNA"                     /strain="CAI4"                     /db_xref="taxon:5476"     gene            <977. .>2356                     /gene="PKA-R"     mRNA            <977. .>2356                     /gene="PKA-R"                     /product="cAMP-dependent protein kinase regulatory                     subunit"     CDS             977. .2356                     /gene="PKA-R"                     /codon_start=1                     /transl_table=12                     /product="cAMP-dependent protein kinase regulatory                     subunit"                     /protein_id="AAG38599.1"                     /db_xref="GI:11596393"                     /translation="MSNPQQQFISDELSQLQKEIISKNPQDVLQFCANYFNTKLQAQR                     SELWSQQAKAEAAGIDLFPSVDHVNVNSSGVSIVNDRQPSFKSPFGVNDPHSNHDEDP                     HAKDTKTDTAAAAVGGGIFKSNFDVKKSASNPPTKEVDPDDPSKPSSSSQPNQQSASA                     SSKTPSSKIPVAFNANR
 RTSVSAEALNPAKLKLDSWKPPVNNLSITEEETLANNLKNN                     FLFKQLDANSKKTVIAALQQKSFAKDTVIIQQGDEGDFFYIIETGTVDFYVNDAKVSS                     SSEGSSFGELALMYNSPRAATAVAATDVVCWALDRLTFRRILLEGTFNKRLMYEDFLK                     DIEVLKSLSDHARSKLADALSTEMYHKGDKIVTEGEQGENFYLIESGNCQVYNEKLGN                     IKQLTKGDYFGELALIKDLPRQATVEALDNVIVATLGKSGFQRLLGPVVEVLKEQDPT                     KSQDPTAGH"ORIGIN
GAATTCAAAAAATCAAAAAAATCAAAAAAAAACCGTGGAAGGTAAGTTGTATATTTATAA
ATCAACGTGAATAATTTTCAACACTGTGTCAACATCTGTGAAAAAAACCTGTGTGTACTG
CATATAGGACCTCACCTATTACGTAGAATATACTAGAAATAGTTACAACCATAAAAAGAT
TAATTGTGCTTACGTGGCAACTTTGAGATTTTTCTTTTTTCTGTTTCTTTCTTTCTTTTT
TTGGCTTAAACAACAAATGTCGCAAATTATACAAACGACATTTGCTGCCCATGTCATTTT
GTCGTTATCACGTGAAGTGTCGCAGATTTATGTATTCTCACTTCATTTCTATGGTCATCA
ATTGTTCATTCATTCTCTATCTTCAAAAATCTGTGATTTGATGATTTTGATTAAAAGAAA
GCAAAGAGAATACTGAAAAAAAGCAAAGAGAATATAGAAAAGAAACAATAAAAGAATAGT
TTCTAAGTTACTTTGGAGTCTGCTATTACCATGTATCTATGTGATTGCCCTATCAAATTG
GACAATACGGGTTTTTGTTTAGTCACGATAATCACAAACTTCCCCCAGCAATGACATACG
TAGCAAGTAATATTTATATCTCTTCTATTTTTTTGATCTTACATAATCTGTCGTGTTTTT
TTAAGTTGTTGTTATGAAGAAGTAATTTCATAATGATCAAGTGTGTAACTGAAATTTCAT
CGCAATTTTAAACAAACAAGCTAATAATTATTATTATTAATAGTTAATTTGCTAAGTTGA
GTAAAATTTGCTTTTCTTGAGAAAAAGGAGAAATTACTTTGGGAGTGAGTTTGAAGAGAG
AAACTAAAGTAAGTAAATGAGTGAGAGGGAGAGACAGAGAGCGAGAGGGGGAGTAAAAAA
AAAAGTTGCCCACAAACAAATTGTGATACCGGTCTTTTAGCATATATCTTCTACTCTTCA
ATCAACATCTTTACCAATGTCTAATCCTCAACAACAATTCATATCTGATGAATTGTCGCA
GTTACAGAAAGAAATAATTTCCAAAAACCCGCAAGATGTCTTACAGTTTTGCGCCAACTA
TTTCAACACCAAGTTACAAGCTCAAAGAAGTGAGTTATGGTCGCAACAAGCTAAAGCAGA
AGCCGCAGGCATCGACTTATTCCCATCTGTTGATCATGTGAATGTTAATTCTAGTGGTGT
GAGCATTGTGAATGATAGACAACCAAGTTTTAAATCACCTTTTGGTGTTAATGATCCACA
TCTGAATCACGACGAAGATCCCCATGCCAAAGATACCAAAACAGATACTGCTGCTGCTGC
TGTTGGTGGGGGTATTTTCAAATCAAATTTTGATGTTAAAAAGAGTGCTTCTAATCCTCC
AACCAAGGAAGTAGATCCAGATGACCCATCAAAACCATCGTCATCGAGCCAACCAAATCA
ACAATCAGCATCAGCATCATCAAAAACGCCATCATCAAAGATCCCAGTTGCTTTCAACGC
TAATAGAAGAACATCTGTATCTGCTGAAGCCTTGAATCCAGCAAAATTGAAATTAGATAG
TTGGAAACCTCCAGTTAATAATTTGAGCATTACCGAAGAAGAAACATTAGCCAACAATTT
AAAGAACAATTTCCTTTTCAAACAATTGGACGCAAACTCTAAGAAAACTGTGATTGCTGC
TTTACAACAAAAATCATTTGCTAAAGATACAGTAATTATCCAACAAGGTGATGAAGGGGA
CTTTTTTTACATTATTGAAACTGGTACAGTTGATTTCTATGTTAATGATGCTAAAGTAAG
TTCCAGTAGCGAAGGGTCATCTTTTGGGGAATTGGCTTTGATGTATAATTCACCAAGAGC
TGCTACGGCAGTTGCTGCCACCGATGTTGTCTGTTGGGCATTGGACCGTTTGACATTCCG
TCGAATTCTTTTGGAAGGTACTTTTAACAAGAGATTGATGTACGAGGATTTCTTAAAAGA
TATTGAGGTTTTGAAATCTCTTTCGGATCATGCACGTTCAAAATTGGCAGATGCATTGAG
CACAGAAATGTATCACAAGGGTGATAAAATAGTCACTGAAGGTGAACAAGGAGAGAACTT
TTATTTAATAGAAAGTGGAAACTGTCAAGTTTACAATGAAAAGTTGGGCAATATCAAACA
ATTAACAAAAGGTGATTATTTTGGTGAGCTTGCATTAATAAAAGACTTACCAAGACAAGC
TACTGTGGAAGCATTGGATAATGTAATCGTTGCCACATTAGGTAAATCCGGGTTCCAAAG
ATTATTGGGTCCTGTTGTGGAGGTATTGAAAGAACAAGACCCTACAAAGAGTCAAGACCC
AACTGCTGGTCATTAAGTGTACAATAAGTAGTTGTTTATTATCTTATATTGTTTTATGTT
AGTATATTCTATCTTTTTTTTTTTGGCTTACTCACCTTCTGGTGTTTTCGTTGCGATTTT
GATAATGGATGGTTGGTGCAAAAGTTCAACTACATTTCTTGTTGTCAGGTATATACGAGA
TGGCAGCATGAACGAGCTCACCATGGGTTGAACATTATTGAAGTTATCCGGCCGTGCCTT
TTGCGAAACATGGTAACTAATATATTGCAAACTTGGCTTCTACAGAAAATATACAATCTA
ATACCTTGAGGAATTTCCTCTATATATAATAGAGAATTC

It looks like a lot of the header information is all stuck on 
that first line. Looking at it more carefully it looks like a 
valid FASTA file, but is this really desired behavior?

Thanks for the help,

Tex Thompson
RIT Bioinformatics
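
A rough sketch of why the header may end up on the FASTA description line. It assumes the GCG reader treats every line above the divider line (the one ending in "..") as free-text description and everything below it as sequence. The helper split_gcg and the abbreviated toy record are hypothetical and are not taken from Bio::SeqIO.

```perl
use strict;
use warnings;

# Split GCG-style text at the first line that ends in "..".
# Returns (header text, divider line, bare sequence).
# split_gcg is a hypothetical helper, not part of BioPerl.
sub split_gcg {
    my ($text) = @_;
    my (@header, @seq, $divider);
    for my $line (split /\n/, $text) {
        if (!defined $divider) {
            if ($line =~ /\.\.\s*$/) { $divider = $line }
            else                     { push @header, $line }
        }
        else {
            (my $bases = $line) =~ s/[\s\d]//g;   # drop coordinates and spacing
            push @seq, $bases if length $bases;
        }
    }
    # Header lines are joined without a separator, which mimics the
    # run-together description seen in the output above.
    return (join('', @header), $divider, join('', @seq));
}

# Abbreviated toy record (assumption: not the full file from this thread).
my $record = join "\n",
    '!!NA_SEQUENCE 1.0',
    'LOCUS       AF317472                2679 bp    DNA',
    'ORIGIN',
    'AF317472  Length: 2679  Type: N  Check: 9369  ..',
    '',
    '       1  GAATTCAAAA AATCAAAAAA',
    '';

my ($header, $divider, $seq) = split_gcg($record);
print ">AF317472 $header\n$seq\n";
```

If that assumption holds, the long first line is the expected result of feeding a GenBank-style header through a GCG reader, and a shorter FASTA header would need the description trimmed before writing.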

On Mon, 16 Feb 2004, Hilmar Lapp wrote:

> Rule #1: If your code doesn't work the way you think it should, or 
> fails with an exception, and you do want help from the mailing list, 
> then be sure to send along the *complete* output, in particular the 
> stack trace if there was any.
> 
> Rule #2: Double check that you followed rule #1.
> 
> Rule #3: Check again that you followed rule #1.
> 
> There really aren't any other rules here. If you choose not to follow 
> rule #1 you indicate that you're not actually interested in getting 
> help.
> 
> 	-hilmar
> 
> On Monday, February 16, 2004, at 02:49  PM, Tex Thompson wrote:
> 
> > Hello Mailing List,
> >
> > I have a user complaining that the following code isn't working on his
> > GCG-formatted sequence files:
> >
> > #!/usr/bin/perl
> >
> > use strict;
> >
> > use Bio::SeqIO;
> > my $io  = Bio::SeqIO->new( -file => "af317472.gbpln3", -format => "gcg");
> > my $out = Bio::SeqIO->new( -fh => \*STDOUT, -format => "fasta" );
> >
> > while ( my $seq = $io->next_seq ) {
> >    $out->write_seq( $seq );
> > }
> >
> > Here's an example sequence file:
> >
> > !!NA_SEQUENCE 1.0
> > LOCUS       AF317472                2679 bp    DNA     linear   PLN 07-DEC-2000
> > DEFINITION  Candida albicans cAMP-dependent protein kinase regulatory subunit
> >             (PKA-R) gene, complete cds.
> > ACCESSION   AF317472
> > VERSION     AF317472.1  GI:11596392
> > KEYWORDS    .
> > SOURCE      Candida albicans
> >   ORGANISM  Candida albicans
> >             Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;
> >             Saccharomycetales; mitosporic Saccharomycetales; Candida.
> > REFERENCE   1  (bases 1 to 2679)
> >   AUTHORS   Giasson,L. and Parrot,M.
> >   TITLE     Sequence of the Candida albicans cAMP-dependent protein kinase
> >             regulatory subunit
> >   JOURNAL   Unpublished
> > REFERENCE   2  (bases 1 to 2679)
> >   AUTHORS   Giasson,L. and Parrot,M.
> >   TITLE     Direct Submission
> >   JOURNAL   Submitted (27-OCT-2000) School of Dentistry, Laval University,
> >             GREB, Ste-Foy, Quebec G1K 7P4, Canada
> > FEATURES             Location/Qualifiers
> >      source          1. .2679
> >                      /organism="Candida albicans"
> >                      /mol_type="genomic DNA"
> >                      /strain="CAI4"
> >                      /db_xref="taxon:5476"
> >      gene            <977. .>2356
> >                      /gene="PKA-R"
> >      mRNA            <977. .>2356
> >                      /gene="PKA-R"
> >                      /product="cAMP-dependent protein kinase regulatory
> >                      subunit"
> >      CDS             977. .2356
> >                      /gene="PKA-R"
> >                      /codon_start=1
> >                      /transl_table=12
> >                      /product="cAMP-dependent protein kinase regulatory
> >                      subunit"
> >                      /protein_id="AAG38599.1"
> >                      /db_xref="GI:11596393"
> >                      /translation="MSNPQQQFISDELSQLQKEIISKNPQDVLQFCANYFNTKLQAQR
> >                      SELWSQQAKAEAAGIDLFPSVDHVNVNSSGVSIVNDRQPSFKSPFGVNDPHSNHDEDP
> >                      HAKDTKTDTAAAAVGGGIFKSNFDVKKSASNPPTKEVDPDDPSKPSSSSQPNQQSASA
> >                      SSKTPSSKIPVAFNANRRTSVSAEALNPAKLKLDSWKPPVNNLSITEEETLANNLKNN
> >                      FLFKQLDANSKKTVIAALQQKSFAKDTVIIQQGDEGDFFYIIETGTVDFYVNDAKVSS
> >                      SSEGSSFGELALMYNSPRAATAVAATDVVCWALDRLTFRRILLEGTFNKRLMYEDFLK
> >                      DIEVLKSLSDHARSKLADALSTEMYHKGDKIVTEGEQGENFYLIESGNCQVYNEKLGN
> >                      IKQLTKGDYFGELALIKDLPRQATVEALDNVIVATLGKSGFQRLLGPVVEVLKEQDPT
> >                      KSQDPTAGH"
> > ORIGIN
> >
> > AF317472  Length: 2679  February 16, 2004 17:02  Type: N  Check: 9369  ..
> >
> >        1  GAATTCAAAA AATCAAAAAA ATCAAAAAAA AACCGTGGAA GGTAAGTTGT
> >
> >       51  ATATTTATAA ATCAACGTGA ATAATTTTCA ACACTGTGTC AACATCTGTG
> >
> >      101  AAAAAAACCT GTGTGTACTG CATATAGGAC CTCACCTATT ACGTAGAATA
> >
> >      151  TACTAGAAAT AGTTACAACC ATAAAAAGAT TAATTGTGCT TACGTGGCAA
> >
> >      201  CTTTGAGATT TTTCTTTTTT CTGTTTCTTT CTTTCTTTTT TTGGCTTAAA
> >
> >      251  CAACAAATGT CGCAAATTAT ACAAACGACA TTTGCTGCCC ATGTCATTTT
> >
> >      301  GTCGTTATCA CGTGAAGTGT CGCAGATTTA TGTATTCTCA CTTCATTTCT
> >
> >      351  ATGGTCATCA ATTGTTCATT CATTCTCTAT CTTCAAAAAT CTGTGATTTG
> >
> >      401  ATGATTTTGA TTAAAAGAAA GCAAAGAGAA TACTGAAAAA AAGCAAAGAG
> >
> >      451  AATATAGAAA AGAAACAATA AAAGAATAGT TTCTAAGTTA CTTTGGAGTC
> >
> >      501  TGCTATTACC ATGTATCTAT GTGATTGCCC TATCAAATTG GACAATACGG
> >
> >      551  GTTTTTGTTT AGTCACGATA ATCACAAACT TCCCCCAGCA ATGACATACG
> >
> >      601  TAGCAAGTAA TATTTATATC TCTTCTATTT TTTTGATCTT ACATAATCTG
> >
> >      651  TCGTGTTTTT TTAAGTTGTT GTTATGAAGA AGTAATTTCA TAATGATCAA
> >
> >      701  GTGTGTAACT GAAATTTCAT CGCAATTTTA AACAAACAAG CTAATAATTA
> >
> >      751  TTATTATTAA TAGTTAATTT GCTAAGTTGA GTAAAATTTG CTTTTCTTGA
> >
> >      801  GAAAAAGGAG AAATTACTTT GGGAGTGAGT TTGAAGAGAG AAACTAAAGT
> >
> >      851  AAGTAAATGA GTGAGAGGGA GAGACAGAGA GCGAGAGGGG GAGTAAAAAA
> >
> >      901  AAAAGTTGCC CACAAACAAA TTGTGATACC GGTCTTTTAG CATATATCTT
> >
> >      951  CTACTCTTCA ATCAACATCT TTACCAATGT CTAATCCTCA ACAACAATTC
> >
> >     1001  ATATCTGATG AATTGTCGCA GTTACAGAAA GAAATAATTT CCAAAAACCC
> >
> >     1051  GCAAGATGTC TTACAGTTTT GCGCCAACTA TTTCAACACC AAGTTACAAG
> >
> >     1101  CTCAAAGAAG TGAGTTATGG TCGCAACAAG CTAAAGCAGA AGCCGCAGGC
> >
> >     1151  ATCGACTTAT TCCCATCTGT TGATCATGTG AATGTTAATT CTAGTGGTGT
> >
> >     1201  GAGCATTGTG AATGATAGAC AACCAAGTTT TAAATCACCT TTTGGTGTTA
> >
> >     1251  ATGATCCACA TCTGAATCAC GACGAAGATC CCCATGCCAA AGATACCAAA
> >
> >     1301  ACAGATACTG CTGCTGCTGC TGTTGGTGGG GGTATTTTCA AATCAAATTT
> >
> >     1351  TGATGTTAAA AAGAGTGCTT CTAATCCTCC AACCAAGGAA GTAGATCCAG
> >
> >     1401  ATGACCCATC AAAACCATCG TCATCGAGCC AACCAAATCA ACAATCAGCA
> >
> >     1451  TCAGCATCAT CAAAAACGCC ATCATCAAAG ATCCCAGTTG CTTTCAACGC
> >
> >     1501  TAATAGAAGA ACATCTGTAT CTGCTGAAGC CTTGAATCCA GCAAAATTGA
> >
> >     1551  AATTAGATAG TTGGAAACCT CCAGTTAATA ATTTGAGCAT TACCGAAGAA
> >
> >     1601  GAAACATTAG CCAACAATTT AAAGAACAAT TTCCTTTTCA AACAATTGGA
> >
> >     1651  CGCAAACTCT AAGAAAACTG TGATTGCTGC TTTACAACAA AAATCATTTG
> >
> >     1701  CTAAAGATAC AGTAATTATC CAACAAGGTG ATGAAGGGGA CTTTTTTTAC
> >
> >     1751  ATTATTGAAA CTGGTACAGT TGATTTCTAT GTTAATGATG CTAAAGTAAG
> >
> >     1801  TTCCAGTAGC GAAGGGTCAT CTTTTGGGGA ATTGGCTTTG ATGTATAATT
> >
> >     1851  CACCAAGAGC TGCTACGGCA GTTGCTGCCA CCGATGTTGT CTGTTGGGCA
> >
> >     1901  TTGGACCGTT TGACATTCCG TCGAATTCTT TTGGAAGGTA CTTTTAACAA
> >
> >     1951  GAGATTGATG TACGAGGATT TCTTAAAAGA TATTGAGGTT TTGAAATCTC
> >
> >     2001  TTTCGGATCA TGCACGTTCA AAATTGGCAG ATGCATTGAG CACAGAAATG
> >
> >     2051  TATCACAAGG GTGATAAAAT AGTCACTGAA GGTGAACAAG GAGAGAACTT
> >
> >     2101  TTATTTAATA GAAAGTGGAA ACTGTCAAGT TTACAATGAA AAGTTGGGCA
> >
> >     2151  ATATCAAACA ATTAACAAAA GGTGATTATT TTGGTGAGCT TGCATTAATA
> >
> >     2201  AAAGACTTAC CAAGACAAGC TACTGTGGAA GCATTGGATA ATGTAATCGT
> >
> >     2251  TGCCACATTA GGTAAATCCG GGTTCCAAAG ATTATTGGGT CCTGTTGTGG
> >
> >     2301  AGGTATTGAA AGAACAAGAC CCTACAAAGA GTCAAGACCC AACTGCTGGT
> >
> >     2351  CATTAAGTGT ACAATAAGTA GTTGTTTATT ATCTTATATT GTTTTATGTT
> >
> >     2401  AGTATATTCT ATCTTTTTTT TTTTGGCTTA CTCACCTTCT GGTGTTTTCG
> >
> >     2451  TTGCGATTTT GATAATGGAT GGTTGGTGCA AAAGTTCAAC TACATTTCTT
> >
> >     2501  GTTGTCAGGT ATATACGAGA TGGCAGCATG AACGAGCTCA CCATGGGTTG
> >
> >     2551  AACATTATTG AAGTTATCCG GCCGTGCCTT TTGCGAAACA TGGTAACTAA
> >
> >     2601  TATATTGCAA ACTTGGCTTC TACAGAAAAT ATACAATCTA ATACCTTGAG
> >
> >     2651  GAATTTCCTC TATATATAAT AGAGAATTC
> >
> > I'm not a GCG expert, but is this a correctly formatted GCG file in the
> > first place? If not, is this an error in the SeqIO parser?  I've found
> > this behavior to be the same on Solaris 8 and on Linux, both running
> > BioPerl 1.4 and Perl 5.8.1.
> >
> > Thanks a bunch,
> >
> > Tex Thompson
> > RIT Bioinformatics
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> 

From Sebastien.Moretti at igs.cnrs-mrs.fr  Tue Feb 17 02:44:45 2004
From: Sebastien.Moretti at igs.cnrs-mrs.fr (Sebastien Moretti)
Date: Tue Feb 17 02:45:25 2004
Subject: [Bioperl-l] get sequence failure
In-Reply-To: <4030ECCE.8090608@cenix-bioscience.com>
References: <20040216154238.4435.qmail@web25209.mail.ukl.yahoo.com>
	<4030ECCE.8090608@cenix-bioscience.com>
Message-ID: <200402170844.45672.Sebastien.Moretti@igs.cnrs-mrs.fr>

Hi,
I load
	use Bio::DB::GenBank;
	use Bio::DB::Query::GenBank;
	use Bio::SeqIO;
and I can retrieve RefSeq, nr and GenPept records by accession number.

> Hi,
>
> I think the problem could be that you are trying to retrieve a contig
> (NT number). I remember Bio::DB::GenBank used to have problems with
> those, but I'm not sure about now.
>
> And from reading the Bio::Perl::get_sequence POD, I see:
>
>          get_sequence
>
>          Title   : get_sequence
>          Usage   : $seq_object = get_sequence('swiss',"ROA1_HUMAN");
> ...
>
>          Args    : database type - one of swiss, embl, genbank or refseq
>                    identifier or accession number
>
>
> Are you sure 'nr' is a valid database type?
>
> Cheers,
>
> Andrew
>
> william ritchie wrote:
> > Hi,
> >
> > I m trying to retrieve a sequence with the following
> > code:
> >
> > use Bio::SearchIO;
> > use Bio::Perl;
> > use Bio::SeqIO;
> >
> > my $seq_object = get_sequence("nr","NT_039208.2");
> >
> > and I m getting this error:
> >
> > MSG: id does not exist
> > STACK Bio::DB::WebDBSeqI::get_Seq_by_id
> > /usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:155
> > STACK Bio::Perl::get_sequence
> > /usr/lib/perl5/site_perl/5.8.0/Bio/Perl.pm:513
> > STACK toplevel getseq.pl:9
> >
> > this ID does exist, using the ncbi interface I can
> > retrieve it!
> > Help!!!!Please!!

-- 
Sebastien MORETTI
CNRS - IGS
31 chemin Joseph Aiguier
13402 Marseille cedex 20, FRANCE
tel. +334 91 16 44 55 - +336 61 88 59 00
From Marc.Logghe at devgen.com  Tue Feb 17 03:03:00 2004
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Tue Feb 17 03:09:22 2004
Subject: [Bioperl-l] ace.pm
Message-ID: <BEE28BF86078B6429D6C780635718E21904B97@morelia.be.devgen.com>


> -----Original Message-----
> From: Wes Barris [mailto:wes.barris@csiro.au]
> Sent: dinsdag 17 februari 2004 0:19
> To: Bioperl Mailing List
> Subject: [Bioperl-l] ace.pm
> 
> 
> Hi,
> 
> ACE files generated by an application called tgicl have "CO"
> lines of the form:
> 
> CO CL15Contig2 794 4 0 U
> 
> This line is not parsed properly by the ace.pm bioperl module.
> Notice this line from Bio/Assembly/IO/ace.pm .
> 
>          (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found!
> 
A long time ago I adapted the code to handle tgicl (cap3) generated ACE files. I don't know, however, whether it made it into CVS. I'll have a look and send you the patch as soon as I have traced it.
HTH,
Marc

From pvh at egenetics.com  Tue Feb 17 03:45:16 2004
From: pvh at egenetics.com (Peter van Heusden)
Date: Tue Feb 17 03:52:09 2004
Subject: [Bioperl-l] ace.pm
In-Reply-To: <BEE28BF86078B6429D6C780635718E21904B97@morelia.be.devgen.com>
References: <BEE28BF86078B6429D6C780635718E21904B97@morelia.be.devgen.com>
Message-ID: <4031D49C.9060402@egenetics.com>

Marc Logghe wrote:

>  
>
>>-----Original Message-----
>>From: Wes Barris [mailto:wes.barris@csiro.au]
>>Sent: dinsdag 17 februari 2004 0:19
>>To: Bioperl Mailing List
>>Subject: [Bioperl-l] ace.pm
>>
>>
>>Hi,
>>
>>ACE files generated by an application called tgicl have "CO"
>>lines of the form:
>>
>>CO CL15Contig2 794 4 0 U
>>
>>This line is not parsed properly by the ace.pm bioperl module.
>>Notice this line from Bio/Assembly/IO/ace.pm .
>>
>>         (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found!
>>
>>    
>>
>A long time ago I adapted the code to handle tgicl (cap3) generated ACE files. I don't know, however, whether it made it into CVS. I'll have a look and send you the patch as soon as I have traced it.
>HTH,
>Marc
>  
>
Modern versions of phrap can generate two types of ACE files: I think 
the one with the CO instead of Contig is generated when you pass phrap 
the -old_ace command line parameter. The CO/Contig difference is only 
one of many, but I assume your code deals with the other changes as 
well. I know we had to handle these different formats in stackPACK.

Peter
From john.herbert at clinical-pharmacology.oxford.ac.uk  Tue Feb 17 04:07:44 2004
From: john.herbert at clinical-pharmacology.oxford.ac.uk (john herbert)
Date: Tue Feb 17 04:13:40 2004
Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer
	TM
Message-ID: <s031d9ea.083@gwmail.jr2.ox.ac.uk>

Hello Rob,
Thanks for the reply. I agree about the Primer3 documentation; it had me
puzzled for a while before I sussed it.

At the moment I don't have the code but I guess the easiest way could
be to look at the source code of Primer3 and see what the C code looks
like (I am assuming it is C) for calculating the Primer TM. Then try to
mimic it in Perl. If I get time soon, I will have a look. Alternatively
we could look at the references.

If there are TM Gurus  out there who have a better suggestion, please
mail us. 

Kind regards,

John.

>>> Rob Edwards <redwards@utmem.edu> 17/02/2004 03:40:40 >>>
I wrote this implementation of the Bio::SeqFeature::Primer module, and
I am guilty as charged. I was confused by the primer3 documentation, 
but hey, at least I was honest in the docs - I said that I couldn't 
figure out what was going on.

I don't know what the formula is that Primer3 used. If you have some 
code for this I can update the module.

Rob


On Feb 12, 2004, at 5:50 AM, john herbert wrote:

> Hello BioPerl.
> I was thinking of using BioPerl to calculate the Tm of my primers. I
> looked from the documentation and found the method for getting a
> primer's Tm. my $tm = $primer->Tm;
>
> In the notes of this method, the author is confused by the fact that a
> BioPerl calculated Tm never matches that of Primer3's prediction. This
> is because Primer3 uses a different method to calculate a Primers TM
> than the method used in the BioPerl.
>
>  BioPerl $primer->Tm = Calculated using: Tm = 81.5 + 16.6(log10([Na+]))
> + .41*(%GC) - 600/length
> (Sambrook, Fritsch and Maniatis, Molecular Cloning, p 11.46 (1989, CSHL
> Press).
>
> Primer3 primer TM = Primer3 uses the oligo melting temperature formula
> given in Rychlik, Spencer and Rhoads, Nucleic Acids Research, vol 18,
> num 12, pp 6409-6412 and Breslauer, Frank, Bloeker and Marky, Proc.
> Natl. Acad. Sci. USA, vol 83, pp 3746-3750. This method uses
> thermodynamics to calc TM.
>
> Primer3 does use the Sambrook method to predict the TM of the PCR
> products but not the Primer TM.
>
> Would it be possible to update the BioPerl to match how Primer3
> calculates the Primer TM?
>
> Kind regards,
>
> John Herbert
>
> Cancer Research UK.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org 
> http://portal.open-bio.org/mailman/listinfo/bioperl-l 
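
A minimal sketch of the Sambrook formula quoted above, in plain Perl. It is only the salt-adjusted formula from the quoted notes, not the thermodynamic (nearest-neighbour) method that Primer3 is said to use. The function name sambrook_tm and the default Na+ concentration of 0.05 mol/l are assumptions for illustration and are not taken from Bio::SeqFeature::Primer.

```perl
use strict;
use warnings;

# Salt-adjusted Tm as quoted in the thread:
#   Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 600/length
# $na_molar is the Na+ concentration in mol/l (0.05 is an assumed default).
# sambrook_tm is a hypothetical name, not a BioPerl method.
sub sambrook_tm {
    my ($seq, $na_molar) = @_;
    $na_molar = 0.05 unless defined $na_molar;
    my $len = length $seq;
    die "empty sequence\n" unless $len;
    my $gc       = ($seq =~ tr/GCgc//);        # tr returns the number of G/C bases
    my $pct_gc   = 100 * $gc / $len;
    my $log10_na = log($na_molar) / log(10);   # Perl's log() is the natural log
    return 81.5 + 16.6 * $log10_na + 0.41 * $pct_gc - 600 / $len;
}

printf "%.1f\n", sambrook_tm("ATGCATGCATGCATGCATGC");
```

Because the two methods differ in kind, a value from this formula should not be expected to agree with the primer Tm that Primer3 reports.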

From Marc.Logghe at devgen.com  Tue Feb 17 04:36:58 2004
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Tue Feb 17 13:48:27 2004
Subject: [Bioperl-l] ace.pm
Message-ID: <BEE28BF86078B6429D6C780635718E21904B98@morelia.be.devgen.com>

The thread (including patch) can be found here:
http://bioperl.org/pipermail/bioperl-l/2002-December/010677.html
The patch should still work; the CVS version of the module has not changed in bioperl.
HTH,
Marc


> -----Original Message-----
> From: Wes Barris [mailto:wes.barris@csiro.au]
> Sent: dinsdag 17 februari 2004 4:36
> To: Jason Stajich
> Cc: Bioperl Mailing List
> Subject: Re: [Bioperl-l] ace.pm
> 
> 
> Jason Stajich wrote:
> 
> > 
> > People write code and modules to support the work they are doing,
> > sometimes for a specific data set - so I suspect Robson wrote this to
> > support phrap ace format which has a convention of them being ContigXX.
> > 
> > You are welcome to make changes to code on your local system to get it
> > working and then post the diffs so they can be incorporated back in.  Why
> > not try changing the code as you have noticed and seeing if it works.  It
> > is a collaborative project and these modules are newish, so give a try
> > fixing things and then getting feedback on your fixes.
> 
> I have modified one line in Bio/Assembly/IO/ace.pm as shown below:
> 
>          # Loading contig sequence (COntig sequence field)
> #       (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found!
>          (/^CO (\w+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found!
> 
> The change will cause the contigID to be whatever the second field of
> this line is (CO CL15Contig1 794 4 0 U).  In this case, it would be
> set to "CL15Contig1".
> 
> 
> > 
> > -jason
> > 
> > On Tue, 17 Feb 2004, Wes Barris wrote:
> > 
> >  > Hi,
> >  >
> >  > ACE files generated by an application called tgicl have "CO"
> >  > lines of the form:
> >  >
> >  > CO CL15Contig2 794 4 0 U
> >  >
> >  > This line is not parsed properly by the ace.pm bioperl module.
> >  > Notice this line from Bio/Assembly/IO/ace.pm .
> >  >
> >  >          (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found!
> >  >
> >  > Bioperl expects the second "word" in the line to be "Contig\d+" where
> >  > the number is used as the "contigID".  Is there a reason why
> >  > "contigID" must be a number?  Why can't it be the whole second
> >  > "word" of the "CO" line?
> >  >
> > 
> > -- 
> > Jason Stajich
> > Duke University
> > jason at cgt.mc.duke.edu
> > 
> 
> 
> -- 
> Wes Barris
> E-Mail: Wes.Barris@csiro.au
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> 
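
A small stand-alone sketch of the relaxed "CO" pattern discussed above, applied to both naming styles from this thread. The helper parse_co_line and its hash keys are hypothetical; only the regular expression is the one Wes posted, and this is not the code in Bio/Assembly/IO/ace.pm. The field meanings (name, bases, reads, base segments, U or C) are the usual reading of an ACE "CO" line and are stated here as an assumption.

```perl
use strict;
use warnings;

# Parse one ACE "CO" line:
#   CO <contig name> <bases> <reads> <base segments> <U|C>
# parse_co_line is a hypothetical helper; the pattern is the relaxed one
# proposed in the thread, which accepts any \w+ name as the contig ID.
sub parse_co_line {
    my ($line) = @_;
    return unless $line =~ /^CO (\w+) (\d+) (\d+) (\d+) (\w+)/;
    return {
        id         => $1,
        length     => $2,
        n_reads    => $3,
        n_segments => $4,
        strand     => $5,
    };
}

# "Contig7" is a made-up phrap-style name; the second line is from the thread.
for my $line ("CO Contig7 1200 12 3 U", "CO CL15Contig2 794 4 0 U") {
    my $co = parse_co_line($line) or next;
    print "$co->{id}\t$co->{length}\n";
}
```

With the original pattern, which hard-codes "Contig" directly after "CO ", the tgicl-style line would not match at all, so the contig would be skipped silently.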

From Jan.Aerts at wur.nl  Tue Feb 17 09:42:03 2004
From: Jan.Aerts at wur.nl (Aerts, Jan)
Date: Tue Feb 17 13:52:10 2004
Subject: [Bioperl-l] get_Seq_by_id: CONTIG found
Message-ID: <7D030487F1A3D143A76F2A1E91F57035EF3D2E@scomp0010>

Hi all,

I'm trying to download a bunch of sequences from GenBank using the ID and get_Seq_by_id (see script below). This method works great, except when it hits a sequence that in fact is a scaffold (e.g. http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=38322681). The message I get is:
<BEGIN MESSAGE>
-------------------- WARNING ---------------------
MSG: CONTIG found. GenBank get_Stream_by_acc about to run.
---------------------------------------------------
Warning: unable to close filehandle FETCH properly.
<END MESSAGE>

Is there a way to test whether the ID refers to a contig instead of a 'regular' sequence?

Thanks a lot,
Jan Aerts

<BEGIN EXAMPLE CODE>
use Bio::DB::GenBank;

my @ids = (38524490,31745019,38322681);
my $db = Bio::DB::GenBank->new();
foreach ( @ids ) {
  my $seq = $db->get_Seq_by_id($_);
  print ">", $seq->accession, '|', $seq->description, '|', $seq->keywords, "\n";
}
<END EXAMPLE CODE>

From chapmanb at uga.edu  Tue Feb 17 13:44:53 2004
From: chapmanb at uga.edu (Brad Chapman)
Date: Tue Feb 17 13:57:50 2004
Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM
In-Reply-To: <Pine.LNX.4.50.0402170745450.31848-100000@tenero.duhs.duke.edu>
References: <s031d9ea.083@gwmail.jr2.ox.ac.uk>
	<Pine.LNX.4.50.0402170745450.31848-100000@tenero.duhs.duke.edu>
Message-ID: <20040217184453.GH42800@evostick.agtec.uga.edu>

Hey all;

John:
> > At the moment I don't have the code but I guess the easiest way could
> > be to look at the source code of Primer3 and see what the C code looks
> > like (I am assuming it is C) for calculating the Primer TM. 

Jason:
> I believe biopython guys have dealt with this recently -- brad - can you
> point us to working code/documentation for Tm calculation?

Yes, Sebastian Bassi was working on this code for a while. He did
come up with a version, but I believe it turned out not to use the
most current parameters. I don't think he has finished the new version
yet. But there has been plenty of talk about it; here are the
relevant mails:

Sebastian's original message:
http://www.biopython.org/pipermail/biopython/2003-November/001701.html

Implementations (for DNA and RNA):
http://www.biopython.org/pipermail/biopython/2003-November/001742.html
http://www.biopython.org/pipermail/biopython/2003-November/001745.html

Mail from Peter Slickers with the information about the most current
accepted parameter sets:
http://www.biopython.org/pipermail/biopython/2003-November/001747.html

Hope this helps you guys -- and reminds Sebastian we'd still
eventually like to see this in Biopython :-).
Brad
From jason at cgt.duhs.duke.edu  Tue Feb 17 07:46:25 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Tue Feb 17 14:09:56 2004
Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer TM
In-Reply-To: <s031d9ea.083@gwmail.jr2.ox.ac.uk>
References: <s031d9ea.083@gwmail.jr2.ox.ac.uk>
Message-ID: <Pine.LNX.4.50.0402170745450.31848-100000@tenero.duhs.duke.edu>

I believe biopython guys have dealt with this recently -- brad - can you
point us to working code/documentation for Tm calculation?

-jason
On Tue, 17 Feb 2004, john herbert wrote:

> Hello Rob,
> Thanks for the reply. I agree about Primer3 documentation, had me
> puzzled for a while before I sussed it.
>
> At the moment I don't have the code but I guess the easiest way could
> be to look at the source code of Primer3 and see what the C code looks
> like (I am assuming it is C) for calculating the Primer TM. Then try to
> mimic it in Perl. If I get time soon, I will have a look. Alternatively
> we could look at the references.
>
> If there are TM Gurus  out there who have a better suggestion, please
> mail us.
>
> Kind regards,
>
> John.
>
> >>> Rob Edwards <redwards@utmem.edu> 17/02/2004 03:40:40 >>>
> I wrote this implementation of the Bio::SeqFeature::Primer module, and
>
> I am guilty as charged. I was confused by the primer3 documentation,
> but hey, at least I was honest in the docs - I said that I couldn't
> figure out what was going on.
>
> I don't know what the formula is that Primer3 used. If you have some
> code for this I can update the module.
>
> Rob
>
>
>
>
> On Feb 12, 2004, at 5:50 AM, john herbert wrote:
>
> > Hello BioPerl.
> > I was thinking of using BioPerl to calculate the Tm of my primers. I
> > looked from the documentation and found the method for getting a
> > primer's Tm. my $tm = $primer->Tm;
> >
> > In the notes of this method, the author is confused by the fact that
> a
> > BioPerl calculated Tm never matches that of Primer3's prediction.
> This
> > is because Primer3 uses a different method to calculate a Primers TM
> > than the method used in the BioPerl.
> >
> >  BioPerl $primer->Tm = Calculated using: Tm = 81.5 +
> 16.6(log10([Na+]))
> > + .41*(%GC) - 600/length
> > (Sambrook, Fritsch and Maniatis, Molecular Cloning, p 11.46 (1989,
> CSHL
> > Press).
> >
> > Primer3 primer TM = Primer3 uses the oligo melting temperature
> formula
> > given in Rychlik, Spencer and Rhoads, Nucleic Acids Research, vol
> 18,
> > num 12, pp 6409-6412 and Breslauer, Frank, Bloeker and Marky, Proc.
> > Natl. Acad. Sci. USA, vol 83, pp 3746-3750. This method uses
> > thermodynamics to calc TM.
> >
> > Primer3 does use the Sambrook method to predict the TM of the PCR
> > products but not the Primer TM.
> >
> > Would it be possible to update the BioPerl to match how Primer3
> > calculates the Primer TM?
> >
> > Kind regards,
> >
> > John Herbert
> >
> > Cancer Research UK.
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From Annie.Law at nrc-cnrc.gc.ca  Tue Feb 17 14:42:41 2004
From: Annie.Law at nrc-cnrc.gc.ca (Law, Annie)
Date: Tue Feb 17 14:49:03 2004
Subject: [Bioperl-l] New GO Parser and errors loading biosql database
Message-ID: <10C94843061E094A98C02EB77CFC328722FE05@nrcmrdex1d.imsb.nrc.ca>

Hi,

I would appreciate help with the following.
I have installed the newest bioperl-db and biosql schema from cvs.
I tried to load the database with information from godatabase.org and got
some errors, listed further below (the tables did not fill at all).
Next I tried to load the database with LocusLink data from NCBI.

1) I got the LL file from NCBI and tried to load an empty database with an
LL_tmpl file (for human). It seemed to load properly and the tables were
filling up, but then it stopped after about 900 bioentries.
I'm not sure what went wrong.  There seems to be a complaint about a
duplicate entry, but I don't think I should modify the source file.

[root@ data]# perl /root/bioperl-db/scripts/biosql/load_seqdatabase.pl
--dbuser=root 
 --dbpass=mss22 --dbname bioseqdb --namespace "LocusLink" -format locuslink
/var/lib/mysql/LL_
_tmpl --dbpass=bioinf1 --dbname bioseqdb --namespace "LocusLink" -format
locuslink /var/lib/mysql/LL_tmpl
Loading /var/lib/mysql/LL_tmpl ...

-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were
("GO:0005699","kinetochore","","") FKs (6)
Duplicate entry 'kinetochore-6' for key 2
---------------------------------------------------
Could not store 1063: 
------------- EXCEPTION  -------------
MSG: create: object (Bio::Annotation::OntologyTerm) failed to insert or to
be found by unique key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253
STACK Bio::DB::Persistent::PersistentObject::store
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:270
STACK Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:
219
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:215
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253
STACK Bio::DB::Persistent::PersistentObject::store
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:270
STACK Bio::DB::BioSQL::SeqAdaptor::store_children
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/SeqAdaptor.pm:226
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:215
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253
STACK Bio::DB::Persistent::PersistentObject::store
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:270
STACK (eval) /root/bioperl-db/scripts/biosql/load_seqdatabase.pl:517
STACK toplevel /root/bioperl-db/scripts/biosql/load_seqdatabase.pl:500

--------------------------------------


2) Updating GO parser
I saw that the GO parser was updated recently. I located the code,
version 1.17.2.1, and downloaded it.  I am using bioperl 1.4.
Should I just take the new dagflat.pm and replace the old one, or are there
more steps involved?  When I download whole modules I need to use make
commands.

Also, I saw that dagflat.pm requires graph.pm.  Is this graph.pm part of the
bioperl 1.4 package (I couldn't seem to find it), or do I need to download and
install it from CPAN?  I searched CPAN for graph.pm and got several hits.  Is
this the one I need? http://search.cpan.org/~mverb/GDGraph-1.43/
Do I also need GD.pm? I think I saw somewhere that it is required,
http://search.cpan.org/~lds/GD-2.11/GD.pm,
although this could be a mistake.

Where is the best place to install GD and graph.pm (with dagflat.pm or the
main perl library)?   
I'm not sure whether the main perl library is /usr/lib/perl5/5.8.0 or
/usr/lib/perl5/site_perl/5.8.0/Bio


3) I installed Bioperl-db and downloaded the biosql schema successfully, but
when I tried to use load_ontology.pl I got some errors which seem to be
saying that I am missing some main modules such as goflat (I recorded a
script of the output). But I have goflat.pm.  
Am I calling the perl script incorrectly?  Or are there still some modules I
need to install? I'm not sure that I am using the correct
syntax for the format field.

Thanks very much,
Annie.

[root@ data]# perl /root/bioperl-db/scripts/biosql/load_ontology.pl
--dbuser=root --d
dbpass=mss22 --dbname bioseqdb --noobsolete --namespace "Gene Ontology"
--fmtargs "-defs_file,/root/Go.defs" 
--format goflat ./component.ontology ./process.ontology ./function.ontology

Bio::OntologyIO: goflat cannot be found
Exception 
------------- EXCEPTION  -------------
MSG: Failed to load module Bio::OntologyIO::goflat. Can't locate
Graph/Directed.pm in @INC (@INC contains:
/usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0
/usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl
/usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 .) at
/usr/lib/perl5/site_perl/5.8.0/Bio/Ontology/SimpleGOEngine.pm line 95.
BEGIN failed--compilation aborted at
/usr/lib/perl5/site_perl/5.8.0/Bio/Ontology/SimpleGOEngine.pm line 95.
Compilation failed in require at
/usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO/dagflat.pm line 105.
BEGIN failed--compilation aborted at
/usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO/dagflat.pm line 105.
Compilation failed in require at
/usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO/goflat.pm line 105.
BEGIN failed--compilation aborted at
/usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO/goflat.pm line 105.
Compilation failed in require at
/usr/lib/perl5/site_perl/5.8.0/Bio/Root/Root.pm line 394.

STACK Bio::Root::Root::_load_module
/usr/lib/perl5/site_perl/5.8.0/Bio/Root/Root.pm:396
STACK (eval) /usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO.pm:255
STACK Bio::OntologyIO::_load_format_module
/usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO.pm:254
STACK Bio::OntologyIO::new
/usr/lib/perl5/site_perl/5.8.0/Bio/OntologyIO.pm:165
STACK toplevel /root/bioperl-db/scripts/biosql/load_ontology.pl:449

--------------------------------------

For more information about the OntologyIO system please see the docs.
This includes ways of checking for formats at compile time, not run time
Parsing input ...
Can't call method "next_ontology" on an undefined value at
/root/bioperl-db/scripts/biosql/load_ontology.pl line 455.
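
The trace above stops at "Can't locate Graph/Directed.pm in @INC", so the
file that could not be loaded is Graph/Directed.pm rather than goflat.pm
itself. That file belongs to the CPAN Graph distribution (module
Graph::Directed), which is separate from GDGraph and GD. A minimal sketch for
checking whether a module is visible to the perl that runs the loader; the
helper name module_available is made up for this illustration:

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded from the current @INC, else 0.
# (module_available is a hypothetical helper, not part of bioperl.)
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Graph::Directed -> Graph/Directed.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Module names taken from the error trace above.
for my $module (qw(Graph::Directed Bio::Ontology::SimpleGOEngine)) {
    printf "%-32s %s\n", $module,
        module_available($module) ? 'found' : 'NOT found in @INC';
}

# Show where this perl looks for modules.
print "Search path:\n";
print "  $_\n" for @INC;
```

Run it with the same perl binary (and as the same user) as load_ontology.pl,
since @INC can differ between installations.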
From jason at cgt.duhs.duke.edu  Tue Feb 17 16:03:27 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Tue Feb 17 16:09:45 2004
Subject: [BioPython] Re: [Bioperl-l] Bio::SeqFeature::Primer Calculating
	the Primer TM (fwd)
Message-ID: <Pine.LNX.4.50.0402171602460.1173-100000@tenero.duhs.duke.edu>

Sebastian's comments cross-posted.

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

---------- Forwarded message ----------
Date: Tue, 17 Feb 2004 18:10:43 -0300
From: Sebastian Bassi <sbassi@asalup.org>
To: Brad Chapman <chapmanb@uga.edu>
Cc: biopython@biopython.org
Subject: Re: [BioPython] Re: [Bioperl-l] Bio::SeqFeature::Primer
    Calculating the Primer TM

Brad Chapman <chapmanb@uga.edu> wrote:
> Hope this helps you guys -- and reminds Sebastian we'd still
> eventually like to see this in Biopython :-).

I know. I am a little busy right now (have you heard of DNALinux? I'm working
on it and on some academic projects as well!).
But I do want to code the function. What is holding me back now (apart from
lack of time) is that I haven't found working source code for SantaLucia's
formulae. I tried to follow the paper but I couldn't. An implementation, even
in C or pseudocode, would help me. Another thing that would be useful is some
pre-made Tm calculations using the SantaLucia parameters, so that I can test
my code against them.
What I did was very close, since it worked like EMBOSS DAN, but SantaLucia is
the way to go.

Sebastian Bassi. PGP Key available.
_________________________________________________________


_______________________________________________
BioPython mailing list  -  BioPython@biopython.org
http://biopython.org/mailman/listinfo/biopython
From pm66 at nyu.edu  Tue Feb 17 15:18:23 2004
From: pm66 at nyu.edu (Philip MacMenamin)
Date: Tue Feb 17 16:58:21 2004
Subject: [Bioperl-l] Bio::DB::GFF::Aggregator problem, new wormbase models.
Message-ID: <200402172152.i1HLq31G015646@mx1.nyu.edu>

I cannot get things to aggregate properly in the new wormbase models. 
Previously this worked:
$aggregator = Bio::DB::GFF::Aggregator->new(-method    => 'test',
					       -sub_parts => ['UTR','CDS:curated']
					      );
(Or I could have more simply used the 'processed_transcript' aggregator that 
Lincoln wrote.)

And one feature, with the UTRs "aggregated" together with the curated CDS's, 
was returned:
test:5_UTR(AH6.5), start->9524078 end->9526248
This could be fed to some kind of glyph, and would draw a nice picture of 
UTRs hanging off a coding seq.

[--------------]    [----------]  [---->
(You have to use your imagination a bit here)

With the new models of wormbase, this is not the case, so I am re-writing 
code to accommodate these changes. 

Now I am returned, for example, 3 features: a 5 prime UTR, a 3 prime UTR, and a CDS.
test:UTR(5_UTR:AH6.5)9524078 9524086
test:UTR(3_UTR:AH6.5)9525782 9526248
test:curated(AH6.5)9524087 9525781

The glyph then draws these things on two planes.
[]				[------->
 [---------]  [------------] [----------]

My assumption is that a single feature can be rendered as a single glyph. And 
if you have >1 feature, then >1 glyphs are needed. Therefore, I assume the 
problem stems from my failure to aggregate UTR and CDS features.

I have noticed that the old SQL GFF Db used to have two fmethods for UTRs, 
one each for the 3 and 5 primes. The new one has only one. There are other 
changes of course, but I think this is important.

So, essentially I want to draw UTRs with CDS's attached... with the new 
wormbase models. If anyone knows how to do this, i'd like to hear from them.

Thanks :)

-- 
Philip MacMenamin
From todd.harris at cshl.edu  Tue Feb 17 17:10:48 2004
From: todd.harris at cshl.edu (Todd Harris)
Date: Tue Feb 17 17:17:14 2004
Subject: [Bioperl-l] Bio::DB::GFF::Aggregator problem, new wormbase models.
In-Reply-To: <200402172152.i1HLq31G015646@mx1.nyu.edu>
Message-ID: <BC57ED88.BB42%todd.harris@cshl.org>

Hi Philip - 

You need to aggregate the separate parts of the CDS.  Create a wormbase_cds
(or whatever you wish to call it), aggregating the following features using
the CDS group: coding_exon,5_UTR,3_UTR.

The following stanza should do the trick.

use Bio::DB::GFF;

my $dbgff = Bio::DB::GFF->new(
          -adaptor     => 'dbi::mysql',
          -dsn         => 'dbi:mysql:database=your_database;host=your_host',
          # shorthand: name{sub_part,sub_part,.../main_method}
          -aggregators => ['wormbase_cds{coding_exon,5_UTR,3_UTR/CDS}'],
          -user        => 'your_username',
          -pass        => 'your_dbgff_pass');

This should do the trick for properly aggregating genes under the new
WormBase CDS class.
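
What aggregation does to the three features from the original question can be
sketched in plain Perl, without a database. This is only an illustration of
the idea (group sub-parts by their group name and take the enclosing span),
not the Bio::DB::GFF implementation. The aggregate() helper is made up, the
coordinates are the ones reported for AH6.5, and the method names assume the
coding_exon/5_UTR/3_UTR naming used in the stanza:

```perl
use strict;
use warnings;

# Sub-features as [method, group, start, end]; coordinates from the report above.
my @features = (
    [ '5_UTR',       'AH6.5', 9524078, 9524086 ],
    [ '3_UTR',       'AH6.5', 9525782, 9526248 ],
    [ 'coding_exon', 'AH6.5', 9524087, 9525781 ],
);

# Hypothetical helper: collect the wanted methods per group and
# track the smallest start and largest end seen for each group.
sub aggregate {
    my ($wanted, @feats) = @_;
    my %is_wanted = map { $_ => 1 } @$wanted;
    my %agg;
    for my $f (@feats) {
        my ($method, $group, $start, $end) = @$f;
        next unless $is_wanted{$method};
        my $span = $agg{$group} ||= { start => $start, end => $end, parts => [] };
        $span->{start} = $start if $start < $span->{start};
        $span->{end}   = $end   if $end   > $span->{end};
        push @{ $span->{parts} }, $f;
    }
    return \%agg;
}

my $agg = aggregate([qw(coding_exon 5_UTR 3_UTR)], @features);
for my $group (sort keys %$agg) {
    printf "wormbase_cds(%s) start->%d end->%d (%d parts)\n",
        $group, @{ $agg->{$group} }{qw(start end)},
        scalar @{ $agg->{$group}{parts} };
}
```

The single resulting feature spans 9524078 to 9526248, the same extent that
the old models returned as one aggregated feature.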

Todd Harris

> On 2/17/04 2:18 PM, Philip MacMenamin wrote:

> I cannot get things to aggregate properly in the new wormbase models.
> Previously this worked:
> $aggregator = Bio::DB::GFF::Aggregator->new(-method    => 'test',
>       -sub_parts => ['UTR','CDS:curated']
>      );
> (Or I could of more simply used the 'processed_transcript' aggregator that
> Linoln wrote.)
> 
> And one feature, with the UTRs "aggregated" together with the curated CDS's,
> was returned:
> test:5_UTR(AH6.5), start->9524078 end->9526248
> This could be fed to some kind of glyph, and would draw a nice picture of
> UTRs hanging off a coding seq.
> 
> [--------------]    [----------]  [---->
> (You have to use your imaginatio a bit here)
> 
> With the new models of wormbase, this is not the case, so I am re-writing
> code to accomadate these changes.
> 
> Now I am returned for example 3 features: a 3 prime UTR, a prime, and a CDS.
> test:UTR(5_UTR:AH6.5)9524078 9524086
> test:UTR(3_UTR:AH6.5)9525782 9526248
> test:curated(AH6.5)9524087 9525781
> 
> The glyph then draws these things on two planes.
> []                [------->
> [---------]  [------------] [----------]
> 
> My assumption is that a single feature can be rendered as a single glyph. And
> if you have >1 feature, then >1 glyphs are needed. Therefore, I assume the
> problem stems from my failure to aggregate UTR and CDS features.
> 
> I have noticed that the new SQL GFF Db used to have two fmethods for UTRs,
> one for both 3 and 5 primes. The new one has only one. There are other
> changes of course, but I think this is important.
> 
> So, essentially I want to draw UTRs with CDS's attached... with the new
> wormbase models. If anyone knows how to do this, i'd like to hear from them.
> 
> Thanks :)

From wes.barris at csiro.au  Tue Feb 17 17:56:52 2004
From: wes.barris at csiro.au (Wes Barris)
Date: Tue Feb 17 18:03:21 2004
Subject: [Bioperl-l] ace.pm
In-Reply-To: <BEE28BF86078B6429D6C780635718E21904B98@morelia.be.devgen.com>
References: <BEE28BF86078B6429D6C780635718E21904B98@morelia.be.devgen.com>
Message-ID: <40329C34.9030203@csiro.au>

Marc Logghe wrote:

> The thread (including patch) can be found here:
> http://bioperl.org/pipermail/bioperl-l/2002-December/010677.html
> The patch should still work, cvs version of the package did not change 
> in bioperl.

Will these changes make their way into the bioperl distribution?

> HTH,
> Marc
> 
> 
>  > -----Original Message-----
>  > From: Wes Barris [mailto:wes.barris@csiro.au]
>  > Sent: dinsdag 17 februari 2004 4:36
>  > To: Jason Stajich
>  > Cc: Bioperl Mailing List
>  > Subject: Re: [Bioperl-l] ace.pm
>  >
>  >
>  > Jason Stajich wrote:
>  >
>  > >
>  > > People write code and modules to support the work they are doing,
>  > > sometimes for a specific data set - so I suspect Robson
>  > wrote this to
>  > > support phrap ace format which has a convention of them
>  > being ContigXX.
>  > >
>  > > You are welcome to make changes to code on your local
>  > system to get it
>  > > working and then post the diffs so they can be incorporated
>  > back in.  Why
>  > > not try changing the code as you have noticed and seeing if
>  > it works.  It
>  > > is a collaborative project and these modules are newish, so
>  > give a try
>  > > fixing things and then getting feedback on your fixes.
>  >
>  > I have modified one line in Bio/Assembly/IO/ace.pm as shown below:
>  >
>  >          # Loading contig sequence (COntig sequence field)
>  > #       (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { #
>  > New contig
>  > found!
>  >          (/^CO (\w+) (\d+) (\d+) (\d+) (\w+)/) && do { # New
>  > contig found!
>  >
>  > The change will cause the contigID to be whatever the second field of
>  > this line is (CO CL15Contig1 794 4 0 U).  In this case, it would be
>  > set to "CL15Contig1".
>  >
>  >
>  > >
>  > > -jason
>  > >
>  > > On Tue, 17 Feb 2004, Wes Barris wrote:
>  > >
>  > >  > Hi,
>  > >  >
>  > >  > ACE files generated by an application called tgicl have "CO"
>  > >  > lines of the form:
>  > >  >
>  > >  > CO CL15Contig2 794 4 0 U
>  > >  >
>  > >  > This line is not parsed properly by the ace.pm bioperl module.
>  > >  > Notice this line from Bio/Assembly/IO/ace.pm .
>  > >  >
>  > >  >          (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) &&
>  > do { # New
>  > >  > contig found!
>  > >  >
>  > >  > Bioperl expects the second "word" in the line to be
>  > "Contig\d+" where
>  > >  > the number is used as the "contigID".  Is there a reason why
>  > >  > "contigID" must be a number?  Why can't it be the whole second
>  > >  > "word" of the "CO" line?
>  > >  >
>  > >
>  > > --
>  > > Jason Stajich
>  > > Duke University
>  > > jason at cgt.mc.duke.edu
>  > >
>  >
>  >
>  > --
>  > Wes Barris
>  > E-Mail: Wes.Barris@csiro.au
>  >
>  > _______________________________________________
>  > Bioperl-l mailing list
>  > Bioperl-l@portal.open-bio.org
>  > http://portal.open-bio.org/mailman/listinfo/bioperl-l
>  >
> 


-- 
Wes Barris
E-Mail: Wes.Barris@csiro.au
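
The effect of the one-line change can be checked in isolation. The sketch
below applies the original and the modified pattern to the tgicl line quoted
above and to a phrap-style line; the phrap-style example line and the
contig_id() helper are made up for this comparison:

```perl
use strict;
use warnings;

my $old = qr/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/;    # original pattern
my $new = qr/^CO (\w+) (\d+) (\d+) (\d+) (\w+)/;          # modified pattern

# Return the contig ID captured by a pattern, or undef if the line does not match.
sub contig_id {
    my ($pattern, $line) = @_;
    return $line =~ $pattern ? $1 : undef;
}

# First line: hypothetical phrap-style name. Second line: from the tgicl report.
for my $line ('CO Contig2 794 4 0 U', 'CO CL15Contig1 794 4 0 U') {
    for my $case ([ old => $old ], [ new => $new ]) {
        my ($label, $pattern) = @$case;
        my $id = contig_id($pattern, $line);
        printf "%-26s %s: %s\n", $line, $label, defined $id ? $id : '(no match)';
    }
}
```

One side effect worth knowing about: for phrap-style names the captured ID
changes from the bare number ("2") to the whole word ("Contig2"), so any code
that relies on a numeric contigID would see a different value.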

From rrouse at biomail.ucsd.edu  Tue Feb 17 18:14:23 2004
From: rrouse at biomail.ucsd.edu (Richard Rouse)
Date: Tue Feb 17 18:20:34 2004
Subject: [Bioperl-l] searchio scripts
Message-ID: <ADEFIHIJAHBGLBNJCIADMEAECEAA.rrouse@biomail.ucsd.edu>

I recently installed bioperl-1.4 and am having problems with the blast report
parsers in /examples/searchio/


When I run:
perl hitwriter.pl blastreport
I get:

Using SearchIO->new()

0 Blast report(s) processed.
Output sent to file: >hitwriter.out

I get the same result with rawwriter.pl, hspwriter.pl and custom_writer.pl
although the htmlwriter.pl and the blast_example.pl work fine.

Has anyone else encountered this problem and figured out how to fix it?

Thanks,

Richard


From redwards at utmem.edu  Tue Feb 17 20:21:02 2004
From: redwards at utmem.edu (Rob Edwards)
Date: Tue Feb 17 20:27:28 2004
Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer	TM
In-Reply-To: <4032B065.2000908@genetics.utah.edu>
References: <s031d9ea.083@gwmail.jr2.ox.ac.uk>
	<4032B065.2000908@genetics.utah.edu>
Message-ID: <B7673D6E-61B0-11D8-8A23-000A959E1622@utmem.edu>

Barry,

Thanks for this code. I really appreciate having an alternate method 
for calculating this. When I wrote the module I just couldn't figure 
out why the numbers wouldn't match. It seems Tm calculations are not as 
straightforward as they should (?) be. I always use 60C for my PCR 
annealing step and it works 99% of the time :-)

I added your code to Bio::SeqFeature::Primer in CVS and updated 
t/Primer.pm too so that it passes.

I left the original calculation in the module, but renamed it 
Tm_estimate so that it can be there for comparative purposes.
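
For comparison, the original estimate is the salt-adjusted formula quoted
earlier in this thread, Tm = 81.5 + 16.6(log10([Na+])) + .41*(%GC) -
600/length. A standalone sketch of that formula follows; the function name
tm_estimate and the 0.2 M salt value are assumptions made for this example
and are not taken from the module:

```perl
use strict;
use warnings;

# Salt-adjusted estimate:
#   Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 600/length
# tm_estimate is a hypothetical standalone function, not the module method.
sub tm_estimate {
    my ($sequence, $salt_conc) = @_;
    my $length = length $sequence;
    die "empty sequence\n" unless $length;
    my $gc         = () = $sequence =~ /[GC]/gi;    # count G and C bases
    my $percent_gc = 100 * $gc / $length;
    return 81.5
         + 16.6 * (log($salt_conc) / log(10))       # Perl's log() is natural log
         + 0.41 * $percent_gc
         - 600 / $length;
}

# 0.2 M Na+ is an assumed value for this example.
printf "%.2f\n", tm_estimate('ACCGATACCG', 0.2);
```

With 0.2 M Na+ this gives about 34.5 for ACCGATACCG, in line with the
"Primer.pm (Rob)" figure posted in this thread, which suggests (but does not
confirm) that the module was run with that salt concentration.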

Rob

From wes.barris at csiro.au  Tue Feb 17 20:56:04 2004
From: wes.barris at csiro.au (Wes Barris)
Date: Tue Feb 17 21:02:28 2004
Subject: [Bioperl-l] msf output
Message-ID: <4032C634.3000507@csiro.au>

Hi,

Msf files produced by StackPACK have coordinates listed above each
group of sequence data.  Bioperl msf files do not have this.  I have
tested a one-line addition that would fix this:

Bio/AlignIO/msf.pm:
         while( $count < $length ) {
             # there is another block to go!
+           $self->_print (sprintf("%22s%-27d%27d\n",' ',$count+1,$count+50));
             foreach $name  ( @arr ) {
                 $self->_print (sprintf("%-20s  ",$name));

If there is a more formal way of submitting suggestions, please let
me know.
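
To show what the added line prints, here is a standalone sketch of the same
sprintf call. It assumes 50 residues per block, as the $count+50 in the patch
implies. The coordinate_line() helper, the alignment length of 120, and the
clamping of the right-hand coordinate on the last block are additions for
this example and are not part of the patch:

```perl
use strict;
use warnings;

# Build the coordinate header for the block that starts after $count residues.
# coordinate_line is a hypothetical helper; the format string is the patch's.
sub coordinate_line {
    my ($count, $length, $width) = @_;
    my $last = $count + $width;
    $last = $length if $last > $length;    # clamp on the final block (not in the patch)
    return sprintf("%22s%-27d%27d\n", ' ', $count + 1, $last);
}

my $length = 120;                          # hypothetical alignment length
for (my $count = 0; $count < $length; $count += 50) {
    print coordinate_line($count, $length, 50);
}
```

Without the clamp, an alignment whose length is not a multiple of 50 would
show a right-hand coordinate past the end of the alignment on its last block.
Whether StackPACK itself clamps that value is not something I have checked.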
-- 
Wes Barris
E-Mail: Wes.Barris@csiro.au


From jason at cgt.duhs.duke.edu  Tue Feb 17 21:02:19 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Tue Feb 17 21:08:38 2004
Subject: [Bioperl-l] searchio scripts
In-Reply-To: <ADEFIHIJAHBGLBNJCIADMEAECEAA.rrouse@biomail.ucsd.edu>
References: <ADEFIHIJAHBGLBNJCIADMEAECEAA.rrouse@biomail.ucsd.edu>
Message-ID: <Pine.LNX.4.50.0402172059450.3929-100000@tenero.duhs.duke.edu>

presumably the report in blastreport has a hit which is better than the
signif cutoff of 0.1

my $in     = Bio::SearchIO->new( -format => 'blast',
                                 -signif => 0.1,
                                 -verbose=> 0 );


-jason
On Tue, 17 Feb 2004, Richard Rouse wrote:

> I recent installed bioperl-1.4 and am having problems with the blast report
> parsers in /examples/searchio/
>
>
> When I run:
> perl hitwriter.pl blastreport
> I get:
>
> Using SearchIO->new()
>
> 0 Blast report(s) processed.
> Output sent to file: >hitwriter.out
>
> I get the same result with rawwriter.pl, hspwriter.pl and custom_writer.pl
> although the htmlwriter.pl and the blast_example.pl work fine.
>
> Has anyone else encountered this problem and figured out how to fix it?
>
> Thanks,
>
> Richard
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From barry.moore at genetics.utah.edu  Tue Feb 17 19:23:01 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue Feb 17 21:12:11 2004
Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the Primer	TM
In-Reply-To: <s031d9ea.083@gwmail.jr2.ox.ac.uk>
References: <s031d9ea.083@gwmail.jr2.ox.ac.uk>
Message-ID: <4032B065.2000908@genetics.utah.edu>

John, Rob and others,

Well I certainly am not a Tm guru, but I'll reply nonetheless.  I have 
written a Tm calculator that uses thermodynamic parameters rather than 
"rule of thumb" calculations.  Mine follows the formula (and  
modifications) described by Integrated DNA Technologies on their web 
site 
(http://www.idtdna.com/program/techbulletins/Calculating_Tm_(melting_temperature).asp).
This code uses the equation Tm = dH / (dS + R * ln(C)) - 273.15, found 
in Breslauer (1986) and apparently used by GCG and the Python guys, where 
R is the molar gas constant and C is the molar concentration of the 
oligo.  It adds to that an adjustment for Na+ concentration as per 
SantaLucia (1996), with some additional tweaking as described on the IDT web 
page above, to give Tm = dH / (dS + R * ln(C)) - 273.15 + 12.0 * 
log10[Na+].  It uses the nearest-neighbor thermodynamic parameter set of 
Allawi and SantaLucia (1997), but it looks like maybe it should be 
updated to the SantaLucia (1998) parameter set.

I haven't read all the papers discussed in the various posts today, only 
the couple that my code is based on (and had plenty of trouble 
understanding all that was in those!), so I don't want to imply that the 
equations that I use are based on a thorough review of oligo 
thermodynamics literature, but the code seems to work, and it gives good 
Tm values. Theoretically I should get the exact same values as IDT's web 
calculator, but I don't.  My values are always very close, but off by a 
fraction to a couple degrees.  It may be due to a difference in 
parameter sets, although I'm using the same one that IDT references on 
their site.

Rob, I morphed my code into your Primer Tm method, and tried it out.  It 
seems to work fine.  It requires one extra parameter (oligo 
concentration) that I just defaulted.  If you want to use the code as is 
(or as a starting point) it is yours to do with as you see fit.  I can 
update the "thermodynamic parameters" hash to the SantaLucia (1998) 
values if this code looks promising and there is general agreement those 
values are the better.  I don't have CVS access, so I'll just post the 
modified method code at the very end of this message.

I did a quick and dirty test to see how Tm values differ between your 
Primer Tm method, my code, and IDT's web calculator.  They tend towards 
Tm(Rob) > Tm(IDT) > Tm(Barry).  Here's the result:

Oligo                                               Primer.pm (Rob)  Primer.pm (Barry)  IDT
ACCGATACCG                                          34.49709793      29.41129054        31.3
ACCCGATCTAGTAGA                                     49.03043126      41.9210458         43.3
CATGGAGAGGGTGCAAATCC                                62.44709793      55.72210633        56.8
AAAGTAACCGAGAGAATCTGGAACA                           62.29709793      56.7940798         57.7
GGCTTTTGAAGTGGCAGAAAGACTGGGGGT                      71.76376459      67.19994018        68
CACTCGCCTGCTGGATGCAGAAGATGTGGATGTGC                 76.18281221      70.5586708         71.2
CTCTCCAGATGAAAAGTCTGTAATCACTTATGTGTCTTCG            71.29709793      63.54486627        64.1
ATTTATGATGCCTTCCCTAAAGTTCCTGAGGGTGGAGAAGGGATC       75.69709793      69.5961597         70.1
AGTGCTACGGAAGTGGACTCCAGGTGGCAAGAATACCAAAGCCGAGTGGA  80.03709793      75.41693338        75.9


Here are the papers I've referenced:

    * Breslauer, K.J., Frank, R., Blocker, H., Marky, L.A. (1986)
      "Predicting DNA duplex stability from the base sequence."
      Proc.Natl. Acad. Sci. USA 83:3746-3750
    * SantaLucia, Jr., J.S, Allawi, H.T., Seneviratne, P.A. (1996)
      "Improved nearest-neighbor parameters for predicting DNA duplex
      stability" Biochemistry 35:3555-3562.
    * Allawi, H.T., SantaLucia, J. Jr. (1997) "Thermodynamics and NMR of
      internal G.T mismatches in DNA." Biochemistry 36: 10581-10594
    * SantaLucia J. Jr. (1998) "A unified view of polymer, dumbbell, and
      oligonucleotide DNA nearest-neighbor thermodynamics" PNAS 95:
      1460-1465.


Here is the new Tm method code:

=head2 Tm()

 Title   : Tm()
 Usage   : $tm = $primer->Tm(-salt=>'0.05')
 Function: Calculates and returns the Tm (melting temperature) of the primer
 Returns : A scalar containing the Tm.
 Args    : -salt sets the Na+ concentration (molar) on which to base the
           calculation. (A parameter should be added to allow the oligo
           concentration to be set.)
 Notes   : Calculation of Tm as per Allawi et al., Biochemistry 1997
           36:10581-10594.  Also see the documentation at
           http://biotools.idtdna.com/analyzer/ as they use this formula
           and have a couple of nice help pages.  These Tm values will be
           about 0.5-3 degrees off from those of the idtdna web tool.
           I don't know why.

=cut

sub Tm  {
    my ($self, %args) = @_;
    my $salt_conc = 0.05; #salt concentration (molar units)
    my $oligo_conc = 0.00000025; #oligo concentration (molar units)
    #accept object defined salt concentration
    if ($args{'-salt'}) {$salt_conc = $args{'-salt'}}
    #accept object defined oligo concentration
    #if ($args{'-oligo'}) {$oligo_conc = $args{'-oligo'}}
    my $seqobj = $self->seq();
    my $length = $seqobj->length();
    my $sequence = uc $seqobj->seq();
    my @dinucleotides;
    my $enthalpy;
    my $entropy;
    #Break sequence string into an array of all possible dinucleotides
    while ($sequence =~ /(.)(?=(.))/g) {
        push @dinucleotides, $1.$2;
    }
    #Build a hash with the thermodynamic values
    my %thermo_values = ('AA' => {'enthalpy' => -7.9,
                                  'entropy'  => -22.2},
                         'AC' => {'enthalpy' => -8.4,
                                  'entropy'  => -22.4},
                         'AG' => {'enthalpy' => -7.8,
                                  'entropy'  => -21},
                         'AT' => {'enthalpy' => -7.2,
                                  'entropy'  => -20.4},
                         'CA' => {'enthalpy' => -8.5,
                                  'entropy'  => -22.7},
                         'CC' => {'enthalpy' => -8,
                                  'entropy'  => -19.9},
                         'CG' => {'enthalpy' => -10.6,
                                  'entropy'  => -27.2},
                         'CT' => {'enthalpy' => -7.8,
                                  'entropy'  => -21},
                         'GA' => {'enthalpy' => -8.2,
                                  'entropy'  => -22.2},
                         'GC' => {'enthalpy' => -9.8,
                                  'entropy'  => -24.4},
                         'GG' => {'enthalpy' => -8,
                                  'entropy'  => -19.9},
                         'GT' => {'enthalpy' => -8.4,
                                  'entropy'  => -22.4},
                         'TA' => {'enthalpy' => -7.2,
                                  'entropy'  => -21.3},
                         'TC' => {'enthalpy' => -8.2,
                                  'entropy'  => -22.2},
                         'TG' => {'enthalpy' => -8.5,
                                  'entropy'  => -22.7},
                         'TT' => {'enthalpy' => -7.9,
                                  'entropy'  => -22.2},
                         'A' =>  {'enthalpy' => 2.3,
                                  'entropy'  => 4.1},
                         'C' =>  {'enthalpy' => 0.1,
                                  'entropy'  => -2.8},
                         'G' =>  {'enthalpy' => 0.1,
                                  'entropy'  => -2.8},
                         'T' =>  {'enthalpy' => 2.3,
                                  'entropy'  => 4.1}
                        );
    #Loop through dinucleotides and calculate cumulative enthalpy and
    #entropy values
    for (@dinucleotides) {
       $enthalpy += $thermo_values{$_}{enthalpy};
       $entropy += $thermo_values{$_}{entropy};
    }
    #Account for initiation parameters
    $enthalpy += $thermo_values{substr($sequence, 0, 1)}{enthalpy};
    $entropy += $thermo_values{substr($sequence, 0, 1)}{entropy};
    $enthalpy += $thermo_values{substr($sequence, -1, 1)}{enthalpy};
    $entropy += $thermo_values{substr($sequence, -1, 1)}{entropy};
    #Symmetry correction
    $entropy -= 1.4;
    my $r = 1.987; #molar gas constant in cal/(K*mol)
    #enthalpy is in kcal/mol, so scale by 1000 to match entropy in cal/(K*mol)
    my $tm = ($enthalpy * 1000 / ($entropy + ($r * log($oligo_conc))) -
              273.15 + (12 * (log($salt_conc)/log(10))));
    $self->{'Tm'}=$tm;
    return $tm;
}

From jason at cgt.duhs.duke.edu  Tue Feb 17 21:16:45 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Tue Feb 17 21:23:08 2004
Subject: [Bioperl-l] msf output
In-Reply-To: <4032C634.3000507@csiro.au>
References: <4032C634.3000507@csiro.au>
Message-ID: <Pine.LNX.4.50.0402172116180.3929-100000@tenero.duhs.duke.edu>

Done.  Thanks Wes!

-jason
On Wed, 18 Feb 2004, Wes Barris wrote:

> Hi,
>
> Msf files produced by StackPACK have coordinates listed above each
> group of sequence data.  Bioperl msf files do not have this.  I have
> tested a one-line addition that would fix this:
>
> Bio/AlignIO/msf.pm:
>          while( $count < $length ) {
>              # there is another block to go!
> +           $self->_print (sprintf("%22s%-27d%27d\n",' ',$count+1,$count+50));
>              foreach $name  ( @arr ) {
>                  $self->_print (sprintf("%-20s  ",$name));
>
> If there is a more formal way of submitting suggestions, please let
> me know.
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
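
The one-line patch quoted above is easy to check in isolation. What
follows is a minimal sketch in plain Perl with no BioPerl involved; the
helper name msf_coord_line is hypothetical, and only the sprintf format
string is taken from Wes's message.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the coordinate line that the patch prints
# above each 50-column block.  Only the format string comes from the patch.
sub msf_coord_line {
    my ($count) = @_;    # number of alignment columns already written
    # 22 spaces cover the name column ("%-20s  "), then the block's first
    # column is left-justified and its last column right-justified.
    return sprintf("%22s%-27d%27d\n", ' ', $count + 1, $count + 50);
}

print msf_coord_line($_) for (0, 50);
```

As in the patch, the right-hand number is always $count+50, even if the
final block of an alignment is shorter than 50 columns.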

From hlapp at gmx.net  Wed Feb 18 04:19:30 2004
From: hlapp at gmx.net (Hilmar Lapp)
Date: Wed Feb 18 04:25:46 2004
Subject: [Bioperl-l] New GO Parser and errors loading biosql database
In-Reply-To: <10C94843061E094A98C02EB77CFC328722FE05@nrcmrdex1d.imsb.nrc.ca>
Message-ID: <8E53E59D-61F3-11D8-97D5-000A959EB4C4@gmx.net>


On Tuesday, February 17, 2004, at 11:42  AM, Law, Annie wrote:

> Hi,
>
> I would appreciate help with the following.
> I have installed the newest bioperl-db and biosql schema from cvs.
> I tried to load the database with information from godatabase.org and 
> got
> some errors listed further below (the
> Tables did not fill at all).
> Next I tried to load the database with Locuslink data from NCBI.
>
> 1)I got the LL file from NCBI and tried to load an empty database with a
> LL_tmpl file (for human) and
> It seemed to load properly and the tables were filling up but then it
> stopped after about 900 bioentries.
> I'm not sure what went wrong.  There seem to be a complaint about 
> duplicate
> entry but I don't think I should
> Modify the source file.
>

It should never be necessary to modify the source file.

First of all, unless you're testing or debugging something and actually 
*want* to get thrown out upon the first error, you should always 
specify --safe, for load_ontology.pl as well as for 
load_seqdatabase.pl. This will roll back a sequence that fails to load, 
but will otherwise keep going.


> [root@ data]# perl /root/bioperl-db/scripts/biosql/load_seqdatabase.pl
> --dbuser=root
>  --dbpass=mss22 --dbname bioseqdb --namespace "LocusLink" -format 
> locuslink
> /var/lib/mysql/LL_
> _tmpl --dbpass=bioinf1 --dbname bioseqdb --namespace "LocusLink" 
> -format
> locuslink /var/lib/mysql/LL_tmpl
> Loading /var/lib/mysql/LL_tmpl ...
>
> -------------------- WARNING ---------------------
> MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values 
> were
> ("GO:0005699","kinetochore","","") FKs (6)
> Duplicate entry 'kinetochore-6' for key 2
> ---------------------------------------------------

This basically means that there is already another term 'kinetochore' 
in the same ontology, but with a different GO id. I.e., the look-up by 
GO id failed for this one, prompting the system to insert the term as a 
new one, which then (unexpectedly) fails too because of the unique key 
violation.

This is not atypical for annotation being a work in progress. Also, and 
actually more likely, you may have had remnants of previous data loads 
there. GO terms get merged with others, and then previously primary IDs 
either get retired or become secondary IDs. A database of annotated 
genes like LL may not be immediately up to date.
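
The failure mode described above can be mimicked with a toy model. This
is only a sketch under stated assumptions: two in-memory hashes stand in
for the term table, the function store_term is hypothetical, and
'GO:OLD_ID' is a made-up placeholder for whatever identifier an earlier
load used. None of it is the real bioperl-db code.

```perl
use strict;
use warnings;

# Toy stand-in for the term table: look-up is by identifier, but the
# unique key is on (name, ontology).  Not the real bioperl-db schema code.
my (%by_identifier, %by_name_and_ontology);

sub store_term {
    my ($identifier, $name, $ontology_id) = @_;
    # Step 1: look-up by identifier.
    return 'found' if exists $by_identifier{$identifier};
    # Step 2: not found, so insert; the unique key may still object.
    my $key = "$name-$ontology_id";
    die "Duplicate entry '$key'\n" if exists $by_name_and_ontology{$key};
    $by_identifier{$identifier} = $by_name_and_ontology{$key} = $name;
    return 'inserted';
}

# 'GO:OLD_ID' is a made-up placeholder for an id loaded earlier.
print store_term('GO:OLD_ID', 'kinetochore', 6), "\n";

# Same name and ontology, different id: look-up misses, insert collides.
my $second = eval { store_term('GO:0005699', 'kinetochore', 6) };
print defined $second ? "$second\n" : "failed: $@";
```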

Especially for LL the best thing is to always pre-load GO and other 
ontologies that sequences are associated (annotated) with. Also, it's 
not a bad idea to pre-load the NCBI taxonomy database using the script 
load_ncbi_taxonomy.pl in the biosql repository.

>
> 2) Updating GO parser
> I saw that the GO parser was updated recently and I have located the 
> code
> version 1.17.2.1 I downloaded the new version.  I am using bioperl 1.4.
> Should I just take the new dagflat.pm and replace the old one or are 
> there
> more steps involved?

Not really. There is also an updated test but you don't need that.

>   When I download whole Modules I need to use make
> commands.
>
> Also I saw that dagflat.pm requires graph.pm.

It's not actually dagflat that requires it but the OntologyEngineI 
implementation it populates behind the scenes 
(Bio::Ontology::SimpleGOEngine if you're curious).

But as a consequence of all this, yes if you use the dagflat parser 
(and goflat and soflat are basically just other names for the same 
parser) you do need Graph.pm from CPAN.

>   Is this graph.pm part of the
> bioperl 1.4 package I couldn't seem to find it or do I need to 
> download and
> install from CPAN.

You get it from CPAN. The name is Graph.pm. If the CPAN shell doesn't 
understand that, ask it to install Graph::Directed.


>   I searched CPAN for graph.pm and got several hits.  Is
> this the one I need? http://search.cpan.org/~mverb/GDGraph-1.43/
> Do I also need GD.pm? I think I saw somewhere that it is required?
> http://search.cpan.org/~lds/GD-2.11/GD.pm
> Although this could be a mistake
>

You do not need GD (or GDGraph or whatever) for bioperl-db.


> Where is the best place to install GD and graph.pm (with dagflat.pm or 
> the
> main perl library)?
> I'm not sure whether the main perl library is /usr/lib/perl5/5.8.0 or
> /usr/lib/perl5/site_perl/5.8.0/Bio

The CPAN shell will do that automatically. Also, if you just say 'make 
install' in a perl module's root source directory, it will be installed 
in the right place. The only thing to be careful about is to use the 
same perl for running 'perl Makefile.PL' that you otherwise use for 
running perl scripts.

>
>
> 3) I installed Bioperl-db and downloaded the biosql schema 
> successfully but
> when I tried to use the Load_ontology.pl I got some errors which seem 
> to be
> saying that I am missing some main modules such as goflat (I recorded a
> script of the output). But I have goflat.pm.
> Am I calling the perl script incorrectly?  Or are there still some 
> modules I
> need to install? I'm not sure that I am using the correct
> Syntax for the format field.

The reason it is failing is because you don't have Graph.pm installed 
as the stack trace states:

> Bio::OntologyIO: goflat cannot be found
> Exception
> ------------- EXCEPTION  -------------
> MSG: Failed to load module Bio::OntologyIO::goflat. Can't locate
> Graph/Directed.pm in @INC (@INC contains:

The initial message that goflat.pm cannot be found is just a (wrong in 
this case) interpretation of the failure to dynamically load the module.
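
A quick way to see which module is really missing is to try loading each
one and print perl's own error text. A minimal sketch; the helper name
load_error is hypothetical, and the two module names are simply the ones
mentioned in the stack trace above.

```perl
use strict;
use warnings;

# Returns an empty string if the module loads, otherwise the error text
# from require (which names the file perl could not find in @INC).
sub load_error {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? '' : $@;
}

for my $module ('Graph::Directed', 'Bio::OntologyIO::goflat') {
    my $err = load_error($module);
    print "$module: ", ($err eq '' ? "ok\n" : "FAILED\n  $err");
}
```

Run it with the same perl you use for the load scripts, since a
different perl may have a different @INC.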

	-hilmar

-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From lstein at cshl.edu  Wed Feb 18 04:23:45 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Wed Feb 18 04:45:36 2004
Subject: [Bioperl-l] Bio::DB::GFF::Aggregator problem, new wormbase models.
In-Reply-To: <BC57ED88.BB42%todd.harris@cshl.org>
References: <BC57ED88.BB42%todd.harris@cshl.org>
Message-ID: <200402181123.45491.lstein@cshl.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'll make a new prepackaged aggregator for this as soon as WormBase 
makes the final and long-anticipated transition to real genes.

Lincoln

On Wednesday 18 February 2004 12:10 am, Todd Harris wrote:
> Hi Phillip -
>
> You need to aggregate the separate parts of the CDS.  Create a
> wormbase_cds (or whatever you wish to call it), aggregating the
> following features using the CDS group: coding_exon,5_UTR,3_UTR.
>
> The following stanza should do the trick.
>
> $dbgff = Bio::DB::GFF->new(
>           -adaptor     => 'dbi::mysql',
>           -dsn         => 'dbi:mysql:database=your_database;host=your_host',
>           -aggregators => ['wormbase_cds{coding_exon,5_UTR,3_UTR/CDS}'],
>           -user        => 'your_username',
>           -pass        => 'your_dbgff_pass');
>
> This should do the trick for properly aggregating genes under the
> new WormBase CDS class.
>
> Todd Harris
>
> > On 2/17/04 2:18 PM, Philip MacMenamin wrote:
> >
> > I cannot get things to aggregate properly in the new wormbase
> > models. Previously this worked:
> > $aggregator = Bio::DB::GFF::Aggregator->new(-method    => 'test',
> >       -sub_parts => ['UTR','CDS:curated']
> >      );
> > (Or I could have more simply used the 'processed_transcript'
> > aggregator that Lincoln wrote.)
> >
> > And one feature, with the UTRs "aggregated" together with the
> > curated CDS's, was returned:
> > test:5_UTR(AH6.5), start->9524078 end->9526248
> > This could be fed to some kind of glyph, and would draw a nice
> > picture of UTRs hanging off a coding seq.
> >
> > [--------------]    [----------]  [---->
> > (You have to use your imagination a bit here)
> >
> > With the new models of wormbase, this is not the case, so I am
> > re-writing code to accommodate these changes.
> >
> > Now I am returned, for example, 3 features: a 3 prime UTR, a 5 prime
> > UTR, and a CDS.
> > test:UTR(5_UTR:AH6.5)9524078 9524086
> > test:UTR(3_UTR:AH6.5)9525782 9526248
> > test:curated(AH6.5)9524087 9525781
> >
> > The glyph then draws these things on two planes.
> > []                [------->
> > [---------]  [------------] [----------]
> >
> > My assumption is that a single feature can be rendered as a
> > single glyph. And if you have >1 feature, then >1 glyphs are
> > needed. Therefore, I assume the problem stems from my failure to
> > aggregate UTR and CDS features.
> >
> > I have noticed that the new SQL GFF Db used to have two fmethods
> > for UTRs, one for both 3 and 5 primes. The new one has only one.
> > There are other changes of course, but I think this is important.
> >
> > So, essentially I want to draw UTRs with CDS's attached... with
> > the new wormbase models. If anyone knows how to do this, i'd like
> > to hear from them.
> >
> > Thanks :)
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
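
What aggregation does to the three features in Philip's message can be
sketched without a database. This is a conceptual illustration only: the
helper aggregate_by_group is hypothetical and is not the
Bio::DB::GFF::Aggregator API; the coordinates are the ones quoted above.

```perl
use strict;
use warnings;

# The three features from the message above: [method, group, start, end].
my @features = (
    [ 'UTR',     'AH6.5', 9524078, 9524086 ],   # 5_UTR
    [ 'UTR',     'AH6.5', 9525782, 9526248 ],   # 3_UTR
    [ 'curated', 'AH6.5', 9524087, 9525781 ],   # CDS
);

# Hypothetical helper: merge all parts that share a group name into one
# span that remembers its sub-parts.
sub aggregate_by_group {
    my %agg;
    for my $f (@_) {
        my ($method, $group, $start, $end) = @$f;
        my $span = $agg{$group} ||= { start => $start, end => $end, parts => [] };
        $span->{start} = $start if $start < $span->{start};
        $span->{end}   = $end   if $end   > $span->{end};
        push @{ $span->{parts} }, $f;
    }
    return \%agg;
}

my $agg = aggregate_by_group(@features);
for my $group (sort keys %$agg) {
    printf "%s start->%d end->%d (%d parts)\n", $group,
        @{ $agg->{$group} }{qw(start end)}, scalar @{ $agg->{$group}{parts} };
}
```

The single span it reports, 9524078 to 9526248, is the same one the old
'test' aggregator returned, which is why one glyph could draw it.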

- -- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFAMy8h0CIvUP7P+AkRAtCSAJwPVPdXqs9rXSFYdCD8lVhsB/5wkACdEOrr
EwcJXiat61tP3F5XJXA1j+c=
=P0TB
-----END PGP SIGNATURE-----
From lstein at cshl.edu  Wed Feb 18 04:57:06 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Wed Feb 18 05:28:28 2004
Subject: [Bioperl-l] Re: BOSC 2004
In-Reply-To: <4031E4C5.5020409@egenetics.com>
References: <4031E4C5.5020409@egenetics.com>
Message-ID: <200402181157.06253.lstein@cshl.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

http://open-bio.org/bosc2004/

If you scroll down the open bio page, you'll see a huge banner 
advertisement for BOSC.  Maybe you have images turned off?

Best,

Lincoln

On Tuesday 17 February 2004 11:54 am, you wrote:
> Hi Lincoln
>
> Where can I find info about BOSC 2004? Seems there isn't anything
> on open-bio.org - I'm looking for info on deadlines for poster
> submissions, etc.
>
> Peter

- -- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFAMzby0CIvUP7P+AkRAi/ZAJ9x6zi5v3Ehe45S81wAjDXTNp4eqACeO4Vu
Relp13w9sW06zEVJcBpkFZ4=
=0YAy
-----END PGP SIGNATURE-----
From steve_chervitz at affymetrix.com  Wed Feb 18 05:27:57 2004
From: steve_chervitz at affymetrix.com (Steve Chervitz)
Date: Wed Feb 18 05:34:11 2004
Subject: [Bioperl-l] searchio scripts
In-Reply-To: <ADEFIHIJAHBGLBNJCIADMEAECEAA.rrouse@biomail.ucsd.edu>
References: <ADEFIHIJAHBGLBNJCIADMEAECEAA.rrouse@biomail.ucsd.edu>
Message-ID: <1EC43881-61FD-11D8-A988-000A95765236@affymetrix.com>

Looks like there was a change in the Root::IO.pm module that affects 
the way these scripts process command-line arguments. As of 
bioperl-1.303, the SearchIO::blast module appears to be unable to read 
data from STDIN or files listed in @ARGV. This affects the scripts in 
examples/searchio and scripts/searchio.

As a workaround, I'd recommend you iterate over @ARGV in your script 
and initialize the SearchIO object using the -file option to new(), as 
in:

while (my $file = shift @ARGV) {
     my $in = Bio::SearchIO->new( -format => 'blast',
                                  -file => $file
                                );
     while ( my $result = $in->next_result() ) {
         # process result...
     }
}

As far as tracking down the cause, I've pinpointed the following change 
in Bio::Root::IO::_readline():

     my $fh = $self->_fh or return;   # revision 1.50 (bioperl-1.303)

formerly this was:

     my $fh = $self->_fh || \*ARGV;   # revision 1.49 (bioperl-1.302)
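
The practical difference between the two lines can be shown with two
simplified stand-ins. This is a sketch only: the function names are
hypothetical, and the real Bio::Root::IO::_readline() does more than is
shown here.

```perl
use strict;
use warnings;

# Simplified stand-ins for the two revisions of _readline().
sub readline_rev_1_49 {
    my ($fh) = @_;
    $fh = $fh || \*ARGV;      # no handle: fall back to @ARGV files / STDIN
    return scalar <$fh>;
}

sub readline_rev_1_50 {
    my ($fh) = @_;
    $fh or return;            # no handle: give up, caller sees no input
    return scalar <$fh>;
}

# With an explicit handle both behave the same.
open my $in, '<', \"first line\n" or die "cannot open in-memory file: $!";
print readline_rev_1_50($in);

# Without one, the newer version returns nothing instead of reading ARGV.
my $line = readline_rev_1_50(undef);
print defined $line ? $line : "(no input)\n";
```

Note that the 1.49 stand-in, called with no handle and an empty @ARGV,
blocks waiting on STDIN. Whether that is related to the hanging test
described below is only a guess.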

This also appears to break SeqIO reading from STDIN. Try executing this 
at the top-level distribution dir for the 1.302 and 1.303 releases:

     perl -I. ./scripts/seq/translate_seq.PLS -format fasta < t/data/dna1.fa

According to Lincoln's commit log, the Root::IO::_readline() change was 
necessary to get the GFF, SeqFeature, and Registry regression tests 
working. I tested these tests with the 1.49 version of IO.pm and the 
only one that was affected was SeqFeature.t. Specifically, test #6 
which calls SeqFeature::Generic::gff_string() hangs and waits for input 
before proceeding. I'm not sure why this is... (getting late).

BTW, platforms tested: Perl 5.6.1 and 5.8.0 on Linux (RH9) and Perl 
5.8.1-RC3 on MacOS X (10.3.2).

Steve

On Feb 17, 2004, at 3:14 PM, Richard Rouse wrote:

> I recently installed bioperl-1.4 and am having problems with the blast 
> report
> parsers in /examples/searchio/
>
>
> When I run:
> perl hitwriter.pl blastreport
> I get:
>
> Using SearchIO->new()
>
> 0 Blast report(s) processed.
> Output sent to file: >hitwriter.out
>
> I get the same result with rawwriter.pl, hspwriter.pl and 
> custom_writer.pl
> although the htmlwriter.pl and the blast_example.pl work fine.
>
> Has anyone else encountered this problem and figured out how to fix it?
>
> Thanks,
>
> Richard

From Alexandre.Irrthum at icr.ac.uk  Wed Feb 18 07:31:27 2004
From: Alexandre.Irrthum at icr.ac.uk (Alexandre Irrthum)
Date: Wed Feb 18 07:37:50 2004
Subject: [Bioperl-l] AlignIO warning
Message-ID: <s0335b26.071@icr.ac.uk>

Hi there,

The snippet of code shown below works fine (with bioperl 1.4), but it
issues this warning when next_aln() is called:


-------------------- WARNING ---------------------
MSG: Must provide which type of BLAST was run (blastp,blastn, tblastn,
tblastx, blastx) if you want strand information to get set properly for
DNA query or subjects


#!/usr/bin/perl
 
use warnings;
use strict;
use Bio::Seq;
use Bio::Tools::Run::StandAloneBlast;
use Bio::AlignIO;
 
my $seq1 = Bio::Seq->new(-display_id => 'Sequence1', -seq =>
'AGGATAGGGCGGATAGGTAGCGCCGATTTACGCGATACGCG');
my $seq2 = Bio::Seq->new(-display_id => 'Sequence2', -seq =>
'AGGATAGGGCAGATAGGTAGCGCCGATTTACGTGATACGCG');
my $factory = Bio::Tools::Run::StandAloneBlast->new(program => 'blastn',
outfile => 'bl2seq.out');
$factory->bl2seq($seq1, $seq2);
my $str = Bio::AlignIO->new(-file => 'bl2seq.out', -format => 'bl2seq');
my $aln = $str->next_aln(); ###### Warning issued here ######
foreach my $seq ($aln->each_seq()) {
    print $seq->seq(), "\n";
}

How am I supposed to provide program name ?

Thank you for your help.

Alex 
From lstein at cshl.edu  Wed Feb 18 07:57:24 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Wed Feb 18 08:04:17 2004
Subject: [Bioperl-l] searchio scripts
In-Reply-To: <1EC43881-61FD-11D8-A988-000A95765236@affymetrix.com>
References: <ADEFIHIJAHBGLBNJCIADMEAECEAA.rrouse@biomail.ucsd.edu>
	<1EC43881-61FD-11D8-A988-000A95765236@affymetrix.com>
Message-ID: <200402181457.24179.lstein@cshl.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Or do this:

	my $in = Bio::SearchIO->new(-format=>'blast',-fh=>\*ARGV);
	while (my $result = $in->next_result()) {
		...
	}

That might even be easier.

Lincoln

On Wednesday 18 February 2004 12:27 pm, Steve Chervitz wrote:
> Looks like there was a change in the Root::IO.pm module that
> affects the way these scripts process command-line arguments. As of
> bioperl-1.303, the SearchIO::blast module appears to be unable to
> read data from STDIN or files listed in @ARGV. This affects the
> scripts in examples/searchio and scripts/searchio.
>
> As a workaround, I'd recommend you iterate over @ARGV in your
> script and initialize the SearchIO object using the -file option to
> new(), as in:
>
> while (my $file = shift @ARGV) {
>      my $in = Bio::SearchIO->new( -format => 'blast',
>                                   -file => $file
>                                 );
>      while ( my $result = $in->next_result() ) {
>          # process result...
>      }
> }
>
> As far as tracking down the cause, I've pinpointed the following
> change in Bio::Root::IO::_readline():
>
>      my $fh = $self->_fh or return;   # revision 1.50 (bioperl-1.303)
>
> formerly this was:
>
>      my $fh = $self->_fh || \*ARGV;   # revision 1.49 (bioperl-1.302)
>
> This also appears to break SeqIO reading from STDIN. Try executing
> this at the top-level distribution dir for the 1.302 and 1.303
> releases:
>
>      perl -I. ./scripts/seq/translate_seq.PLS -format fasta < t/data/dna1.fa
>
> According to Lincoln's commit log, the Root::IO::_readline() change
> was necessary to get the GFF, SeqFeature, and Registry regression
> tests working. I tested these tests with the 1.49 version of IO.pm
> and the only one that was affected was SeqFeature.t. Specifically,
> test #6 which calls SeqFeature::Generic::gff_string() hangs and
> waits for input before proceeding. I'm not sure why this is...
> (getting late).
>
> BTW, platforms tested: Perl 5.6.1 and 5.8.0 on Linux (RH9) and Perl
> 5.8.1-RC3 on MacOS X (10.3.2).
>
> Steve
>
> On Feb 17, 2004, at 3:14 PM, Richard Rouse wrote:
> > I recently installed bioperl-1.4 and am having problems with the
> > blast report
> > parsers in /examples/searchio/
> >
> >
> > When I run:
> > perl hitwriter.pl blastreport
> > I get:
> >
> > Using SearchIO->new()
> >
> > 0 Blast report(s) processed.
> > Output sent to file: >hitwriter.out
> >
> > I get the same result with rawwriter.pl, hspwriter.pl and
> > custom_writer.pl
> > although the htmlwriter.pl and the blast_example.pl work fine.
> >
> > Has anyone else encountered this problem and figured out how to
> > fix it?
> >
> > Thanks,
> >
> > Richard

- -- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFAM2E00CIvUP7P+AkRAurTAJ9gwb4Os0M5uDWhlE40JphLRIAG+gCfQ5Ji
zXHLGwtfDAB2Np2nKBZkuw0=
=IsKs
-----END PGP SIGNATURE-----
From sdavis2 at mail.nih.gov  Wed Feb 18 08:17:37 2004
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed Feb 18 08:13:35 2004
Subject: [Bioperl-l] Blast results question
Message-ID: <BC58D021.4A28%sdavis2@mail.nih.gov>

I have a large number of blast results against the (human) genome.  The
query was a large number of oligos (from microarray) taken from various ESTs
or full-length transcripts.  I have many that are "broken" by splice sites
in the genome resulting in two different "hits" near each other.  Is there
someone who has code or suggestions about how to "stitch" these hits back
together? 

Thanks,
Sean

From nathanhaigh at ukonline.co.uk  Wed Feb 18 08:39:36 2004
From: nathanhaigh at ukonline.co.uk (Nathan Haigh)
Date: Wed Feb 18 08:45:58 2004
Subject: [Bioperl-l] gap characters in SimpleAlign objects
Message-ID: <!~!UENERkVCMDkAAQACAAAAAAAAAAAAAAAAABgAAAAAAAAA+EsXQZcrCEGeBpZF7/IE7sKAAAAQAAAA1stSz41OY0iZy5H5SfNn5QEAAAAA@ukonline.co.uk>

I've been using the clustalw module for creating alignments, and I've just
realised that when you output the alignment the gap character is a "." not a
"-".
This is most annoying because I am adding support to this module for
generating trees via clustalw, and clustalw removes these "." characters. Is
there a method for changing these gap characters to "-"? I have seen the
gap_char method in the SimpleAlign module, but this seems only to designate
a particular character as a gap character, and does not actually change the
character.

Any ideas on how to do this substitution? And where in BioPerl does this
assignment get made in the first place, since the default gap char for
clustalw output is "-" not "."?

Thanks
Nathan


From john.herbert at clinical-pharmacology.oxford.ac.uk  Wed Feb 18 08:48:34 2004
From: john.herbert at clinical-pharmacology.oxford.ac.uk (john herbert)
Date: Wed Feb 18 08:55:02 2004
Subject: [Bioperl-l] Bio::SeqFeature::Primer Calculating the
	Primer	TM
Message-ID: <s0336d3e.042@gwmail.jr2.ox.ac.uk>

Hello All.
Thanks for all your suggestions and code. I did have a go myself but it
was not until Barry supplied this code that I actually got it.
Kind regards,
John.

>>> Barry Moore <barry.moore@genetics.utah.edu> 18/02/2004 00:23:01
>>>
John, Rob and others,

Well, I certainly am not a Tm guru, but I'll reply nonetheless.  I have
written a Tm calculator that uses thermodynamic parameters rather than
"rule of thumb" calculations.  Mine follows the formula (and
modifications) described by Integrated DNA Technologies on their web site
<http://www.idtdna.com/program/techbulletins/Calculating_Tm_(melting_temperature).asp>.
This code uses the equation: Tm = dH / (dS + R * ln(C)) - 273.15, found
in Breslauer (1986) and apparently used by GCG and the Python guys, where
R is the molar gas constant and C is the molar concentration of the
oligo.  It adds to that an adjustment for Na+ concentration as per
SantaLucia (1996), with some additional tweaking as described on the IDT
web page above, to give: Tm = dH / (dS + R * ln(C)) - 273.15 + 12.0 *
log10[Na+].  It uses the nearest-neighbor thermodynamic parameter set of
Allawi and SantaLucia (1997), but it looks like maybe it should be
updated to the SantaLucia (1998) parameter set.

I haven't read all the papers discussed in the various posts today, only
the couple that my code is based on (and had plenty of trouble
understanding all that was in those!), so I don't want to imply that the
equations that I use are based on a thorough review of oligo
thermodynamics literature, but the code seems to work, and it gives good
Tm values.  Theoretically I should get the exact same values as IDT's web
calculator, but I don't.  My values are always very close, but off by a
fraction to a couple degrees.  It may be due to a difference in
parameter sets, although I'm using the same one that IDT references on
their site.

Rob, I morphed my code into your Primer Tm method, and tried it out.  It
seems to work fine.  It requires one extra parameter (oligo
concentration) that I just defaulted.  If you want to use the code as is
(or as a starting point) it is yours to do with as you see fit.  I can
update the "thermodynamic parameters" hash to the SantaLucia (1998)
values if this code looks promising and there is general agreement those
values are the better.  I don't have CVS access, so I'll just post the
modified method code at the very end of this message.

I did a quick and dirty test to see how Tm values differ between your
Primer Tm method, my code, and IDT's web calculator.  They tend towards
Tm(Rob) > Tm(IDT) > Tm(Barry).  Here's the result:

Oligo                                               Primer.pm (Rob)  Primer.pm (Barry)  IDT
ACCGATACCG                                          34.49709793      29.41129054        31.3
ACCCGATCTAGTAGA                                     49.03043126      41.9210458         43.3
CATGGAGAGGGTGCAAATCC                                62.44709793      55.72210633        56.8
AAAGTAACCGAGAGAATCTGGAACA                           62.29709793      56.7940798         57.7
GGCTTTTGAAGTGGCAGAAAGACTGGGGGT                      71.76376459      67.19994018        68
CACTCGCCTGCTGGATGCAGAAGATGTGGATGTGC                 76.18281221      70.5586708         71.2
CTCTCCAGATGAAAAGTCTGTAATCACTTATGTGTCTTCG            71.29709793      63.54486627        64.1
ATTTATGATGCCTTCCCTAAAGTTCCTGAGGGTGGAGAAGGGATC       75.69709793      69.5961597         70.1
AGTGCTACGGAAGTGGACTCCAGGTGGCAAGAATACCAAAGCCGAGTGGA  80.03709793      75.41693338        75.9


Here are the papers I've referenced:

    * Breslauer, K.J., Frank, R., Blocker, H., Marky, L.A. (1986)
      "Predicting DNA duplex stability from the base sequence."
      Proc. Natl. Acad. Sci. USA 83:3746-3750.
    * SantaLucia, J. Jr., Allawi, H.T., Seneviratne, P.A. (1996)
      "Improved nearest-neighbor parameters for predicting DNA duplex
      stability." Biochemistry 35:3555-3562.
    * Allawi, H.T., SantaLucia, J. Jr. (1997) "Thermodynamics and NMR of
      internal G.T mismatches in DNA." Biochemistry 36:10581-10594.
    * SantaLucia, J. Jr. (1998) "A unified view of polymer, dumbbell, and
      oligonucleotide DNA nearest-neighbor thermodynamics." PNAS
      95:1460-1465.


Here is the new Tm method code:

=head2 Tm()

 Title   : Tm()
 Usage   : $tm = $primer->Tm(-salt=>'0.05')
 Function: Calculates and returns the Tm (melting temperature) of the primer
 Returns : A scalar containing the Tm.
 Args    : -salt set the Na+ concentration on which to base the calculation.
           (A parameter should be added to allow the oligo concentration
           to be set.)
 Notes   : Calculation of Tm as per Allawi et al., Biochemistry 1997
           36:10581-10594.  Also see documentation at
           http://biotools.idtdna.com/analyzer/ as they use this formula and
           have a couple of nice help pages.  These Tm values will be about
           0.5-3 degrees off from those of the idtdna web tool.  I don't
           know why.

=cut

sub Tm  {
    my ($self, %args) = @_;
    my $salt_conc = 0.05; #salt concentration (molar units)
    my $oligo_conc = 0.00000025; #oligo concentration (molar units)
    if ($args{'-salt'}) {$salt_conc = $args{'-salt'}} #accept object defined salt concentration
    #if ($args{'-oligo'}) {$oligo_conc = $args{'-oligo'}} #accept object defined oligo concentration
    my $seqobj = $self->seq();
    my $length = $seqobj->length();
    my $sequence = uc $seqobj->seq();
    my @dinucleotides;
    my $enthalpy;
    my $entropy;
    #Break sequence string into an array of all possible dinucleotides
    while ($sequence =~ /(.)(?=(.))/g) {
        push @dinucleotides, $1.$2;
    }
    #Build a hash with the thermodynamic values
    my %thermo_values = ('AA' => {'enthalpy' => -7.9,
                                  'entropy'  => -22.2},
                         'AC' => {'enthalpy' => -8.4,
                                  'entropy'  => -22.4},
                         'AG' => {'enthalpy' => -7.8,
                                  'entropy'  => -21},
                         'AT' => {'enthalpy' => -7.2,
                                  'entropy'  => -20.4},
                         'CA' => {'enthalpy' => -8.5,
                                  'entropy'  => -22.7},
                         'CC' => {'enthalpy' => -8,
                                  'entropy'  => -19.9},
                         'CG' => {'enthalpy' => -10.6,
                                  'entropy'  => -27.2},
                         'CT' => {'enthalpy' => -7.8,
                                  'entropy'  => -21},
                         'GA' => {'enthalpy' => -8.2,
                                  'entropy'  => -22.2},
                         'GC' => {'enthalpy' => -9.8,
                                  'entropy'  => -24.4},
                         'GG' => {'enthalpy' => -8,
                                  'entropy'  => -19.9},
                         'GT' => {'enthalpy' => -8.4,
                                  'entropy'  => -22.4},
                         'TA' => {'enthalpy' => -7.2,
                                  'entropy'  => -21.3},
                         'TC' => {'enthalpy' => -8.2,
                                  'entropy'  => -22.2},
                         'TG' => {'enthalpy' => -8.5,
                                  'entropy'  => -22.7},
                         'TT' => {'enthalpy' => -7.9,
                                  'entropy'  => -22.2},
                         'A' =>  {'enthalpy' => 2.3,
                                  'entropy'  => 4.1},
                         'C' =>  {'enthalpy' => 0.1,
                                  'entropy'  => -2.8},
                         'G' =>  {'enthalpy' => 0.1,
                                  'entropy'  => -2.8},
                         'T' =>  {'enthalpy' => 2.3,
                                  'entropy'  => 4.1}
                        );
    #Loop through dinucleotides and calculate cumulative enthalpy and entropy values
    for (@dinucleotides) {
       $enthalpy += $thermo_values{$_}{enthalpy};
       $entropy += $thermo_values{$_}{entropy};
    }
    #Account for initiation parameters
    $enthalpy += $thermo_values{substr($sequence, 0, 1)}{enthalpy};
    $entropy += $thermo_values{substr($sequence, 0, 1)}{entropy};
    $enthalpy += $thermo_values{substr($sequence, -1, 1)}{enthalpy};
    $entropy += $thermo_values{substr($sequence, -1, 1)}{entropy};
    #Symmetry correction
    $entropy -= 1.4;
    my $r = 1.987; #molar gas constant
    my $tm = ($enthalpy * 1000 / ($entropy + ($r * log($oligo_conc)))
              - 273.15 + (12 * (log($salt_conc)/log(10))));
    $self->{'Tm'}=$tm;
    return $tm;
}
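
For anyone who wants to try the calculation without a Primer object, the
method above can be restated as a plain function. This is a sketch, not
the BioPerl API: the name nn_tm, its argument order, and the input check
are assumptions added here; the parameter values, defaults and formula
are copied from the method above.

```perl
use strict;
use warnings;

# Nearest-neighbor values copied from the %thermo_values hash above:
# [enthalpy in kcal/mol, entropy in cal/(K*mol)].
my %nn = (
    AA => [-7.9, -22.2], AC => [-8.4, -22.4], AG => [-7.8, -21.0], AT => [-7.2, -20.4],
    CA => [-8.5, -22.7], CC => [-8.0, -19.9], CG => [-10.6, -27.2], CT => [-7.8, -21.0],
    GA => [-8.2, -22.2], GC => [-9.8, -24.4], GG => [-8.0, -19.9], GT => [-8.4, -22.4],
    TA => [-7.2, -21.3], TC => [-8.2, -22.2], TG => [-8.5, -22.7], TT => [-7.9, -22.2],
    A  => [2.3, 4.1],    C  => [0.1, -2.8],   G  => [0.1, -2.8],   T  => [2.3, 4.1],
);

# Hypothetical standalone version of the method; argument order is an assumption.
sub nn_tm {
    my ($sequence, $salt_conc, $oligo_conc) = @_;
    $salt_conc  = 0.05       unless defined $salt_conc;     # mol/l Na+
    $oligo_conc = 0.00000025 unless defined $oligo_conc;    # mol/l oligo
    $sequence = uc $sequence;
    die "sequence must contain only A, C, G, T\n"
        unless $sequence =~ /\A[ACGT]{2,}\z/;

    my ($enthalpy, $entropy) = (0, 0);
    # Overlapping dinucleotides plus the two terminal bases (initiation).
    for my $key ((map { substr($sequence, $_, 2) } 0 .. length($sequence) - 2),
                 substr($sequence, 0, 1), substr($sequence, -1, 1)) {
        $enthalpy += $nn{$key}[0];
        $entropy  += $nn{$key}[1];
    }
    $entropy -= 1.4;    # symmetry correction, applied unconditionally as above
    my $r = 1.987;      # gas constant in cal/(K*mol)
    # Enthalpy is scaled from kcal to cal; salt term uses log base 10.
    return $enthalpy * 1000 / ($entropy + $r * log($oligo_conc))
         - 273.15 + 12 * log($salt_conc) / log(10);
}

printf "%.2f\n", nn_tm('ACCGATACCG');
```

With the default concentrations, the first oligo in the table should
come out at the value listed in the Primer.pm (Barry) column.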

From birney at ebi.ac.uk  Wed Feb 18 08:54:30 2004
From: birney at ebi.ac.uk (Ewan Birney)
Date: Wed Feb 18 09:00:46 2004
Subject: [Bioperl-l] gap characters in SimpleAlign objects
In-Reply-To: <!~!UENERkVCMDkAAQACAAAAAAAAAAAAAAAAABgAAAAAAAAA+EsXQZcrCEGeBpZF7/IE7sKAAAAQAAAA1stSz41OY0iZy5H5SfNn5QEAAAAA@ukonline.co.uk>
Message-ID: <Pine.LNX.4.44.0402181352250.28969-100000@pigeon.ebi.ac.uk>

On Wed, 18 Feb 2004, Nathan Haigh wrote:

> I've been using the clustalw module for creating alignment, and I've just
> realised that when you output the alignment the gap character is a "." not a
> "-".
> This is most annoying because I am adding support to this module for
> generating trees via clustalw, and clustalw removes these "." characters. Is
> there a method for changing these gap characters to "-". I have seen the
> gap_char method in the SimpleAlign module, but this seems only to designate
> a particular character as a gap character, and does not actually change the
> character.
> 
> Any ideas on how to do this substitution, and where in BioPerl does this
> assignment get made in the first place, since the default gap char for
> clustalw output is "-" not "."

To fix (short term): Loop over the sequences making a new SimpleAlign
object with LocatableSeqs and s/\./-/g on the seq strings
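
The string part of that loop can be sketched on its own. This is a
sketch with plain strings standing in for the sequence strings; the ids,
residues and the helper name dots_to_dashes are invented for
illustration, and building the new SimpleAlign from LocatableSeq objects
is left out.

```perl
use strict;
use warnings;

# Plain strings stand in for the sequence strings of the aligned
# sequences; ids and residues here are invented for illustration.
my %aligned = (
    seq1 => 'ACG..TACGT',
    seq2 => 'ACGTTTA.GT',
);

# Returns a new hash with every '.' replaced by '-'; input is untouched.
sub dots_to_dashes {
    my (%seqs) = @_;
    my %fixed;
    for my $id (keys %seqs) {
        (my $str = $seqs{$id}) =~ tr/./-/;
        $fixed{$id} = $str;
    }
    return %fixed;
}

my %fixed = dots_to_dashes(%aligned);
print "$_ $fixed{$_}\n" for sort keys %fixed;
```

tr/./-/ replaces every dot in the string, which is the same effect as a
substitution with the /g modifier.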


How are you reading in Clustalw alignments? The Bio::AlignIO::clustalw 
doesn't touch the gap characters:


    foreach my $name ( sort { $order{$a} <=> $order{$b} } keys %alignments ) {
        if( $name =~ /(\S+):(\d+)-(\d+)/ ) {
            ($sname,$start,$end) = ($1,$2,$3);      
        } else {
            ($sname, $start) = ($name,1);
            my $str  = $alignments{$name};
            $str =~ s/[^A-Za-z]//g;
            $end = length($str);
        }
        my $seq = new Bio::LocatableSeq('-seq'   => $alignments{$name},
                                         '-id'    => $sname,
                                         '-start' => $start,
                                         '-end'   => $end);
 

($alignments{$name} has no regex put on it earlier either)


> 
> Thanks
> Nathan

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------

From jason at cgt.duhs.duke.edu  Wed Feb 18 09:09:02 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb 18 09:15:19 2004
Subject: [Bioperl-l] get sequence failure (fwd)
Message-ID: <Pine.LNX.4.50.0402180909001.8084-100000@tenero.duhs.duke.edu>


--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

---------- Forwarded message ----------
Date: Wed, 18 Feb 2004 09:42:21 +0100 (CET)
From: "[iso-8859-1] william ritchie" <billthebrute@yahoo.fr>
To: Jason Stajich <jason@cgt.duhs.duke.edu>
Subject: Re: [Bioperl-l] get sequence failure

Sorry, even when using refseq, I'm getting ID unknown!!
Thanks



From jason at cgt.duhs.duke.edu  Wed Feb 18 09:14:18 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb 18 09:21:09 2004
Subject: [Bioperl-l] gap characters in SimpleAlign objects
In-Reply-To: <!~!UENERkVCMDkAAQACAAAAAAAAAAAAAAAAABgAAAAAAAAA+EsXQZcrCEGeBpZF7/IE7sKAAAAQAAAA1stSz41OY0iZy5H5SfNn5QEAAAAA@ukonline.co.uk>
References: <!~!UENERkVCMDkAAQACAAAAAAAAAAAAAAAAABgAAAAAAAAA+EsXQZcrCEGeBpZF7/IE7sKAAAAQAAAA1stSz41OY0iZy5H5SfNn5QEAAAAA@ukonline.co.uk>
Message-ID: <Pine.LNX.4.50.0402180913040.8084-100000@tenero.duhs.duke.edu>

$aln->map_chars('\.','-');

On Wed, 18 Feb 2004, Nathan Haigh wrote:

> I've been using the clustalw module for creating alignment, and I've just
> realised that when you output the alignment the gap character is a "." not a
> "-".
> This is most annoying because I am adding support to this module for
> generating trees via clustalw, and clustalw removes these "." characters. Is
> there a method for changing these gap characters to "-". I have seen the
> gap_char method in the SimpleAlign module, but this seems only to designate
> a particular character as a gap character, and does not actually change the
> character.
>
> Any ideas on how to do this substitution, and where in BioPerl does this
> assignment get made in the first place, since the default gap char for
> clustalw output is "-" not "."
>
> Thanks
> Nathan
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From jason at cgt.duhs.duke.edu  Wed Feb 18 09:15:27 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb 18 09:21:51 2004
Subject: [Bioperl-l] gap characters in SimpleAlign objects
In-Reply-To: <Pine.LNX.4.44.0402181352250.28969-100000@pigeon.ebi.ac.uk>
References: <Pine.LNX.4.44.0402181352250.28969-100000@pigeon.ebi.ac.uk>
Message-ID: <Pine.LNX.4.50.0402180914410.8084-100000@tenero.duhs.duke.edu>

It's easier than this. I'm not sure why gaps are becoming '.', but I had to
work around this in other places as well (Coordinate::Pair).
 $aln->map_chars('\.','-')
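
For illustration only, here is roughly what that call is expected to do to
the underlying sequence strings, sketched in plain Perl without BioPerl. It
assumes the first argument is used as a regular expression and that every
match is replaced; the helper name map_gap_chars and the sample sequences
are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: apply a from->to substitution to every sequence string,
# the way the map_chars call above is expected to behave.
sub map_gap_chars {
    my ($from, $to, @seqs) = @_;
    return map { (my $s = $_) =~ s/$from/$to/g; $s } @seqs;
}

# Made-up aligned sequences using '.' as the gap character.
my @fixed = map_gap_chars('\.', '-', 'ACG..TA', '..GTTCA');
print "$_\n" for @fixed;
```

Note that the dot has to be escaped, otherwise the pattern matches every
character in the sequence.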

--jason

On Wed, 18 Feb 2004, Ewan Birney wrote:

> On Wed, 18 Feb 2004, Nathan Haigh wrote:
>
> > I've been using the clustalw module for creating alignment, and I've just
> > realised that when you output the alignment the gap character is a "." not a
> > "-".
> > This is most annoying because I am adding support to this module for
> > generating trees via clustalw, and clustalw removes these "." characters. Is
> > there a method for changing these gap characters to "-". I have seen the
> > gap_char method in the SimpleAlign module, but this seems only to designate
> > a particular character as a gap character, and does not actually change the
> > character.
> >
> > Any ideas on how to do this substitution, and where in BioPerl does this
> > assignment get made in the first place, since the default gap char for
> > clustalw output is "-" not "."
>
> To fix (short term): Loop over the sequences making a new SimpleAlign
> object with LocatableSeqs and s/\./-/ on the seq strings
>
>
>
> How are you reading in Clustalw alignments? The Bio::AlignIO::clustalw
> doesn't touch the gap characters:
>
>
>     foreach my $name ( sort { $order{$a} <=> $order{$b} } keys %alignments ) {
>         if( $name =~ /(\S+):(\d+)-(\d+)/ ) {
>             ($sname,$start,$end) = ($1,$2,$3);
>         } else {
>             ($sname, $start) = ($name,1);
>             my $str  = $alignments{$name};
>             $str =~ s/[^A-Za-z]//g;
>             $end = length($str);
>         }
>         my $seq = new Bio::LocatableSeq('-seq'   => $alignments{$name},
>                                          '-id'    => $sname,
>                                          '-start' => $start,
>                                          '-end'   => $end);
>
>
>
> ($alignments{$name} has no regex put on it earlier either)
>
>
>
> >
> > Thanks
> > Nathan
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
>
> -----------------------------------------------------------------
> Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
> <birney@ebi.ac.uk>.
> -----------------------------------------------------------------
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From jason at cgt.duhs.duke.edu  Wed Feb 18 09:18:43 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb 18 09:25:05 2004
Subject: [Bioperl-l] Blast results question
In-Reply-To: <BC58D021.4A28%sdavis2@mail.nih.gov>
References: <BC58D021.4A28%sdavis2@mail.nih.gov>
Message-ID: <Pine.LNX.4.50.0402180915360.8084-100000@tenero.duhs.duke.edu>

You might want to try using est2genome/sim4/spidey/exonerate on the
region where you have your multiple hits if you care about the splice
sites/predicting a gene structure.  Use the start & end methods from a
Search::Hit object to get the min/max location of the hits in the hit
sequence $hit->start('hit'),$hit->end('hit'), extract this seq, and run
one of the EST->genome aligners.  Or are you just trying to locate this
approximate region of the genome in the first place?
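
As a rough sketch of the "min/max location" step, the same calculation can
be done by hand on plain coordinate pairs. The HSP coordinates, the flank
size, and the helper name hit_region are all made up for illustration; with
real results the numbers would come from the hit object as described above.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical HSP coordinates on the hit (genomic) sequence: [start, end].
my @hsps = ( [ 15_200, 15_224 ], [ 16_010, 16_034 ] );

# Smallest start and largest end over all HSPs, padded by a made-up flank,
# give the region to extract and hand to an EST->genome aligner.
sub hit_region {
    my ($flank, @pairs) = @_;
    my $start = min(map { $_->[0] } @pairs) - $flank;
    my $end   = max(map { $_->[1] } @pairs) + $flank;
    $start = 1 if $start < 1;    # do not run off the start of the sequence
    return ($start, $end);
}

my ($start, $end) = hit_region(500, @hsps);
print "extract $start..$end\n";
```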

-jason

On Wed, 18 Feb 2004, Sean Davis wrote:

> I have a large number of blast results against the (human) genome.  The
> query was a large number of oligos (from microarray) taken from various ESTs
> or full-length transcripts.  I have many that are "broken" by splice sites
> in the genome resulting in two different "hits" near each other.  Is there
> someone who has code or suggestions about how to "stitch" these hits back
> together?
>
> Thanks,
> Sean
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From sdavis2 at mail.nih.gov  Wed Feb 18 09:33:35 2004
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed Feb 18 09:29:38 2004
Subject: [Bioperl-l] Blast results question
In-Reply-To: <Pine.LNX.4.50.0402180915360.8084-100000@tenero.duhs.duke.edu>
Message-ID: <BC58E1EF.4A3F%sdavis2@mail.nih.gov>

On 2/18/04 9:18 AM, "Jason Stajich" <jason@cgt.duhs.duke.edu> wrote:

> You might want to try using est2genome/sim4/spidey/exonerate on the
> region where you have your multiple hits if you care about the splice
> sites/predicting a gene structure.  use the start & end methods from a
> Search::Hit object to get the min/max location of the hits in the hit
> sequence $hit->start('hit'),$hit->end('hit'), extract this seq, and run
> one of the EST->genome aligners.  Or are you just trying to locate this
> approximate region of the genome in the first place?

You guessed it with the last question.  To clarify, what I need to do is to
determine whether there is a "best" hit against the genome.  I am interested
in hits with just a couple of mismatches or fewer, but over the entire
length of the query sequence.  Therefore, I need to know when I have a query
hitting to a piece of genomic sequence where the only discontinuity is due
to a splice site.  
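
One way to express that test, as a minimal sketch in plain Perl: treat each
HSP as a pair of query and hit coordinates and check that the pieces tile
the whole query and keep the same order on the genome. The hash keys, the
helper name, and the sample coordinates are hypothetical, and this strict
version ignores mismatches, minus-strand hits, and the few bases of overlap
that BLAST often reports at a splice boundary.

```perl
use strict;
use warnings;

# Each HSP: { qs, qe, hs, he } = query start/end, hit start/end (plus strand only).
# Returns true when the HSPs tile the whole query end to end and appear in the
# same order on the genome, i.e. the only discontinuity is an intron-like gap.
sub is_spliced_full_length {
    my ($query_len, @hsps) = @_;
    return 0 unless @hsps;
    my @sorted = sort { $a->{qs} <=> $b->{qs} } @hsps;
    return 0 unless $sorted[0]{qs} == 1 && $sorted[-1]{qe} == $query_len;
    for my $i (1 .. $#sorted) {
        my ($prev, $cur) = @sorted[ $i - 1, $i ];
        return 0 unless $cur->{qs} == $prev->{qe} + 1;   # contiguous on the query
        return 0 unless $cur->{hs} >  $prev->{he};       # colinear on the genome
    }
    return 1;
}

# Made-up 50-mer oligo split by an intron into two hits.
my @hsps = (
    { qs => 1,  qe => 30, hs => 1000, he => 1029 },
    { qs => 31, qe => 50, hs => 2500, he => 2519 },
);
print is_spliced_full_length(50, @hsps) ? "stitchable\n" : "not stitchable\n";
```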

> -jason
> 
> On Wed, 18 Feb 2004, Sean Davis wrote:
> 
>> I have a large number of blast results against the (human) genome.  The
>> query was a large number of oligos (from microarray) taken from various ESTs
>> or full-length transcripts.  I have many that are "broken" by splice sites
>> in the genome resulting in two different "hits" near each other.  Is there
>> someone who has code or suggestions about how to "stitch" these hits back
>> together?
>> 
>> Thanks,
>> Sean
>> 
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l@portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>> 
> 
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
> 

From jason at cgt.duhs.duke.edu  Wed Feb 18 09:33:46 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb 18 09:40:06 2004
Subject: [Bioperl-l] AlignIO warning
In-Reply-To: <s0335b26.071@icr.ac.uk>
References: <s0335b26.071@icr.ac.uk>
Message-ID: <Pine.LNX.4.50.0402180918520.8084-100000@tenero.duhs.duke.edu>

Short answer: add -report_type => 'reporttype'. I recognize documentation
was lacking there; I am fixing that.  Now that SearchIO can parse bl2seq
reports, the AlignIO parser is just a shortcut convenience for folks; if you
look at the code you'll see this is the case.

my $str = Bio::AlignIO->new(-file => 'bl2seq.out', -format => 'bl2seq', -report_type => 'blastn');


However, you don't have to create the output file and re-feed it to AlignIO.
If you have passed in
'_READMETHOD' => 'BLAST' (which is the default for 1.4 StandAloneBlast)
when initializing the factory object, then you get back a SearchIO object
for bl2seq and blast alignment runs:

my $searchio = $factory->bl2seq($seq1,$seq2);
my $r = $searchio->next_result;
my $hit = $r->next_hit;
for my $hsp ( $hit->hsps ) {
 my $aln = $hsp->get_aln;  # Bio::SimpleAlign object
}

In the spirit of TMTOWTDI, you can also get a Bio::Tools::BPbl2seq parser if
you pass in 'BPlite' to _READMETHOD in StandAloneBlast and get back an object
with a slightly different API. This is the old way of parsing these reports,
now superseded by SearchIO.

-jason
On Wed, 18 Feb 2004, Alexandre Irrthum wrote:

> Hi there,
>
> The snippet of code shown below works fine (with bioperl 1.4), but it
> issues this warning when next_aln() is called:
>
>
> -------------------- WARNING ---------------------
> MSG: Must provide which type of BLAST was run (blastp,blastn, tblastn,
> tblastx, blastx) if you want strand information to get set properly for
> DNA query or subjects
>
>
>
> #!/usr/bin/perl
>
> use warnings;
> use strict;
> use Bio::Seq;
> use Bio::Tools::Run::StandAloneBlast;
> use Bio::AlignIO;
>
> my $seq1 = Bio::Seq->new(-display_id => 'Sequence1', -seq =>
> 'AGGATAGGGCGGATAGGTAGCGCCGATTTACGCGATACGCG');
> my $seq2 = Bio::Seq->new(-display_id => 'Sequence2', -seq =>
> 'AGGATAGGGCAGATAGGTAGCGCCGATTTACGTGATACGCG');
> my $factory = Bio::Tools::Run::StandAloneBlast->new(program => 'blastn',
> outfile => 'bl2seq.out');
> $factory->bl2seq($seq1, $seq2);
> my $str = Bio::AlignIO->new(-file => 'bl2seq.out', -format => 'bl2seq');
> my $aln = $str->next_aln(); ###### Warning issued here ######
> foreach my $seq ($aln->each_seq()) {
>     print $seq->seq(), "\n";
> }
>
> How am I supposed to provide program name ?
>
> Thank you for your help.
>
> Alex
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From nathanhaigh at ukonline.co.uk  Wed Feb 18 10:44:43 2004
From: nathanhaigh at ukonline.co.uk (Nathan Haigh)
Date: Wed Feb 18 10:51:09 2004
Subject: [Bioperl-l] gap characters in SimpleAlign objects
In-Reply-To: <Pine.LNX.4.50.0402180914410.8084-100000@tenero.duhs.duke.edu>
Message-ID: <!~!UENERkVCMDkAAQACAAAAAAAAAAAAAAAAABgAAAAAAAAA+EsXQZcrCEGeBpZF7/IE7sKAAAAQAAAAP/1+UKqM80qkJW+4JFA3UwEAAAAA@ukonline.co.uk>

OK, I think I've figured out where my confusion lies:
I thought that the default output format from clustalw would be clustalw
format, but as it turns out it's gcg (MSF), which has '.' as its gap
character.
I think I've also figured out the problem (well, at least part of it!):
There are a couple of lines in clustalw.pm that retrieve the alignment that
was generated by clustalw, but I think these may not have been updated
since the addition of more alignment formats to AlignIO. As a result it
defaults to MSF unless you have specified 'phylip' as the output format in
the alignment factory parameters @params.
So I have replaced the following lines:

	my $format= $output =~/phylip/i ? "phylip" : "MSF";
	my $in  = Bio::AlignIO->new(-file => $outfile, '-format' => $format);
with
	$self->output('MSF') if !$self->output();
	my $in  = Bio::AlignIO->new(-file => $outfile, '-format' => $self->output());

This leaves the default file format as MSF (although I think clustalw
would be a more obvious choice) but allows the user to specify any of the
other supported formats.
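
To make the difference concrete, here is a small stand-alone sketch of the
two ways of choosing the format. The function names old_format and
new_format are made up, and the code only mimics the selection logic of the
lines above; it does not touch clustalw.pm or AlignIO.

```perl
use strict;
use warnings;

# Old behaviour as quoted above: anything that is not phylip is read as MSF.
sub old_format {
    my ($output) = @_;
    return ($output // '') =~ /phylip/i ? 'phylip' : 'MSF';
}

# Proposed behaviour: keep whatever the user asked for, default to MSF.
sub new_format {
    my ($output) = @_;
    return $output ? $output : 'MSF';
}

for my $requested (undef, 'phylip', 'clustalw') {
    printf "%-8s old=%-6s new=%s\n", $requested // '(none)',
        old_format($requested), new_format($requested);
}
```

With the old logic a request for 'clustalw' output is still read back as
MSF, which is the mismatch described above.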

I will then use $aln->map_chars('\.','-') to change the gap characters
around.

The problem with this is that if you do not specify an output format, the
default MSF is used (which uses '.' as gaps) and then when you create an
output alignment stream in fasta format you get '.' as gaps (I'm pretty sure
fasta format requires '-' as the gap symbol). Therefore, would it not be
safer to check for the correct gap symbol in the fasta AlignIO module?


Thanks
Nathan


> -----Original Message-----
> From: Jason Stajich [mailto:jason@cgt.duhs.duke.edu]
> Sent: 18 February 2004 14:15
> To: Ewan Birney
> Cc: Nathan Haigh; bioperl-l@bioperl.org
> Subject: Re: [Bioperl-l] gap characters in SimpleAlign objects
>
> It's easier than this - not sure why gaps are becoming '.' but I had to
> work around this in other places as well Coordinate::Pair.
>  $aln->map_chars('\.','-')
>
> --jason
>
> On Wed, 18 Feb 2004, Ewan Birney wrote:
>
> > On Wed, 18 Feb 2004, Nathan Haigh wrote:
> >
> > > I've been using the clustalw module for creating alignment, and I've just
> > > realised that when you output the alignment the gap character is a "." not a
> > > "-".
> > > This is most annoying because I am adding support to this module for
> > > generating trees via clustalw, and clustalw removes these "." characters. Is
> > > there a method for changing these gap characters to "-". I have seen the
> > > gap_char method in the SimpleAlign module, but this seems only to designate
> > > a particular character as a gap character, and does not actually change the
> > > character.
> > >
> > > Any ideas on how to do this substitution, and where in BioPerl does this
> > > assignment get made in the first place, since the default gap char for
> > > clustalw output is "-" not "."
> >
> > To fix (short term): Loop over the sequences making a new SimpleAlign
> > object with LocatableSeqs and s/\./-/ on the seq strings
> >
> >
> >
> > How are you reading in Clustalw alignments? The Bio::AlignIO::clustalw
> > doesn't touch the gap characters:
> >
> >
> >     foreach my $name ( sort { $order{$a} <=> $order{$b} } keys %alignments ) {
> >         if( $name =~ /(\S+):(\d+)-(\d+)/ ) {
> >             ($sname,$start,$end) = ($1,$2,$3);
> >         } else {
> >             ($sname, $start) = ($name,1);
> >             my $str  = $alignments{$name};
> >             $str =~ s/[^A-Za-z]//g;
> >             $end = length($str);
> >         }
> >         my $seq = new Bio::LocatableSeq('-seq'   => $alignments{$name},
> >                                          '-id'    => $sname,
> >                                          '-start' => $start,
> >                                          '-end'   => $end);
> >
> >
> >
> > ($alignments{$name} has no regex put on it earlier either)
> >
> >
> >
> > >
> > > Thanks
> > > Nathan
> > >
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l@portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > >
> >
> > -----------------------------------------------------------------
> > Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
> > <birney@ebi.ac.uk>.
> > -----------------------------------------------------------------
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu


From jegreenwood25 at hotmail.com  Wed Feb 18 10:53:21 2004
From: jegreenwood25 at hotmail.com (Jonathan Greenwood)
Date: Wed Feb 18 10:59:37 2004
Subject: [Bioperl-l] Clickable Glyphs...
Message-ID: <LAW11-F21zF1XgBzgr30005cba2@hotmail.com>

Hi, I've included my code in this email. What I'm trying to do is render a
Genbank file as a PNG file, and I need to make each glyph clickable (I'm
also displaying this page online). Any help with the new changes to
Bio::Graphics::Panel would be appreciated. Many thanks...

Sincerely,

Jonathan Greenwood
email: jonathon@mgcheo.med.uottawa.ca

code:
#! /usr/local/bin/perl -wT

use strict;
use Bio::Graphics;
use Bio::SeqIO;
use Bio::SeqFeature::Generic;
use CGI;
use CGI::Pretty;

my $file = 'x65306.gb';
my $io = Bio::SeqIO->new(-file=>$file);
my $seq = $io->next_seq;
my $wholeseq = Bio::SeqFeature::Generic->new(-start=>1,
                                             -end=>$seq->length);
my @features = $seq->all_SeqFeatures;
my $q = new CGI;

# sort features by their primary tags
my %sorted_features;
for my $f (@features) {
  my $tag = $f->primary_tag;
  push @{$sorted_features{$tag}},$f;
}

print $q->header( 'text/html' );
print $q->start_html('A Vector Rendering');

my $panel = Bio::Graphics::Panel->new(-length      => $seq->length,
				      -width       => 1000,
				      -pad_left    => 10,
				      -pad_right   => 10,
				      -key_color   => 'white',
				      -key_spacing => 15,
				      -key_style   => 'bottom',
				      -spacing     => -0.25,
				      -box_subparts => 'true'
				      );

my ($url,$map,$mapname) = $panel->image_and_map(-root => '/webfiles/cgi-bin',
						-url  => '/tmpimages',
					       );

$panel->add_track($wholeseq,
		  -glyph  => 'arrow',
		  -bump   => +1,
		  -double => 1,
		  -tick   => 2
	          );

$panel->add_track($wholeseq,
		  -glyph   => 'generic',
		  -bgcolor => 'purple',
		  -height  => 12,
		  -key     => 'Whole Sequence',
		  -title   => 'Whole Sequence'
		  );

# special feature
if ($sorted_features{CDS}) {
  $panel->add_track($sorted_features{CDS},
		    -glyph          => 'transcript2',
		    -bgcolor        => 'orange',
		    -bump           =>  +1,
		    -height         => 12,
		    -key            => 'CDS',
		    -label          => \&gene_label,
		    -title          => 'CDS',
		    -link           => 'feature1.html#CDS'
		    );
  delete $sorted_features{'CDS'};
}

#general case
my @colors = qw(wheat blue yellow green cyan chartreuse magenta gray);
my $idx    = 0;
for my $tag (sort keys %sorted_features) {
my $features = $sorted_features{$tag};
$panel->add_track($features,
		  -glyph        =>  'generic',
		  -bgcolor      =>  $colors[$idx++ % @colors],
		  -fgcolor      =>  'black',
		  -font2color   => 'red',
		  -key          => "${tag}s",
		  -bump         => +1,
		  -height       => 12,
                  -label        => \&gene_label,
		  -description  => \&generic_description,
		  -title        => \&gene_label,
		  -link         => 'feature1.html#$tag',
		  );
}

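print $q->img({-src=>$url,-usemap=>"#$mapname"});
print $map;   # $map already holds the image map HTML; it is not a CGI method
# The PNG itself is served from $url, so it is not printed into the HTML page.

print $q->end_html;   # CGI provides end_html, not exit_html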

exit;

  sub gene_label {
     my $feature = shift;
     my @notes;
     foreach (qw(product gene)) {
       next unless $feature->has_tag($_);
       @notes = $feature->each_tag_value($_);
       last;
    }
    $notes[0];
  }

  sub generic_description {
    my $feature = shift;
    my $description;
    foreach ($feature->all_tags) {
      my @values = $feature->each_tag_value($_);
      $description .= $_ eq 'note' ? "@values" : "$_=@values; ";
    }
    $description =~ s/; $//; # get rid of last
    $description;
  }


From pm66 at nyu.edu  Wed Feb 18 11:41:46 2004
From: pm66 at nyu.edu (Philip MacMenamin)
Date: Wed Feb 18 11:51:06 2004
Subject: [Bioperl-l] Bio::DB::GFF::Aggregator problem, new wormbase models.
In-Reply-To: <BC57ED88.BB42%todd.harris@cshl.org>
References: <BC57ED88.BB42%todd.harris@cshl.org>
Message-ID: <200402181644.i1IGii05005378@mx2.nyu.edu>

Thanks very much Todd...

What version of wormbase are you using? I am using WS118. I am not able to 
get this aggregator to return me the UTR bits. 

For instance, I connect to the db using your agg:
my $db = new Bio::DB::GFF(-adaptor=>'dbi::mysqlopt',
#			  -dsn=>'dbi:mysql:wormbase110;host=localhost',
			  -dsn=>'dbi:mysql:wormbase118;host=localhost',
			  -user=>$user,
			  -pass=>$passwd,
			  -aggregator => [qw(wormabse_cds{coding_exon,5_UTR,3_UTR/CDS})],
			 ) or die();

#...ask for a segment in the AH6.5 region:
my $panelSeg = $db->segment(CDS=>'AH6.5');

#...make a searchSegment a little larger to pick everything:
my $searchSeg =$db->segment($panelSeg->sourceseq, 
($panelSeg->abs_start-1000), ($panelSeg->abs_end+1000));

#and, then get the features that wormabse_cds pulls:
my @all_transcripts = $searchSeg->features('wormabse_cds');
foreach my $transcript ( @all_transcripts )
{
    print $transcript, $transcript->abs_start,' ', $transcript->abs_end,"\n";
}

I assume this is the right way to do things. But, the problem is that this 
does not get my UTRs. This does:
my @UTRs = $searchSeg->features('UTR:UTR');
foreach my $UTR (@UTRs)
{
    print $UTR," ",$UTR->start," ",$UTR->end,"\n";
}
But these are obviously not aggregated.

This is the output of the little script above:

UTR:UTR(5_UTR:AH6.5) 9524078 9524086
UTR:UTR(3_UTR:AH6.5) 9525782 9526248
...
wormabse_cds:curated(AH6.5)9524087 9525781

And you can see that the wormabse_cds does not overlap with the UTRs.
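
To spell out what I expected instead: the two UTRs sit directly next to the
CDS, so an aggregate built from all three parts should cover the combined
range. A quick plain-Perl check with the coordinates copied from the output
above (the helper name extent is made up, and this assumes a working
aggregate simply spans its parts):

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Coordinates copied from the output above for AH6.5.
my %parts = (
    '5_UTR'        => [ 9524078, 9524086 ],
    'wormabse_cds' => [ 9524087, 9525781 ],
    '3_UTR'        => [ 9525782, 9526248 ],
);

# Smallest start and largest end over a list of [start, end] pairs.
sub extent {
    my @ranges = @_;
    return ( min(map { $_->[0] } @ranges), max(map { $_->[1] } @ranges) );
}

my ($start, $end) = extent(values %parts);
print "expected aggregate: $start $end\n";
```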

Sorry about this. I have been trying all sorts of things... it just keeps on 
missing the UTRs in the new wormbase models. And we can't re-sync the site 
here till this works.

Philip.
On Tuesday 17 February 2004 05:10 pm, Todd Harris wrote:
> Hi Phillip -
>
> You need to aggregate the separate parts of the CDS.  Create a wormbase_cds
> (or whatever you wish to call it), aggregating the following features using
> the CDS group: coding_exon,5_UTR,3_UTR.
>
> The following stanza should do the trick.
>
> $dbgff = Bio::DB::GFF->new(-adaptor => 'dbi::mysql',
>           -dsn     => 'dbi:mysql:database=your_database;host=your_host',
>           -aggregators => [qw(wormabse_cds{coding_exon,5_UTR,3_UTR/CDS})],
>           -user    => 'your_username',
>           -pass    => 'your_dbgff_pass');
>
> This should do the trick for properly aggregating genes under the new
> WormBase CDS class.
>
> Todd Harris
From laurichj at bioinfo.ucr.edu  Wed Feb 18 11:49:50 2004
From: laurichj at bioinfo.ucr.edu (Josh Lauricha)
Date: Wed Feb 18 11:56:06 2004
Subject: [Bioperl-l] Clickable Glyphs...
In-Reply-To: <LAW11-F21zF1XgBzgr30005cba2@hotmail.com>
References: <LAW11-F21zF1XgBzgr30005cba2@hotmail.com>
Message-ID: <20040218164950.GA3094@bioinfo.ucr.edu>

Not entirely sure what you are trying to do, but the way I've been
doing the same sort of thing was with two scripts. The first generates
the HTML, the second generates the PNG. To do this you create a panel
as if you were going to make an image in both. But for the HTML you do:
    @boxes = $panel->boxes()
rather than $panel->png().
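
A minimal sketch of the HTML half, in plain Perl so it runs without
Bio::Graphics. It assumes each entry returned by boxes() starts with the
feature followed by the four pixel coordinates x1, y1, x2, y2; plain strings
stand in for the feature objects, and the link pattern and the helper name
image_map are made up.

```perl
use strict;
use warnings;

# Stand-in for the list returned by $panel->boxes(): this sketch assumes each
# entry starts with [ feature, x1, y1, x2, y2 ]; plain strings replace the
# feature objects here.
my @boxes = ( [ 'geneA', 10, 20, 110, 32 ], [ 'geneB', 150, 20, 260, 32 ] );

# Build the <map> HTML; the link pattern feature.html#NAME is made up.
sub image_map {
    my ($mapname, @boxes) = @_;
    my @areas = map {
        my ($name, $x1, $y1, $x2, $y2) = @$_;
        qq(<area shape="rect" coords="$x1,$y1,$x2,$y2" href="feature.html#$name" alt="$name">);
    } @boxes;
    return join "\n", qq(<map name="$mapname">), @areas, '</map>';
}

print image_map('glyphs', @boxes), "\n";
```

The page then needs an img tag whose usemap attribute names the same map
and whose src points at the script that emits the PNG.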

You could do boxes() and png() on the same object if you don't mind
having temp images lying around (typically insecure). Or have a switch
argument passed via GET telling it to do the HTML or the PNG.

My scripts are for a web-based tree displayer (kind of like forester),
a sequence displayer that highlights the glyph you click on in the
sequence (changes the text color) and a blast results image with
clickable HSPs. All done basically the same way (well, the tree is done
with graphics modules not yet in bioperl).

On Wed 02/18/04 10:53, Jonathan Greenwood wrote:
> Hi, I've submitted my code with the email, what I'm trying to do is to 
> render a Genbank file as a png file, I need to make each glyph 
> clickable(I'm also displaying this page online)...any help with the new 
> changes to Bio::Graphics::Panel would be appreciated...many thanks...
> 
> Sincerely,
> 
> Jonathan Greenwood
> email: jonathon@mgcheo.med.uottawa.ca
> 
> code:
> #! /usr/local/bin/perl -wT
> 
> use strict;
> use Bio::Graphics;
> use Bio::SeqIO;
> use Bio::SeqFeature::Generic;
> use CGI;
> use CGI::Pretty;
> 
> my $file = 'x65306.gb';
> my $io = Bio::SeqIO->new(-file=>$file);
> my $seq = $io->next_seq;
> my $wholeseq = Bio::SeqFeature::Generic->new(-start=>1,
>                                                                     
> -end=>$seq->length);
> my @features = $seq->all_SeqFeatures;
> my $q = new CGI;
> 
> # sort features by their primary tags
> my %sorted_features;
> for my $f (@features) {
>  my $tag = $f->primary_tag;
>  push @{$sorted_features{$tag}},$f;
> }
> 
> print $q->header( 'text/html' );
> print $q->start_html('A Vector Rendering');
> 
> my $panel = Bio::Graphics::Panel->new(-length      => $seq->length,
> 				      -width       => 1000,
> 				      -pad_left    => 10,
> 				      -pad_right   => 10,
> 				      -key_color   => 'white',
> 				      -key_spacing => 15,
> 				      -key_style   => 'bottom',
> 				      -spacing     => -0.25,
> 				      -box_subparts => 'true'
> 				      );
> 
> my ($url,$map,$mapname) = $panel->image_and_map(-root => 
> '/webfiles/cgi-bin',
> 						-url  => '/tmpimages',
> 					       );
> 
> $panel->add_track($wholeseq,
> 		  -glyph  => 'arrow',
> 		  -bump   => +1,
> 		  -double => 1,
> 		  -tick   => 2
> 	          );
> 
> $panel->add_track($wholeseq,
> 		  -glyph   => 'generic',
> 		  -bgcolor => 'purple',
> 		  -height  => 12,
> 		  -key     => 'Whole Sequence',
> 		  -title   => 'Whole Sequence'
> 		  );
> 
> # special feature
> if ($sorted_features{CDS}) {
>  $panel->add_track($sorted_features{CDS},
> 		    -glyph          => 'transcript2',
> 		    -bgcolor        => 'orange',
> 		    -bump           =>  +1,
> 		    -height         => 12,
> 		    -key            => 'CDS',
> 		    -label          => \&gene_label,
> 		    -title          => 'CDS',
> 		    -link           => 'feature1.html#CDS'
> 		    );
>  delete $sorted_features{'CDS'};
> }
> 
> #general case
> my @colors = qw(wheat blue yellow green cyan chartreuse magenta gray);
> my $idx    = 0;
> for my $tag (sort keys %sorted_features) {
> my $features = $sorted_features{$tag};
> $panel->add_track($features,
> 		  -glyph        =>  'generic',
> 		  -bgcolor      =>  $colors[$idx++ % @colors],
> 		  -fgcolor      =>  'black',
> 		  -font2color   => 'red',
> 		  -key          => "${tag}s",
> 		  -bump         => +1,
> 		  -height       => 12,
>                  -label        => \&gene_label,
> 		  -description  => \&generic_description,
> 		  -title        => \&gene_label,
> 		  -link         => 'feature1.html#$tag',
> 		  );
> }
> 
> print $q->img({-src=>$url,-usemap=>"#$mapname"});
> print $q->$map;
> print $q->($panel->png);
> 
> print $q->exit_html;
> 
> exit;
> 
>  sub gene_label {
>     my $feature = shift;
>     my @notes;
>     foreach (qw(product gene)) {
>       next unless $feature->has_tag($_);
>       @notes = $feature->each_tag_value($_);
>       last;
>    }
>    $notes[0];
>  }
> 
>  sub generic_description {
>    my $feature = shift;
>    my $description;
>    foreach ($feature->all_tags) {
>      my @values = $feature->each_tag_value($_);
>      $description .= $_ eq 'note' ? "@values" : "$_=@values; ";
>    }
>    $description =~ s/; $//; # get rid of last
>    $description;
>  }
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> 

-- 

----------------------------
| Josh Lauricha            |
| laurichj@bioinfo.ucr.edu |
| Bioinformatics, UCR      |
|--------------------------|
From steve_chervitz at affymetrix.com  Wed Feb 18 15:16:13 2004
From: steve_chervitz at affymetrix.com (Steve Chervitz)
Date: Wed Feb 18 15:22:24 2004
Subject: [Bioperl-l] searchio scripts
In-Reply-To: <200402181457.24179.lstein@cshl.edu>
References: <ADEFIHIJAHBGLBNJCIADMEAECEAA.rrouse@biomail.ucsd.edu>
	<1EC43881-61FD-11D8-A988-000A95765236@affymetrix.com>
	<200402181457.24179.lstein@cshl.edu>
Message-ID: <4C662059-624F-11D8-A988-000A95765236@affymetrix.com>

Good tip, Lincoln. But regardless, the change in IO::_readline's 
behavior means that any script that depended on its pre-1.303 
default-to-STDIN behavior is now broken. This could be a lot since the 
code in examples and scripts exploited this. I received three messages 
about it yesterday, so I fear there could be many others out there 
scratching their heads, especially considering that the 
default-to-STDIN behavior has been around since the early days of 
SeqIO. From the SeqIO docs:

>    $seqIO = Bio::SeqIO->new(-format => $format);
>   ....
> If neither a filehandle nor a filename is specified, then the module
> will read from the @ARGV array or STDIN, using the familiar <>
> semantics.

Relying on a default behavior of a dependent module (Root::IO) always 
troubled me. It seems a better design to make it explicit in your 
script where you expect your input to come from. Typing "-fh=>\*ARGV" 
or putting an @ARGV loop around your script is extra work, but I think 
it's a change for the better. (BTW, this situation also exposes a 
weakness in the test code which didn't test the default _readline 
behavior -- I guess doing this is difficult within the Perl test 
framework).

The issue remains: What to do about backwards compatibility? Some 
options:

1. Fix all of the scripts, examples, POD docs, bptutorial etc. to not 
rely on default STDIN/@ARGV reading behavior of _readline and release 
these as part of bioperl-1.4.1.

2. Revert _readline to its old behavior and add a new method in IO.pm 
that has the new behavior (_readline2). Update any module/script that 
needs the new _readline behaviour to use _readline2.

#2 is the backward-compatible route, but uglier from a software 
engineering perspective. #1 breaks backward compatibility. Given the 
legacy of the old _readline behaviour, I'm favoring #2. Just seems more 
politic. We could still update the scripts and docs to discourage the 
old _readline behaviour. Thoughts?
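
For anyone who wants to see the difference in isolation, here is a stripped
down sketch of the two behaviours in plain Perl. The function names
readline_old and readline_new are made up and the bodies are simplified
stand-ins for the two revisions quoted further down, not the actual
Root::IO code.

```perl
use strict;
use warnings;

# Simplified stand-ins for the two revisions of _readline quoted in this thread:
# the old one falls back to ARGV (files on the command line, else STDIN),
# the new one returns nothing when no filehandle was set.
sub readline_old { my ($fh) = @_; $fh ||= \*ARGV; return scalar <$fh> }
sub readline_new { my ($fh) = @_; return unless $fh; return scalar <$fh> }

# An in-memory filehandle keeps the demo from waiting on STDIN.
open my $in, '<', \"ACGT\n" or die "cannot open in-memory file: $!";
print "with a handle:    ", readline_new($in);
print "without a handle: ", defined readline_new() ? "a line\n" : "nothing\n";
```

Calling readline_old() with no argument would sit and wait on STDIN, which
is the same symptom as the hanging SeqFeature.t test mentioned below.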

Steve

On Feb 18, 2004, at 4:57 AM, Lincoln Stein wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Or do this:
>
> 	my $in = Bio::SearchIO->new(-format=>'blast',-fh=>\*ARGV);
> 	while (my $result = $in->next_result()) {
> 		...
> 	}
>
> That might even be easier.
>
> Lincoln
>
> On Wednesday 18 February 2004 12:27 pm, Steve Chervitz wrote:
>> Looks like there was a change in the Root::IO.pm module that
>> affects the way these scripts process command-line arguments. As of
>> bioperl-1.303, the SearchIO::blast module appears to be unable to
>> read data from STDIN or files listed in @ARGV. This affects the
>> scripts in examples/searchio and scripts/searchio.
>>
>> As a workaround, I'd recommend you iterate over @ARGV in your
>> script and initialize the SearchIO object using the -file option to
>> new(), as in:
>>
>> while (my $file = shift @ARGV) {
>>      my $in = Bio::SearchIO->new( -format => 'blast',
>>                                   -file => $file
>>                                 );
>>      while ( my $result = $in->next_result() ) {
>>          # process result...
>>      }
>> }
>>
>> As far as tracking down the cause, I've pinpointed the following
>> change in Bio::Root::IO::_readline():
>>
>>      my $fh = $self->_fh or return;   # revision 1.50
>> (bioperl-1.303)
>>
>> formerly this was:
>>
>>      my $fh = $self->_fh || \*ARGV;   # revision 1.49
>> (bioperl-1.302)
>>
>> This also appears to break SeqIO reading from STDIN. Try executing
>> this at the top-level distribution dir for the 1.302 and 1.303
>> releases:
>>
>>      perl -I. ./scripts/seq/translate_seq.PLS -format fasta <
>> t/data/dna1.fa
>>
>> According to Lincoln's commit log, the Root::IO::_readline() change
>> was necessary to get the GFF, SeqFeature, and Registry regression
>> tests working. I tested these tests with the 1.49 version of IO.pm
>> and the only one that was affected was SeqFeature.t. Specifically,
>> test #6 which calls SeqFeature::Generic::gff_string() hangs and
>> waits for input before proceeding. I'm not sure why this is...
>> (getting late).
>>
>> BTW, platforms tested: Perl 5.6.1 and 5.8.0 on Linux (RH9) and Perl
>> 5.8.1-RC3 on MacOS X (10.3.2).
>>
>> Steve
>>
>> On Feb 17, 2004, at 3:14 PM, Richard Rouse wrote:
>>> I recent installed bioperl-1.4 and am having problems with the
>>> blast report
>>> parsers in /examples/searchio/
>>>
>>>
>>> When I run:
>>> perl hitwriter.pl blastreport
>>> I get:
>>>
>>> Using SearchIO->new()
>>>
>>> 0 Blast report(s) processed.
>>> Output sent to file: >hitwriter.out
>>>
>>> I get the same result with rawwriter.pl, hspwriter.pl and
>>> custom_writer.pl
>>> although the htmlwriter.pl and the blast_example.pl work fine.
>>>
>>> Has anyone else encountered this problem and figured out how to
>>> fix it?
>>>
>>> Thanks,
>>>
>>> Richard
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l@portal.open-bio.org
>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l@portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>
> - --
> Lincoln D. Stein
> Cold Spring Harbor Laboratory
> 1 Bungtown Road
> Cold Spring Harbor, NY 11724
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.1 (GNU/Linux)
>
> iD8DBQFAM2E00CIvUP7P+AkRAurTAJ9gwb4Os0M5uDWhlE40JphLRIAG+gCfQ5Ji
> zXHLGwtfDAB2Np2nKBZkuw0=
> =IsKs
> -----END PGP SIGNATURE-----
>

From barry.moore at genetics.utah.edu  Wed Feb 18 16:46:22 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Wed Feb 18 16:52:41 2004
Subject: [Bioperl-l] Pretty Output of Alignments
Message-ID: <4033DD2E.8050304@genetics.utah.edu>

Are there modules in Bioperl that produce shaded and colored output of 
multiple sequence alignments in various formats (PS, RTF, HTML), similar 
to what can be made with tools like BOXSHADE, TEXshade, and Alscript?  I'm 
pretty sure that the answer is no, but thought I'd check to be sure.  I 
know that there are Bioperl wrappers for the EMBOSS and PISE versions 
of these tools, but I wanted something that could be tweaked for more 
control over the output than those allow.


From barry.moore at genetics.utah.edu  Wed Feb 18 16:49:01 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Wed Feb 18 16:55:17 2004
Subject: [Bioperl-l] Pretty Output of Alignments
Message-ID: <4033DDCD.4000806@genetics.utah.edu>

Oops!  Forgot to sign that post.

Are there modules in Bioperl that produce shaded and colored output of 
multiple sequence alignments in various formats (PS, RTF, HTML), similar 
to what can be made with tools like BOXSHADE, TEXshade, and Alscript?  I'm 
pretty sure that the answer is no, but thought I'd check to be sure.  I 
know that there are Bioperl wrappers for the EMBOSS and PISE versions 
of these tools, but I wanted something that could be tweaked for more 
control over the output than those allow.

Barry Moore
Dept. of Human Genetics
University of Utah


From qdong at genome.stanford.edu  Wed Feb 18 17:09:27 2004
From: qdong at genome.stanford.edu (Stan Dong)
Date: Wed Feb 18 17:18:20 2004
Subject: [Bioperl-l] SGD GFF3 file available soon
Message-ID: <1DE37948-625F-11D8-89C8-000A956A0A36@genome.stanford.edu>

Hi,

I am a programmer at the Saccharomyces Genome Database (SGD, 
http://www.yeastgenome.org/). I am developing a flat file in GFF3 
format (http://song.sourceforge.net/gff3-jan04.shtml) to represent the 
sequence features of the yeast genome, and it will soon be released on 
our ftp site. This should be very useful because quite a few open 
source software packages, such as Gbrowse and Chado, can take this file 
format as input.

I would like comments from people who are interested in doing similar 
things and from those who have good or not-so-good experiences with 
GFF3 to share. For me, it took a while to get the specification done, 
especially making the third column (type) fully compatible with the 
Sequence Ontology (SO). One thing I like about GFF3 is the last column 
(attributes), where you can put all kinds of useful information, such 
as, in our case, GO annotation and a nice description of a feature. An 
example SGD GFF3 file can be viewed here:

ftp://genome-ftp.stanford.edu/pub/people/curator/GFF3Example.txt

Thanks,

Stan Dong
Programmer, SGD

From jason at cgt.duhs.duke.edu  Wed Feb 18 17:30:45 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb 18 17:37:07 2004
Subject: [Bioperl-l] SGD GFF3 file available soon
In-Reply-To: <1DE37948-625F-11D8-89C8-000A956A0A36@genome.stanford.edu>
References: <1DE37948-625F-11D8-89C8-000A956A0A36@genome.stanford.edu>
Message-ID: <Pine.LNX.4.50.0402181727010.8084-100000@tenero.duhs.duke.edu>

Stan -

I am very much looking forward to this - up till now I have had to
reformat the .tab file myself just to get a working GFF3 where the
Bio::DB::GFF aggregators would behave properly.  It looks like what you
have works for the gene/CDS sets, and I will give it a try in my
db/scripts.  I will be very happy to see it all consolidated as you have
done, so nice work.

-jason
On Wed, 18 Feb 2004, Stan Dong wrote:

> Hi,
>
> I am a programmer at Saccharomyces Genome Database ( SGD,
> http://www.yeastgenome.org/ ). I am working on developing a flat file
> in GFF3 format ( http://song.sourceforge.net/gff3-jan04.shtml ) to
> represent sequence features of yeast genome and it will soon be
> released on our ftp site. This is very useful because quite a few open
> source softwares can take this file format as input such as Gbrowse,
> Chado etc.
>
> I would like comments from people who are interested in doing similar
> things and those who have good/not-so-good experience on GFF3 to share
> with. For me, it took a while to get the specification done especially
> make the third column (type) fully compatible with Sequence Ontology
> (SO). One thing I liked about GFF3 is the last column (attributes)
> where you can put all kinds of useful information such as in our case
> GO annotation and a nice description of a feature. An example file of
> SGD GFF3 can be viewed here.
>
> ftp://genome-ftp.stanford.edu/pub/people/curator/GFF3Example.txt
>
> Thanks,
>
> Stan Dong
> Programmer, SGD
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From rrouse at biomail.ucsd.edu  Wed Feb 18 18:09:44 2004
From: rrouse at biomail.ucsd.edu (Richard Rouse)
Date: Wed Feb 18 18:15:52 2004
Subject: [Bioperl-l] searchio scripts
In-Reply-To: <4C662059-624F-11D8-A988-000A95765236@affymetrix.com>
Message-ID: <ADEFIHIJAHBGLBNJCIADOEAOCEAA.rrouse@biomail.ucsd.edu>

I tried Steve's suggestion by putting his loop over @ARGV right above:
  while ( my $blast = $in->next_result() ) {

Then putting another } at the end of the script.

Doing this and then running a large blast output file, I got:

Using SearchIO->new()

Report 1: Bio::Search::Result::BlastResult=HASH(0x87d7fb8)

Report 2: Bio::Search::Result::BlastResult=HASH(0x8dfea60)

------------- EXCEPTION  -------------
MSG: Trouble in ResultTableWriter::_set_row_data_func() eval:
------------- EXCEPTION  -------------
MSG: Can't get identical or conserved data: no data.
STACK Bio::Search::Hit::GenericHit::matches
../..//Bio/Search/Hit/GenericHit.pm:852
STACK Bio::Search::Hit::GenericHit::frac_identical
../..//Bio/Search/Hit/GenericHit.pm:1043
STACK (eval) (eval 310):1
STACK Bio::SearchIO::Writer::ResultTableWriter::__ANON__
../..//Bio/SearchIO/Writer/ResultTableWriter.pm:327
STACK Bio::SearchIO::Writer::HitTableWriter::to_string
../..//Bio/SearchIO/Writer/HitTableWriter.pm:267
STACK Bio::SearchIO::write_result ../..//Bio/SearchIO.pm:321
STACK Bio::SearchIO::blast::write_result ../..//Bio/SearchIO/blast.pm:1495
STACK toplevel new.mod.hitwriter.pl:106

--------------------------------------


STACK Bio::SearchIO::Writer::ResultTableWriter::__ANON__
../..//Bio/SearchIO/Writer/ResultTableWriter.pm:329
STACK Bio::SearchIO::Writer::HitTableWriter::to_string
../..//Bio/SearchIO/Writer/HitTableWriter.pm:267
STACK Bio::SearchIO::write_result ../..//Bio/SearchIO.pm:321
STACK Bio::SearchIO::blast::write_result ../..//Bio/SearchIO/blast.pm:1495
STACK toplevel new.mod.hitwriter.pl:106

I tried Lincoln's suggestion as well. In this case I added:

my $in = Bio::SearchIO->new(-format=>'blast',-fh=>\*ARGV);

above

while ( my $blast = $in->next_result() ) {


With this change the script just runs and produces no results.

By the way, I am running SuSE Linux 9.0 with Perl 5.8.1.

Thanks,
Richard

-----Original Message-----
From: Steve Chervitz [mailto:steve_chervitz@affymetrix.com]
Sent: Wednesday, February 18, 2004 12:16 PM
To: Lincoln Stein
Cc: Richard Rouse; Bioperl
Subject: Re: [Bioperl-l] searchio scripts


Good tip, Lincoln. But regardless, the change in IO::_readline's
behavior means that any script that depended on its pre-1.303
default-to-STDIN behavior is now broken. This could be a lot since the
code in examples and scripts exploited this. I received three messages
about it yesterday, so I fear there could be many others out there
scratching their heads, especially considering that the
default-to-STDIN behavior has been around since the early days of
SeqIO. From the SeqIO docs:

>    $seqIO = Bio::SeqIO->new(-format => $format);
>   ....
> If neither a filehandle nor a filename is specified, then the module
> will read from the @ARGV array or STDIN, using the familiar <>
> semantics.

Relying on a default behavior of a dependent module (Root::IO) always
troubled me. It seems a better design to make it explicit in your
script where you expect your input to come from. Typing "-fh=>\*ARGV"
or putting an @ARGV loop around your script is extra work, but I think
it's a change for the better. (BTW, this situation also exposes a
weakness in the test code which didn't test the default _readline
behavior -- I guess doing this is difficult within the Perl test
framework).

The issue remains: What to do about backwards compatibility? Some
options:

1. Fix all of the scripts, examples, POD docs, bptutorial etc. to not
rely on default STDIN/@ARGV reading behavior of _readline and release
these as part of bioperl-1.4.1.

2. Revert _readline to its old behavior and add a new method in IO.pm
that has the new behavior (_readline2). Update any module/script that
needs the new _readline behaviour to use _readline2.

#2 is the backward-compatible route, but uglier from a software
engineering perspective. #1 breaks backward compatibility. Given the
legacy of the old _readline behaviour, I'm favoring #2. Just seems more
politic. We could still update the scripts and docs to discourage the
old _readline behaviour. Thoughts?

Steve


From barry.moore at genetics.utah.edu  Wed Feb 18 19:30:23 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Wed Feb 18 19:36:34 2004
Subject: [Bioperl-l] Bio::Graphics::Browser::Markup
Message-ID: <4034039F.1080107@genetics.utah.edu>

Todd Harris recently pointed me to Bio::Graphics::Browser::Markup, and I 
can't find the damn thing anywhere.  I've looked in the bioperl CVS, 
bioperl online docs and in my local installation.  Google searches don't 
turn up anything, but suggest that I should be finding a 
Bio/Graphics/Browser/Markup.pm module - which I don't.  Someone please 
enlighten the poor simple boy from Utah.

Barry


From jason at cgt.duhs.duke.edu  Wed Feb 18 20:36:05 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Wed Feb 18 20:42:22 2004
Subject: [Bioperl-l] Bio::Graphics::Browser::Markup
In-Reply-To: <4034039F.1080107@genetics.utah.edu>
References: <4034039F.1080107@genetics.utah.edu>
Message-ID: <Pine.LNX.4.50.0402182031300.12117-100000@tenero.duhs.duke.edu>


Hey Barry -

It is part of Gbrowse...
http://sourceforge.net/cvs/?group_id=27707

(the password is empty; just hit return)
cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/gmod login

cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/gmod co Generic-Genome-Browser

-jason
On Wed, 18 Feb 2004, Barry Moore wrote:

> Todd Harris recently pointed me to Bio::Graphics::Browser::Markup, and I
> can't find the damn thing anywhere.  I've looked in the bioperl CVS,
> bioperl online docs and in my local installation.  Google searches don't
> turn up anything, but suggest that I should be finding a
> Bio/Graphics/Browser/Markup.pm module - which I don't.  Someone please
> enlighten the poor simple boy from Utah.
>
> Barry
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From rkh at gene.com  Wed Feb 18 16:31:05 2004
From: rkh at gene.com (Reece Hart)
Date: Wed Feb 18 21:36:35 2004
Subject: [Bioperl-l] Bio::Prospect:: -- a Perl interface to Prospect protein
	threading
Message-ID: <1077139865.3514.687.camel@tallac>

bioperl-l gang:

We're pleased to announce the first public release of Bio::Prospect::, a
Perl API for protein fold recognition with Prospect PRO.  A
mini-manuscript describing the module is available at
http://prdownloads.sourceforge.net/prospect-if/manuscript.pdf?download. 


        Abstract
        We present Bio::Prospect::, an object-oriented Perl Application
        Programming Interface (API) to the PROSPECT protein threading
        application. The Bio::Prospect:: modules facilitate executing
        the program, parsing the results, generating a homology model
        from an alignment, preparing a model for display with RasMol,
        and reconciling multiple pairwise alignments as a single
        multiple sequence alignment. The Bio::Prospect:: modules provide
        for local and remote execution of PROSPECT via a consistent
        interface. PROSPECT results may be represented with the
        full-featured Thread class or as a space-efficient distillation
        of results with the ThreadSummary class; instances of both
        classes may be serialized for network transmission of results
        from remote execution.


LICENSE: The module is released under the  Academic Free License v. 2.0
(http://www.opensource.org/licenses/afl-2.0.php).

AVAILABILITY: The project is hosted on SourceForge
(http://www.sourceforge.net/projects/prospect-if/). A perl install
package is available through CPAN (http://search.cpan.org/~reece/).
Example scripts are included.

REQUIREMENTS: This module requires Prospect PRO
(http://www.bioinformaticssolutions.com/products/prospect.php) and
several other perl modules, all available from CPAN.


Comments, bug reports, patches, and code contributions are encouraged.

Happy Threading,
Reece Hart and David Cavanaugh

-- 
Reece Hart, Ph.D.                       rkh@gene.com, http://www.gene.com/
Genentech, Inc.                         650/225-6133 (voice), -5389 (fax)
Bioinformatics and Protein Engineering
1 DNA Way, MS-93                        http://www.in-machina.com/~reece/
South San Francisco, CA  94080-4990     reece@in-machina.com, GPG: 0x25EC91A0
From neil.saunders at unsw.edu.au  Wed Feb 18 21:55:24 2004
From: neil.saunders at unsw.edu.au (Neil Saunders)
Date: Wed Feb 18 22:01:47 2004
Subject: [Bioperl-l] Re: Bio::Prospect
In-Reply-To: <200402190238.i1J2av9T015567@portal.open-bio.org>
References: <200402190238.i1J2av9T015567@portal.open-bio.org>
Message-ID: <20040219025524.GA15515@psychro>

> We're pleased to announce the first public release of Bio::Prospect::, a
> Perl API for protein fold recognition with Prospect PRO.  A
> Comments, bug reports, patches, and code contributions are encouraged.


This looks very useful.  It would be even more so if it worked with the 
freely-available Prospect 2.0, rather than the commercial Prospect Pro.  
Any chance of this?

Neil
-- 
 School of Biotechnology and Biomolecular Sciences,
 The University of New South Wales,
 Sydney 2052,
 Australia

http://psychro.bioinformatics.unsw.edu.au/neil/index.php
From cain at cshl.org  Wed Feb 18 22:14:58 2004
From: cain at cshl.org (Scott Cain)
Date: Wed Feb 18 22:21:13 2004
Subject: [Bioperl-l] SGD GFF3 file available soon
In-Reply-To: <200402190237.i1J2av9R015567@portal.open-bio.org>
References: <200402190237.i1J2av9R015567@portal.open-bio.org>
Message-ID: <1077160498.1473.72.camel@localhost.localdomain>

Stan,

In your sample GFF, the seqid in the first column has to correspond to
some ID, usually also defined in the same GFF file.  For instance, if
the features in the GFF file are all on chromosome I, the first column
of all of those lines would have the same ID as the ID declared for
chromosome I.  For example:

I	SGD	chromosome	1	230211	.	.	.	ID=I;description=Sequence "I"
I	SGD	telomere	1	801	.	-	0	ID=TEL01L;description=I left telomeric region;db_xref=SGD:S0028862
I	SGD	repeat_family	1	62	.	-	0	ID=TEL01L-TR;name=Telomeric Repeat;description=I left telomere TG(1-3);db_xref=SGD:S0028864
...etc...
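
The rule above can be checked mechanically. Below is a minimal sketch in plain Perl, assuming tab-separated nine-column lines as in the example; undeclared_seqids is a hypothetical helper name, not part of Bioperl or Gbrowse, and the third sample row (seqid "2") is invented purely to show a failure.

```perl
use strict;
use warnings;

# Return the seqids (column 1) that no feature in the same GFF3 text
# declares through an ID= attribute (column 9).
sub undeclared_seqids {
    my ($gff_text) = @_;
    my (%declared, %used);
    for my $line (split /\n/, $gff_text) {
        next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
        my @col = split /\t/, $line;
        next unless @col == 9;             # ignore malformed lines
        $used{ $col[0] } = 1;
        $declared{$1} = 1 if $col[8] =~ /(?:^|;)ID=([^;]+)/;
    }
    my @missing = grep { !$declared{$_} } sort keys %used;
    return @missing;
}

# Two rows modelled on the example above, plus one invented row whose
# seqid "2" is never declared as an ID.
my $gff = join "\n",
    join("\t", 'I', 'SGD', 'chromosome', 1, 230211, '.', '.', '.', 'ID=I'),
    join("\t", 'I', 'SGD', 'telomere',   1, 801,    '.', '-', '.', 'ID=TEL01L'),
    join("\t", '2', 'SGD', 'telomere',   1, 700,    '.', '-', '.', 'ID=TEL02L');

my @missing = undeclared_seqids($gff);
print "undeclared seqid: $_\n" for @missing;
```

Running something like this over a file before release would flag any first-column value that lacks a matching ID declaration.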

Sorry I didn't point that out before--when I looked at the Excel sheet
you sent me before, I didn't see all of it (I am too used to working
with plain text files).

Scott

-------------Original Message---------------
> Date: Wed, 18 Feb 2004 14:09:27 -0800
> From: Stan Dong <qdong@genome.stanford.edu>
> Subject: [Bioperl-l] SGD GFF3 file available soon
> To: bioperl-l@bioperl.org
> Message-ID: <1DE37948-625F-11D8-89C8-000A956A0A36@genome.stanford.edu>
> Content-Type: text/plain; charset=US-ASCII; format=flowed
> 
> Hi,
> 
> I am a programmer at Saccharomyces Genome Database ( SGD, 
> http://www.yeastgenome.org/ ). I am working on developing a flat file 
> in GFF3 format ( http://song.sourceforge.net/gff3-jan04.shtml ) to 
> represent sequence features of yeast genome and it will soon be 
> released on our ftp site. This is very useful because quite a few open 
> source softwares can take this file format as input such as Gbrowse, 
> Chado etc.
> 
> I would like comments from people who are interested in doing similar 
> things and those who have good/not-so-good experience on GFF3 to share 
> with. For me, it took a while to get the specification done especially 
> make the third column (type) fully compatible with Sequence Ontology 
> (SO). One thing I liked about GFF3 is the last column (attributes) 
> where you can put all kinds of useful information such as in our case 
> GO annotation and a nice description of a feature. An example file of 
> SGD GFF3 can be viewed here.
> 
> ftp://genome-ftp.stanford.edu/pub/people/curator/GFF3Example.txt
> 
> Thanks,
> 
> Stan Dong
> Programmer, SGD

-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain@cshl.org
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From cain at cshl.org  Wed Feb 18 22:20:48 2004
From: cain at cshl.org (Scott Cain)
Date: Wed Feb 18 22:27:03 2004
Subject: [Bioperl-l] Bio::Graphics::Browser::Markup
In-Reply-To: <200402190237.i1J2av9R015567@portal.open-bio.org>
References: <200402190237.i1J2av9R015567@portal.open-bio.org>
Message-ID: <1077160848.1477.79.camel@localhost.localdomain>

Or if you don't want to deal with the anonymous cvs server (which can be
quite slow at times), you can download a nightly build from CVS at 

  http://www.gmod.org/Generic-Genome-Browser.tar.gz

I would normally suggest that you go to the download page from
http://www.gmod.org/, but I am in the process of preparing a new
release.

Scott


--------------Original Message-------------------
> Date: Wed, 18 Feb 2004 20:36:05 -0500 (EST)
> From: Jason Stajich <jason@cgt.duhs.duke.edu>
> Subject: Re: [Bioperl-l] Bio::Graphics::Browser::Markup
> To: Barry Moore <barry.moore@genetics.utah.edu>
> Cc: bioperl <bioperl-l@bioperl.org>
> Message-ID:
> 	<Pine.LNX.4.50.0402182031300.12117-100000@tenero.duhs.duke.edu>
> Content-Type: TEXT/PLAIN; charset=US-ASCII
> 
> 
> Hey Barry -
> 
> It is part of Gbrowse...
> http://sourceforge.net/cvs/?group_id=27707
> 
> (pass is empty, just hit return)
> cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/gmod login
> 
> cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/gmod co Generic-Genome-Browser
> 
> -jason
> On Wed, 18 Feb 2004, Barry Moore wrote:
> 
> > Todd Harris recently pointed me to Bio::Graphics::Browser::Markup, and I
> > can't find the damn thing anywhere.  I've looked in the bioperl CVS,
> > bioperl online docs and in my local installation.  Google searches don't
> > turn up anything, but suggest that I should be finding a
> > Bio/Graphics/Browser/Markup.pm module - which I don't.  Someone please
> > enlighten the poor simple boy from Utah.
> >
> > Barry
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> 
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
> 

-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain@cshl.org
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory

From qdong at genome.stanford.edu  Thu Feb 19 01:42:32 2004
From: qdong at genome.stanford.edu (Stan Dong)
Date: Thu Feb 19 01:48:50 2004
Subject: [Bioperl-l] SGD GFF3 file available soon
In-Reply-To: <1077160498.1473.72.camel@localhost.localdomain>
Message-ID: <Pine.GSO.4.21.0402182234520.29676-100000@fafner.Stanford.EDU>

Hi Scott,

In my examples, I use Arabic numerals in the seqid column to indicate
the chromosome number, so I should put 'ID=1' in the attribute column of
the first line, which represents the whole chromosome. Since these IDs
need to be unique within the scope of the GFF file, I think it's better
to use a more descriptive name like 'chr01' in this case (and 'ID=chr01'
in the attribute column).
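
A small sketch of that naming scheme, assuming zero-padded two-digit names; the helper names chromosome_seqid and chromosome_line are made up for illustration, and the length 230211 is simply the one from Scott's example line.

```perl
use strict;
use warnings;

# Turn a bare chromosome number into a padded, descriptive seqid.
sub chromosome_seqid {
    my ($number) = @_;
    return sprintf 'chr%02d', $number;
}

# Build the chromosome feature line so that column 1 and the ID
# attribute always agree.
sub chromosome_line {
    my ($number, $length) = @_;
    my $seqid = chromosome_seqid($number);
    return join "\t", $seqid, 'SGD', 'chromosome', 1, $length,
                      '.', '.', '.', "ID=$seqid";
}

my $line = chromosome_line(1, 230211);
print "$line\n";
```

Generating both the seqid and the ID attribute from one function keeps them from drifting apart.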

Thanks a lot for your suggestion,
-Stan


On Wed, 18 Feb 2004, Scott Cain wrote:

> Stan,
> 
> In your sample GFF, the seqid in the first column has to correspond to
> some ID, usually also defined in the same GFF file.  For instance, if
> the features in the GFF file are all on chromosome I, the first column
> of all of those lines would have the same ID as the ID declared for
> chromosome I.  For example:
> 
> I	SGD	chromosome	1	230211	.	.	.	ID=I;description=Sequence "I"
> I	SGD	telomere	1	801	.	-	0	ID=TEL01L;description=I left telomeric region;db_xref=SGD:S0028862
> I	SGD	repeat_family	1	62	.	-	0	ID=TEL01L-TR;name=Telomeric Repeat;description=I left telomere TG(1-3);db_xref=SGD:S0028864
> ...etc...
> 
> Sorry I didn't point that out before--when I looked at the Excel sheet
> you sent me before, I didn't see all of it (I am too used to working
> with plain text files).
> 
> Scott
> 

From lstein at cshl.edu  Thu Feb 19 04:06:57 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Thu Feb 19 04:13:28 2004
Subject: [Bioperl-l] Bio::Graphics::Browser::Markup
In-Reply-To: <4034039F.1080107@genetics.utah.edu>
References: <4034039F.1080107@genetics.utah.edu>
Message-ID: <200402191106.57903.lstein@cshl.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

It's in Generic Genome Browser. http://www.gmod.org/ggb/

For those who aren't familiar with the module, it lets you mark up 
arbitrary strings (alignments, FASTA files, etc.) with HTML using a 
simple stylesheet system.  You can mark up with colors, text styles, 
and arbitrary text.  Overlapping markup works properly (e.g. text 
styles and colors are additive, and overlapping colors mix properly 
using HSV addition).
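
For anyone who has not seen this kind of region-based markup, the basic idea can be sketched in plain Perl. This is not the Bio::Graphics::Browser::Markup API: markup_regions, the %style table, and the style names are all invented for illustration, and the sketch only handles non-overlapping regions, with none of the color mixing described above.

```perl
use strict;
use warnings;

# Hypothetical stylesheet: symbolic style name => inline CSS.
my %style = (
    exon   => 'background-color: yellow',
    primer => 'font-weight: bold',
);

# Wrap each [style, start, end] region of $string in a <span>.
# Regions use 0-based, half-open coordinates and must not overlap.
sub markup_regions {
    my ($string, @regions) = @_;
    # Work from the rightmost region backwards so that inserting tags
    # does not shift the coordinates of regions still to be processed.
    for my $r (sort { $b->[1] <=> $a->[1] } @regions) {
        my ($name, $start, $end) = @$r;
        my $css = $style{$name} or die "Unknown style '$name'";
        substr($string, $end,   0) = '</span>';
        substr($string, $start, 0) = qq{<span style="$css">};
    }
    return $string;
}

my $html = markup_regions('ACGTACGTAC', [ exon => 2, 5 ], [ primer => 7, 9 ]);
print "$html\n";
```

The real module takes care of overlapping regions; a toy like this would need to split regions at every boundary to do the same.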

Lincoln

On Thursday 19 February 2004 02:30 am, Barry Moore wrote:
> Todd Harris recently pointed me to Bio::Graphics::Browser::Markup,
> and I can't find the damn thing anywhere.  I've looked in the
> bioperl CVS, bioperl online docs and in my local installation. 
> Google searches don't turn up anything, but suggest that I should
> be finding a
> Bio/Graphics/Browser/Markup.pm module - which I don't.  Someone
> please enlighten the poor simple boy from Utah.
>
> Barry
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

- -- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFANHyx0CIvUP7P+AkRAlIbAJ49AIjh4pFjtPu3hY9liHWpDnw5sQCgoXz9
X6BwmN2OXLkL3AraWkoqr3A=
=RYHR
-----END PGP SIGNATURE-----
From lstein at cshl.edu  Thu Feb 19 05:08:43 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Thu Feb 19 05:15:06 2004
Subject: [Bioperl-l] Clickable Glyphs...
In-Reply-To: <LAW11-F21zF1XgBzgr30005cba2@hotmail.com>
References: <LAW11-F21zF1XgBzgr30005cba2@hotmail.com>
Message-ID: <200402191208.44083.lstein@cshl.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Get the latest CVS version of bioperl-live and read the section of the 
Bio::Graphics::Panel manual page labeled "Creating Imagemaps."  
Essentially what you need to do is to replace the section after you 
create the panel with this:

	my ($url,$map,$mapname) = $panel->image_and_map(
						-root => '/var/www/html',
						-url   => '/tmpimages');
	print $q->header(),$q->start_html('A Bitmap Rendering');
	print $q->img({-src=>$url,-usemap=>"#$mapname"});
	print $map;
	print $q->end_html;

I'm frankly more fond of the function-oriented CGI calls, so I would 
bring in the standard functions and then:

	print header(),
		start_html('A Bitmap Rendering'),
		img({-src=>$url,-usemap=>"#$mapname"}),
		$map,
		end_html();

Lincoln

On Wednesday 18 February 2004 05:53 pm, Jonathan Greenwood wrote:
> Hi, I've submitted my code with the email, what I'm trying to do is
> to render a Genbank file as a png file, I need to make each glyph
> clickable(I'm also displaying this page online)...any help with the
> new changes to Bio::Graphics::Panel would be appreciated...many
> thanks...
>
> Sincerely,
>
> Jonathan Greenwood
> email: jonathon@mgcheo.med.uottawa.ca
>
> code:
> #! /usr/local/bin/perl -wT
>
> use strict;
> use Bio::Graphics;
> use Bio::SeqIO;
> use Bio::SeqFeature::Generic;
> use CGI;
> use CGI::Pretty;
>
> my $file = 'x65306.gb';
> my $io = Bio::SeqIO->new(-file=>$file);
> my $seq = $io->next_seq;
> my $wholeseq = Bio::SeqFeature::Generic->new(-start=>1,
>                                             -end=>$seq->length);
> my @features = $seq->all_SeqFeatures;
> my $q = new CGI;
>
> # sort features by their primary tags
> my %sorted_features;
> for my $f (@features) {
>   my $tag = $f->primary_tag;
>   push @{$sorted_features{$tag}},$f;
> }
>
> print $q->header( 'text/html' );
> print $q->start_html('A Vector Rendering');
>
> my $panel = Bio::Graphics::Panel->new(-length      => $seq->length,
> 				      -width       => 1000,
> 				      -pad_left    => 10,
> 				      -pad_right   => 10,
> 				      -key_color   => 'white',
> 				      -key_spacing => 15,
> 				      -key_style   => 'bottom',
> 				      -spacing     => -0.25,
> 				      -box_subparts => 'true'
> 				      );
>
> my ($url,$map,$mapname) = $panel->image_and_map(-root =>
> '/webfiles/cgi-bin',
> 						-url  => '/tmpimages',
> 					       );
>
> $panel->add_track($wholeseq,
> 		  -glyph  => 'arrow',
> 		  -bump   => +1,
> 		  -double => 1,
> 		  -tick   => 2
> 	          );
>
> $panel->add_track($wholeseq,
> 		  -glyph   => 'generic',
> 		  -bgcolor => 'purple',
> 		  -height  => 12,
> 		  -key     => 'Whole Sequence',
> 		  -title   => 'Whole Sequence'
> 		  );
>
> # special feature
> if ($sorted_features{CDS}) {
>   $panel->add_track($sorted_features{CDS},
> 		    -glyph          => 'transcript2',
> 		    -bgcolor        => 'orange',
> 		    -bump           =>  +1,
> 		    -height         => 12,
> 		    -key            => 'CDS',
> 		    -label          => \&gene_label,
> 		    -title          => 'CDS',
> 		    -link           => 'feature1.html#CDS'
> 		    );
>   delete $sorted_features{'CDS'};
> }
>
> #general case
> my @colors = qw(wheat blue yellow green cyan chartreuse magenta gray);
> my $idx    = 0;
> for my $tag (sort keys %sorted_features) {
> my $features = $sorted_features{$tag};
> $panel->add_track($features,
> 		  -glyph        =>  'generic',
> 		  -bgcolor      =>  $colors[$idx++ % @colors],
> 		  -fgcolor      =>  'black',
> 		  -font2color   => 'red',
> 		  -key          => "${tag}s",
> 		  -bump         => +1,
> 		  -height       => 12,
>                   -label        => \&gene_label,
> 		  -description  => \&generic_description,
> 		  -title        => \&gene_label,
> 		  -link         => "feature1.html#$tag",
> 		  );
> }
>
> print $q->img({-src=>$url,-usemap=>"#$mapname"});
> print $q->$map;
> print $q->($panel->png);
>
> print $q->exit_html;
>
> exit;
>
>   sub gene_label {
>      my $feature = shift;
>      my @notes;
>      foreach (qw(product gene)) {
>        next unless $feature->has_tag($_);
>        @notes = $feature->each_tag_value($_);
>        last;
>     }
>     $notes[0];
>   }
>
>   sub generic_description {
>     my $feature = shift;
>     my $description;
>     foreach ($feature->all_tags) {
>       my @values = $feature->each_tag_value($_);
>       $description .= $_ eq 'note' ? "@values" : "$_=@values; ";
>     }
>     $description =~ s/; $//; # get rid of last
>     $description;
>   }
>

- -- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFANIss0CIvUP7P+AkRAmh/AJ9SaY4MIZPS5vW5gE5xzaw7AzrjaQCdHJdE
S+2+MS2vScLrVTd+C3V4mME=
=MBei
-----END PGP SIGNATURE-----
From awitney at sghms.ac.uk  Thu Feb 19 07:05:26 2004
From: awitney at sghms.ac.uk (Adam Witney)
Date: Thu Feb 19 07:12:41 2004
Subject: [Bioperl-l] Subject length using BPlite.pm
Message-ID: <BC5A5706.2E1FF%awitney@sghms.ac.uk>


Hi,

I am using BPlite.pm to parse a BLAST output, is it possible to get the
Subject length? (that's the length of the whole subject sequence, not just
the part involved in the hsp)

Thanks for any help

Adam




From todd.harris at cshl.edu  Thu Feb 19 09:39:12 2004
From: todd.harris at cshl.edu (Todd Harris)
Date: Thu Feb 19 09:45:32 2004
Subject: [Bioperl-l] Bio::Graphics::Browser::Markup
In-Reply-To: <Pine.LNX.4.50.0402182031300.12117-100000@tenero.duhs.duke.edu>
Message-ID: <BC5A26B0.BCB3%todd.harris@cshl.edu>

Whoops!

My apologies.  Thanks for the correction, JS.  Yep, it's part of GBrowse,
not bioperl.

t

> On 2/18/04 7:36 PM, Jason Stajich wrote:

> 
> Hey Barry -
> 
> It is part of Gbrowse...
> http://sourceforge.net/cvs/?group_id=27707
> 
> (pass is empty, just hit return)
> cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/gmod login
> 
> cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/gmod co
> Generic-Genome-Browser
> 
> -jason
> On Wed, 18 Feb 2004, Barry Moore wrote:
> 
>> Todd Harris recently pointed me to Bio::Graphics::Browser::Markup, and I
>> can't find the damn thing anywhere.  I've looked in the bioperl CVS,
>> bioperl online docs and in my local installation.  Google searches don't
>> turn up anything, but suggest that I should be finding a
>> Bio/Graphics/Browser/Markup.pm module - which I don't.  Someone please
>> enlighten the poor simple boy from Utah.
>> 
>> Barry
>> 
>> 
>> 
> 
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
> 

From jason at cgt.duhs.duke.edu  Thu Feb 19 10:56:11 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Thu Feb 19 11:03:00 2004
Subject: [Bioperl-l] Subject length using BPlite.pm
In-Reply-To: <BC5A5706.2E1FF%awitney@sghms.ac.uk>
References: <BC5A5706.2E1FF%awitney@sghms.ac.uk>
Message-ID: <Pine.LNX.4.50.0402191053001.17021-100000@tenero.duhs.duke.edu>

BTW SearchIO is the only supported Blast parser so I will always suggest
moving to SearchIO::blast for your parsing needs....

But I don't think Ian/Peter put a method call in there so you have to use
$sbjct->{'LENGTH'} where $sbjct came from the nextSbjct call like:

  use Bio::Tools::BPlite;
  my $report = new Bio::Tools::BPlite(-fh=>\*STDIN);
  while(my $sbjct = $report->nextSbjct) {
      # no accessor method for the length, so read the hash key directly
      print $sbjct->name, "\t", $sbjct->{'LENGTH'}, "\n";
  }


-jason
On Thu, 19 Feb 2004, Adam Witney wrote:

>
> Hi,
>
> I am using BPlite.pm to parse a BLAST output, is it possible to get the
> Subject length? (that's the length of the whole subject sequence, not just
> the part involved in the hsp)
>
> Thanks for any help
>
> Adam
>
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

From jegreenwood25 at hotmail.com  Thu Feb 19 13:50:05 2004
From: jegreenwood25 at hotmail.com (Jonathan Greenwood)
Date: Thu Feb 19 13:56:19 2004
Subject: [Bioperl-l] More help....
Message-ID: <Law11-F23bR4f2hWxin0002ebb4@hotmail.com>

Hi, this combines CGI and BioPerl, what i need to do is open a Genbank file
and then parse(write out) the features. But I need these parsed features to
go into a textbox for editing, and then be able to save the data I have just
edited...Please Help!!! Many thanks...the code is enclosed with the
email....

Jonathan Greenwood
email: jonathon@mgcheo.med.uottawa.ca

Code:
#! /usr/local/bin/perl -wT

use strict;
use CGI qw / :standard /;
use CGI::Pretty;
use Bio::SeqIO;
use Bio::SeqFeature::Generic;
use Bio::Location::Simple;
use Bio::Location::SplitLocationI;

my @features = read_file(param('file')) if param('file');

print header, start_html('Plasmid Feature Editor');

print h1('Plasmid Feature Editor');

print p('Load up a Genbank file to work with, then edit the features in the
text box.');

print start_multipart_form(),
  table({-cellpadding => 10},
	TR({-class=> 'resultsbody'},
	   td(textarea('-name'     => 'editarea',
		       '-value'    => (@features),
		       '-rows'     => 20,
		       '-cols'     => 70,
		       '-override' => (@features) || (param('clear')),
		       ),
	     ),
	  ),
	TR({-class=>'resultstitle'},
	   td(filefield(-name    => 'uploaded_file',
			-length  => 40),
	     ),
	   td(submit(-name  => 'submit_button',
		     -value => 'Click to display features'),
	     ),
	  ),
	TR({-class=>'resultstitle'},
	   td(submit(-name  => 'save_button',
		     -value => 'Click here to save your work'),
	     ),
	   td(reset(),
	     ),
	  ),
       ),
  end_form;

print end_html;

exit;

sub read_file {
my $fh = param('uploaded_file');
my $gb_parser = Bio::SeqIO->new(-fh=>$fh,-format=>'genbank');
my @features;
while (my $seq = $gb_parser->next_seq) {
	push @features,$seq->get_all_SeqFeatures();
} return @features;
}


From reece at in-machina.com  Thu Feb 19 11:34:55 2004
From: reece at in-machina.com (Reece Hart)
Date: Thu Feb 19 14:32:53 2004
Subject: [Bioperl-l] Re: Bio::Prospect
In-Reply-To: <20040219025524.GA15515@psychro>
References: <200402190238.i1J2av9T015567@portal.open-bio.org>
	<20040219025524.GA15515@psychro>
Message-ID: <1077208495.3514.744.camel@tallac>

On Wed, 2004-02-18 at 18:55, Neil Saunders wrote:

> > We're pleased to announce the first public release of Bio::Prospect::, a
> > Perl API for protein fold recognition with Prospect PRO.  A
> > Comments, bug reports, patches, and code contributions are encouraged.


I'm glad you think it might be useful. Apparently, you should have
received the Bioinformatics Applications Note instead of the obviously
non-perl'ing, non-threading referees who did get it. ;-) Oh well.

I believe that Prospect 2.0 and Pro are really the same product. We
started developing this a long time ago (with version 1, then overhauled
it for a 2.0 prerelease). I'd appreciate knowing if you use it, or don't
use it because it's too difficult to install, etc.

Good luck,
Reece
-- 
Reece Hart, Ph.D.                       rkh@gene.com, http://www.gene.com/
Genentech, Inc.                         650/225-6133 (voice), -5389 (fax)
Bioinformatics and Protein Engineering
1 DNA Way, MS-93                        http://www.in-machina.com/~reece/
South San Francisco, CA  94080-4990     reece@in-machina.com, GPG: 0x25EC91A0
From Annie.Law at nrc-cnrc.gc.ca  Thu Feb 19 15:50:10 2004
From: Annie.Law at nrc-cnrc.gc.ca (Law, Annie)
Date: Thu Feb 19 15:56:54 2004
Subject: [Bioperl-l] New GO Parser and errors loading biosql database
Message-ID: <10C94843061E094A98C02EB77CFC328722FE08@nrcmrdex1d.imsb.nrc.ca>

Hi Hilmar,

Thanks for the tips.  I got the GO to go. I have some questions about the GO
loading result, the bioperl-db make test, and the overall order of loading a
database.
1) I installed the Graph module, and loading the GO information into an
empty database seems to run okay in safe mode.  However, roughly 200 of the
entries could not be inserted, mostly with complaints that the column 'name'
cannot be null.  I'm not sure whether this is related to the make test
errors I am having with bioperl-db (listed below) or whether this is an
acceptable result.  In general, how should a user gauge how successful a
load of the database was?  I guess you can compare against the total number
of entries expected.

-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were
("BBD_pathwayID:C1cyc","","","") FKs (2)
Column 'name' cannot be null
---------------------------------------------------
Could not store BBD_pathwayID:C1cyc ():

------------- EXCEPTION  -------------
MSG: create: object (Bio::Ontology::GOterm) failed to insert or to be found
by unique key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253
STACK Bio::DB::Persistent::PersistentObject::store
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:270
STACK (eval) /root/bioperl-db/scripts/biosql/load_ontology.pl:508
STACK toplevel /root/bioperl-db/scripts/biosql/load_ontology.pl:490

--------------------------------------
-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values were
("BBD_pathwayID:abs","","","") FKs (3)
Column 'name' cannot be null
---------------------------------------------------
Could not store BBD_pathwayID:abs ():

------------- EXCEPTION  -------------
MSG: create: object (Bio::Ontology::GOterm) failed to insert or to be found
by unique key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253
STACK Bio::DB::Persistent::PersistentObject::store
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:270
STACK (eval) /root/bioperl-db/scripts/biosql/load_ontology.pl:508
STACK toplevel /root/bioperl-db/scripts/biosql/load_ontology.pl:490

--------------------------------------


2) I have a question about the bioperl-db make test results, which may be
related to the results that I am getting. I seem to be having problems with
the make test for bioperl-db.  I downloaded the tarball from the CVS website
and installed it.
I looked at the documentation and created a user biosql, which has been given
all the permissions it needs.  I also copied the files as stated in the
steps below, in the t directory of bioperl-db:

$ cd t
$ cp DBHarness.conf.example DBHarness.biosql.conf
$ cp DBHarness.conf.example DBHarness.markerdb.conf

I also put a copy of those files in the bioperl-db directory in the home
directory, since that was documented for the newest version of bioperl-db.
I did a make test in the bioperl-db directory and got the following results.
Most of the tests seem to fail. I am not sure why.

[root@microarray bioperl-db]# make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
"test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/cluster.......install_driver(mysql) failed: Can't load
'/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mysql.so'
for module DBD::mysql:
/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mysql.so:
undefined symbol: mysql_ssl_set at
/usr/lib/perl5/5.8.0/i386-linux-thread-multi/DynaLoader.pm line 229.  at
(eval 4) line 3 Compilation failed in require at (eval 4) line 3. Perhaps a
required shared library or dll isn't installed where expected  at
t/DBTestHarness.pm line 211

 
t/cluster.......dubious
	Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 1-160
	Failed 160/160 tests, 0.00% okay
t/comment.......install_driver(mysql) failed: Can't load
'/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mysql.so'
for module DBD::mysql:
/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mysql.so:
undefined symbol: mysql_ssl_set at
/usr/lib/perl5/5.8.0/i386-linux-thread-multi/DynaLoader.pm line 229.  at
(eval 4) line 3 Compilation failed in require at (eval 4) line 3. Perhaps a
required shared library or dll isn't installed where expected  at
t/DBTestHarness.pm line 211

 
t/comment.......dubious
	Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 1-11
	Failed 11/11 tests, 0.00% okay
t/dbadaptor.....install_driver(mysql) failed: Can't load
'/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mysql.so'
for module DBD::mysql:
/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/mysql/mysql.so:
undefined symbol: mysql_ssl_set at
/usr/lib/perl5/5.8.0/i386-linux-thread-multi/DynaLoader.pm line 229.  at
(eval 5) line 3 Compilation failed in require at (eval 5) line 3. Perhaps a
required shared library or dll isn't installed where expected  at
t/DBTestHarness.pm line 211
 

t/swiss.........dubious
	Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 1-52
	Failed 52/52 tests, 0.00% okay
Failed Test    Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/cluster.t     255 65280   160  160 100.00%  1-160
t/comment.t     255 65280    11   11 100.00%  1-11
t/dbadaptor.t   255 65280     6    6 100.00%  1-6
t/dblink.t      255 65280    18   18 100.00%  1-18
t/ensembl.t     255 65280    15   15 100.00%  1-15
t/fuzzy2.t      255 65280    21   21 100.00%  1-21
t/genbank.t     255 65280    18   18 100.00%  1-18
t/locuslink.t   255 65280   110  110 100.00%  1-110
t/ontology.t    255 65280   302  302 100.00%  1-302
t/remove.t      255 65280    59   59 100.00%  1-59
t/seqfeature.t  255 65280    48   48 100.00%  1-48
t/simpleseq.t   255 65280    27   27 100.00%  1-27
t/species.t     255 65280    65   65 100.00%  1-65
t/swiss.t       255 65280    52   52 100.00%  1-52
Failed 14/15 test scripts, 6.67% okay. 912/930 subtests failed, 1.94% okay.
make: *** [test_dynamic] Error 2

3) Previously, when I ran make test for the Bioperl 1.4 installation, most
of the tests passed (97%). I'm not sure whether the errors are expected or not.

Here are the results of the make test for the bioperl installation.  I only
cut out the beginning of the test and the summary at the end.

------------- EXCEPTION  -------------
MSG: Failed to load module Bio::SeqIO::game. Can't locate IO/String.pm in
@INC (@INC contains: . t /root/bioperl-1.4/blib/lib
/root/bioperl-1.4/blib/arch /usr/lib/perl5/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/5.8.0 /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl) at
Bio/SeqIO/game/gameWriter.pm line 63. BEGIN failed--compilation aborted at
Bio/SeqIO/game/gameWriter.pm line 63. Compilation failed in require at
Bio/SeqIO/game.pm line 77. BEGIN failed--compilation aborted at
Bio/SeqIO/game.pm line 77. Compilation failed in require at
/root/bioperl-1.4/blib/lib/Bio/Root/Root.pm line 394.

STACK Bio::Root::Root::_load_module
/root/bioperl-1.4/blib/lib/Bio/Root/Root.pm:396
STACK (eval) /root/bioperl-1.4/blib/lib/Bio/SeqIO.pm:549
STACK Bio::SeqIO::_load_format_module
/root/bioperl-1.4/blib/lib/Bio/SeqIO.pm:548
STACK Bio::SeqIO::new /root/bioperl-1.4/blib/lib/Bio/SeqIO.pm:377
STACK (eval) /root/bioperl-1.4/blib/lib/bptutorial.pl:4027
STACK main::__ANON__ /root/bioperl-1.4/blib/lib/bptutorial.pl:4025
STACK main::run_examples /root/bioperl-1.4/blib/lib/bptutorial.pl:4152
STACK toplevel t/tutorial.t:23

--------------------------------------

For more information about the SeqIO system please see the SeqIO docs. This
includes ways of checking for formats at compile time, not run time Can't
call method "next_seq" on an undefined value at
/root/bioperl-1.4/blib/lib/bptutorial.pl line 4035.
 

Failed Test        Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/BioDBGFF.t                    133  133 100.00%  1-133
t/ESEfinder.t         2   512    12    0   0.00%  ??
t/GuessSeqFormat.t    2   512    46   46 100.00%  1-46
t/SeqFeature.t       25  6400    74    0   0.00%  ??
t/tutorial.t          2   512    21    3  14.29%  19-21
22 subtests skipped.
Failed 5/179 test scripts, 97.21% okay. 182/8122 subtests failed, 97.76%
okay.
make: *** [test_dynamic] Error 29

#end of installation of bioperl

4) Also, once I get this all running, I would like to know the best order
for loading the database. I know you mentioned that the GO information
should be loaded before the locuslink information. Here is my proposed
order for entering information into the database.
Can you use load_seqdatabase.pl for loading unigene information?
1.  load NCBI taxonomy database with load_ncbi_taxonomy.pl
2.  GO information 
3.  load locuslink database information
4.  unigene information, which I also had problems loading:
[root@ bioperl-1.4]#perl /root/bioperl-db/scripts/biosql/load_seqdatabase.pl
--dbuser=root --dbpass=ms22 --dbname bioseqdb 
--namespace "Unigene" -format unigene /root/bioperl-1.4/unigenedata/Hs.data
Loading /root/bioperl-1.4/unigenedata/Hs.data ...
Bio::SeqIO: unigene cannot be found
Exception 
------------- EXCEPTION  -------------
MSG: Failed to load module Bio::SeqIO::unigene. Can't locate
Bio/SeqIO/unigene.pm in @INC (@INC contains:
/usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0
/usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl
/usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 .) at
/usr/lib/perl5/site_perl/5.8.0/Bio/Root/Root.pm line 394.

STACK Bio::Root::Root::_load_module
/usr/lib/perl5/site_perl/5.8.0/Bio/Root/Root.pm:396
STACK (eval) /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO.pm:549
STACK Bio::SeqIO::_load_format_module
/usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO.pm:548
STACK Bio::SeqIO::new /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO.pm:377
STACK toplevel /root/bioperl-db/scripts/biosql/load_seqdatabase.pl:436

--------------------------------------

For more information about the SeqIO system please see the SeqIO docs. This
includes ways of checking for formats at compile time, not run time Can't
call method "next_seq" on an undefined value at
/root/bioperl-db/scripts/biosql/load_seqdatabase.pl line 460.


Thanks very much,
Annie.

From barry.moore at genetics.utah.edu  Thu Feb 19 17:46:38 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Thu Feb 19 17:52:55 2004
Subject: [Fwd: Re: [Bioperl-l] Bio::SeqFeature::Primer Calculating the
	Primer TM]
In-Reply-To: <403517B9.6000102@asalup.org>
References: <4033C7A5.8000805@genetics.utah.edu> <403517B9.6000102@asalup.org>
Message-ID: <40353CCE.2080502@genetics.utah.edu>

Sebastian,

These lines account for initiation of duplex formation:

 #Account for initiation parameters
 $enthalpy += $thermo_values{substr($sequence, 0, 1)}{enthalpy};
 $entropy += $thermo_values{substr($sequence, 0, 1)}{entropy};
 $enthalpy += $thermo_values{substr($sequence, -1, 1)}{enthalpy};
 $entropy += $thermo_values{substr($sequence, -1, 1)}{entropy};

However your question made me go back and look at the 1997 and 1998 
SantaLucia papers again, and I realized that I have applied the symmetry 
correction incorrectly.  Symmetry correction should only be applied to 
self-complementary oligos.  The code could be modified to identify these 
and apply symmetry correction, but short of that the correction should 
probably just be removed since most oligos (especially ones used in 
molecular biology) won't be self-complementary.  Rob, that could be 
fixed by replacing this line:

 $entropy -= 1.4;

with something like this line:

 #$entropy -= 1.4; # Should only be applied to self-complementary oligos,
 # so add code to test self-complementarity before applying this line to
 # Tm calculations
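
For what it's worth, the self-complementarity test could look something
like the sketch below. This is only an illustration in plain Perl, not
code from Bio::SeqFeature::Primer: the sub names (reverse_complement,
is_self_complementary, symmetry_corrected_entropy) are made up for the
example, and it assumes the oligo contains only A, C, G and T. The 1.4
is the same value as in the line above.

```perl
use strict;
use warnings;

# Reverse complement of a DNA oligo (assumes A/C/G/T only; case-insensitive).
sub reverse_complement {
    my ($seq) = @_;
    my $rc = scalar reverse uc $seq;
    $rc =~ tr/ACGT/TGCA/;
    return $rc;
}

# An oligo is self-complementary when it equals its own reverse complement.
sub is_self_complementary {
    my ($seq) = @_;
    return uc($seq) eq reverse_complement($seq) ? 1 : 0;
}

# Hypothetical helper: subtract the symmetry term only for
# self-complementary oligos, leave the entropy untouched otherwise.
sub symmetry_corrected_entropy {
    my ($entropy, $seq) = @_;
    $entropy -= 1.4 if is_self_complementary($seq);
    return $entropy;
}
```

With this, a palindromic site such as GAATTC gets the correction, while
an ordinary primer passes through unchanged.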

Barry Moore
Dept. Human Genetics
University of Utah

Sebastian Bassi wrote:

> Barry Moore wrote:
>
>> Sebastian,
>>
>> You may have picked this up from the bioperl list if  you follow 
>> that, but since it sounded like you were on the python side, I 
>> thought I'd pass it along.
>
>
> Yes. I am working on this in BioPython.
> What this code seems to lack is a correction for "duplex 
> initiation", as described in the SantaLucia paper.
>

From steve at trutane.net  Thu Feb 19 19:33:33 2004
From: steve at trutane.net (Steve Chervitz Trutane)
Date: Thu Feb 19 19:39:36 2004
Subject: [Bioperl-l] searchio scripts
In-Reply-To: <ADEFIHIJAHBGLBNJCIADOEAOCEAA.rrouse@biomail.ucsd.edu>
References: <ADEFIHIJAHBGLBNJCIADOEAOCEAA.rrouse@biomail.ucsd.edu>
Message-ID: <6A08F6BD-633C-11D8-A988-000A95765236@trutane.net>

On Feb 18, 2004, at 3:09 PM, Richard Rouse wrote:

> I tried Steve's suggestion by putting this right above:
>   while ( my $blast = $in->next_result() ) {
>
> Then putting another } at the end of the script.
>
> Doing this and then running a large blast output file, I got:
>
> Using SearchIO->new()
>
> Report 1: Bio::Search::Result::BlastResult=HASH(0x87d7fb8)
>
> Report 2: Bio::Search::Result::BlastResult=HASH(0x8dfea60)

The ugly 'HASH' output is because we're no longer overloading "". 
You'll have to call to_string() on the BlastResult object to get 
prettier output.
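
To illustrate the difference, here is a small stand-alone sketch. The
two classes are hypothetical (they are not BioPerl modules) and exist
only to show what Perl does when an object is interpolated into a
string with and without the "" operator overloaded.

```perl
use strict;
use warnings;

package My::Result;    # hypothetical class, no overloading

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name} }, $class;
}

sub to_string {
    my ($self) = @_;
    return "Result: $self->{name}";
}

package My::OverloadedResult;    # hypothetical subclass that overloads ""

our @ISA = ('My::Result');
use overload '""' => sub { $_[0]->to_string }, fallback => 1;

package main;

my $plain = My::Result->new(name => 'query1');
my $nice  = My::OverloadedResult->new(name => 'query1');

print "Report 1: $plain\n";                    # My::Result=HASH(0x...)
print "Report 1: ", $plain->to_string, "\n";   # explicit call
print "Report 1: $nice\n";                     # overloaded "" calls to_string
```

Without the overload, interpolation falls back to Perl's default
ClassName=HASH(address) form, which is what the report lines above show.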

>
> ------------- EXCEPTION  -------------
> MSG: Trouble in ResultTableWriter::_set_row_data_func() eval:

This is a bug in either the parsing code and/or GenericHit. I get the 
same trouble parsing /t/data/blast.report in the Bioperl distribution. 
Was the report you were parsing a TBLASTN? Could you file a bug report 
on this at http://bugzilla.bioperl.org/ ? Thanks.

Steve

> ------------- EXCEPTION  -------------
> MSG: Can't get identical or conserved data: no data.
> STACK Bio::Search::Hit::GenericHit::matches
> ../..//Bio/Search/Hit/GenericHit.pm:852
> STACK Bio::Search::Hit::GenericHit::frac_identical
> ../..//Bio/Search/Hit/GenericHit.pm:1043
> STACK (eval) (eval 310):1
> STACK Bio::SearchIO::Writer::ResultTableWriter::__ANON__
> ../..//Bio/SearchIO/Writer/ResultTableWriter.pm:327
> STACK Bio::SearchIO::Writer::HitTableWriter::to_string
> ../..//Bio/SearchIO/Writer/HitTableWriter.pm:267
> STACK Bio::SearchIO::write_result ../..//Bio/SearchIO.pm:321
> STACK Bio::SearchIO::blast::write_result 
> ../..//Bio/SearchIO/blast.pm:1495
> STACK toplevel new.mod.hitwriter.pl:106
>
> --------------------------------------
>
>
>
> STACK Bio::SearchIO::Writer::ResultTableWriter::__ANON__
> ../..//Bio/SearchIO/Writer/ResultTableWriter.pm:329
> STACK Bio::SearchIO::Writer::HitTableWriter::to_string
> ../..//Bio/SearchIO/Writer/HitTableWriter.pm:267
> STACK Bio::SearchIO::write_result ../..//Bio/SearchIO.pm:321
> STACK Bio::SearchIO::blast::write_result 
> ../..//Bio/SearchIO/blast.pm:1495
> STACK toplevel new.mod.hitwriter.pl:106
>
> I tried Lincoln's suggestion as well. In this case I added:
>
> my $in = Bio::SearchIO->new(-format=>'blast',-fh=>\*ARGV);
>
> above
>
> while ( my $blast = $in->next_result() ) {
>
>
> This script just runs getting no result.
>
> By the way I am running Suse linux 9.0, perl 5.8.1
>
> Thanks,
> Richard

From rrouse at biomail.ucsd.edu  Thu Feb 19 20:44:10 2004
From: rrouse at biomail.ucsd.edu (Richard Rouse)
Date: Thu Feb 19 20:50:14 2004
Subject: [Bioperl-l] searchio scripts
In-Reply-To: <6A08F6BD-633C-11D8-A988-000A95765236@trutane.net>
Message-ID: <ADEFIHIJAHBGLBNJCIADIEBKCEAA.rrouse@biomail.ucsd.edu>

Steve,
The report I was parsing was a BLASTN. I'll submit a bug report. 
Richard

-----Original Message-----
From: Steve Chervitz Trutane [mailto:steve@trutane.net]
Sent: Thursday, February 19, 2004 4:34 PM
To: rrouse@biomail.ucsd.edu
Cc: Bioperl
Subject: Re: [Bioperl-l] searchio scripts


On Feb 18, 2004, at 3:09 PM, Richard Rouse wrote:

> I tried Steve's suggestion by putting this right above:
>   while ( my $blast = $in->next_result() ) {
>
> Then putting another } at the end of the script.
>
> Doing this and then running a large blast output file, I got:
>
> Using SearchIO->new()
>
> Report 1: Bio::Search::Result::BlastResult=HASH(0x87d7fb8)
>
> Report 2: Bio::Search::Result::BlastResult=HASH(0x8dfea60)

The ugly 'HASH' output is because we're no longer overloading "". 
You'll have to call to_string() on the BlastResult object to get 
prettier output.
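
To illustrate where the Class=HASH(0x...) text comes from: without an
overloaded "" operator, Perl stringifies a blessed reference as its class
name plus the reference address. A minimal core-Perl sketch follows;
My::Result and its to_string() are hypothetical stand-ins for illustration,
not the Bioperl classes themselves.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; this is NOT Bio::Search::Result::BlastResult.
package My::Result;

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name} }, $class;
}

# Explicit method that returns a readable description.
sub to_string {
    my ($self) = @_;
    return "Result for query '$self->{name}'";
}

package main;

my $result = My::Result->new(name => 'query1');

# No "" overloading: interpolating the object gives Class=HASH(0x...).
my $raw = "$result";

# Calling the method explicitly gives the readable form.
my $pretty = $result->to_string();

print "Raw:    $raw\n";
print "Pretty: $pretty\n";
```

In the script quoted above, the same idea would mean printing the value
returned by to_string() rather than the object itself.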

>
> ------------- EXCEPTION  -------------
> MSG: Trouble in ResultTableWriter::_set_row_data_func() eval:

This is a bug in the parsing code, in GenericHit, or in both. I get the 
same trouble parsing /t/data/blast.report in the Bioperl distribution. 
Was the report you were parsing a TBLASTN? Could you file a bug report 
on this at http://bugzilla.bioperl.org/ ? Thanks.

Steve

> ------------- EXCEPTION  -------------
> MSG: Can't get identical or conserved data: no data.
> STACK Bio::Search::Hit::GenericHit::matches
> ../..//Bio/Search/Hit/GenericHit.pm:852
> STACK Bio::Search::Hit::GenericHit::frac_identical
> ../..//Bio/Search/Hit/GenericHit.pm:1043
> STACK (eval) (eval 310):1
> STACK Bio::SearchIO::Writer::ResultTableWriter::__ANON__
> ../..//Bio/SearchIO/Writer/ResultTableWriter.pm:327
> STACK Bio::SearchIO::Writer::HitTableWriter::to_string
> ../..//Bio/SearchIO/Writer/HitTableWriter.pm:267
> STACK Bio::SearchIO::write_result ../..//Bio/SearchIO.pm:321
> STACK Bio::SearchIO::blast::write_result 
> ../..//Bio/SearchIO/blast.pm:1495
> STACK toplevel new.mod.hitwriter.pl:106
>
> --------------------------------------
>
>
>
> STACK Bio::SearchIO::Writer::ResultTableWriter::__ANON__
> ../..//Bio/SearchIO/Writer/ResultTableWriter.pm:329
> STACK Bio::SearchIO::Writer::HitTableWriter::to_string
> ../..//Bio/SearchIO/Writer/HitTableWriter.pm:267
> STACK Bio::SearchIO::write_result ../..//Bio/SearchIO.pm:321
> STACK Bio::SearchIO::blast::write_result 
> ../..//Bio/SearchIO/blast.pm:1495
> STACK toplevel new.mod.hitwriter.pl:106
>
> I tried Lincoln's suggestion as well. In this case I added:
>
> my $in = Bio::SearchIO->new(-format=>'blast',-fh=>\*ARGV);
>
> above
>
> while ( my $blast = $in->next_result() ) {
>
>
> This script just runs getting no result.
>
> By the way I am running Suse linux 9.0, perl 5.8.1
>
> Thanks,
> Richard


From sbassi at asalup.org  Thu Feb 19 21:25:14 2004
From: sbassi at asalup.org (Sebastian Bassi)
Date: Thu Feb 19 21:32:12 2004
Subject: [Bioperl-l] Tm calculation
Message-ID: <4035700A.3050302@asalup.org>

Hello,

I subscribed to this list because I was told there is a thread about Tm. 
I wrote a Tm function for Biopython, based on EMBOSS DAN. The good 
thing is that it performed exactly as DAN does. The problem was that the 
DAN formula was dated (it predates SantaLucia's work).

I have some questions:
-Regarding "#$entropy -= 1.4; #Should only be applied to 
self-complimentary oligos, so add code to test self complimentarity 
before applying this line to Tm calculations "
How do you define self-complementary oligos?
Is this a self-complementary oligo: AAACCCTAGGGTTT? What about this: 
AAACCCTCAGGGTTT?
-Does the proposed version test for overlapping pairs? I mean, take this 
sequence: ACCCGTGAGCTG. How many CC pairs does the program detect? The 
right answer should be 2. But if you are using a standard string find 
function, it may overlook the overlapping CC and detect only one. I had 
this problem in one of my attempts, so I needed my own string find 
function (somebody from the Biopython mailing list coded this function at 
my request, so I didn't actually write it).
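
Both questions can be sketched in a few lines of core Perl. This is only an
illustration and not the code under discussion: it assumes that
"self-complementary" means the oligo equals its own reverse complement, and
the function names are made up for the example.

```perl
use strict;
use warnings;

# Reverse complement of a DNA string (A, C, G, T only; input is upper-cased).
sub reverse_complement {
    my ($seq) = @_;
    my $rc = scalar reverse uc($seq);
    $rc =~ tr/ACGT/TGCA/;
    return $rc;
}

# Assumed definition: an oligo is self-complementary when it is identical
# to its own reverse complement.
sub is_self_complementary {
    my ($seq) = @_;
    return uc($seq) eq reverse_complement($seq) ? 1 : 0;
}

# Count occurrences of $pair in $seq, overlapping ones included.
# The zero-width lookahead consumes no characters, so every start
# position is tested independently.
sub count_overlapping {
    my ($seq, $pair) = @_;
    my $count = () = $seq =~ /(?=\Q$pair\E)/g;
    return $count;
}

for my $oligo (qw(AAACCCTAGGGTTT AAACCCTCAGGGTTT)) {
    printf "%s self-complementary: %s\n", $oligo,
        is_self_complementary($oligo) ? 'yes' : 'no';
}
printf "CC pairs in ACCCGTGAGCTG: %d\n",
    count_overlapping('ACCCGTGAGCTG', 'CC');
```

Under that definition the first oligo is self-complementary and the second
(odd length, with a central C) is not. A plain /CC/g match without the
lookahead would report only one CC in the example sequence, because the
first match consumes both characters.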

I'm asking this instead of looking at the code because I don't know 
enough Perl (that's why I work in Python :)

Sorry for my English, it is not my native language!

-- 
Best regards,

//=\ Sebastian Bassi - Diplomado en Ciencia y Tecnologia, UNQ   //=\
\=// IT Manager Advanta Seeds - Balcarce Research Center -      \=//
//=\ Pro secretario ASALUP - www.asalup.org - PGP key available //=\
\=// E-mail: sbassi@genesdigitales.com - ICQ UIN: 3356556 -     \=//

                 http://Bioinformatica.info


From hlapp at gmx.net  Fri Feb 20 03:37:17 2004
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri Feb 20 03:43:26 2004
Subject: [Bioperl-l] New GO Parser and errors loading biosql database
In-Reply-To: <10C94843061E094A98C02EB77CFC328722FE08@nrcmrdex1d.imsb.nrc.ca>
Message-ID: <FD7D3FED-637F-11D8-9BA4-000A959EB4C4@gmx.net>


On Thursday, February 19, 2004, at 12:50  PM, Law, Annie wrote:

>  However, many of the entries are not able to be
> inserted (roughly 200).
> Mostly complaining about how the column name cannot be null.  However,  
> I'm
> not sure if it is related to
> The make test errors I am having with bioperl-db that I have listed  
> below or
> if this is an acceptable result.
> In general how should a user gauge how successful a load of the  
> database
> was?  I guess you can sort
> of look at the total number of expected number entries.

It's always a good idea to look over the errors and check whether there  
are any that just don't make sense. The one below is an example:

>
> -------------------- WARNING ---------------------
> MSG: insert in Bio::DB::BioSQL::TermAdaptor (driver) failed, values  
> were
> ("BBD_pathwayID:C1cyc","","","") FKs (2)
> Column 'name' cannot be null

'BBD_pathwayID:C1cyc' is *not* a GO term (all GO terms have identifiers  
that start with GO:). It's in fact a dbxref of a term that erroneously  
ends up as a term because in the 1.4 release of bioperl a bug had been  
introduced into the dagflat parser (which the GO parser basically is  
identical to). I strongly recommend you upgrade at a minimum the module  
Bio/OntologyIO/dagflat.pm with the one from cvs (tag branch-1-4).  
Alternatively, update the entire bioperl distribution from cvs (again,  
use branch-1-4).

Doing so will get rid of most if not all of the errors.

Generally speaking, there should be no or only a few terms that fail to  
load, and if any fail then they should only fail because of column  
width constraints or something similar.

>
> 2) I have a question about The make test bioperl-db results which may  
> be
> related to the results that I am getting. I seem to be having problems  
> with
> the make test for bioperl-db.  I downloaded the tarball from the CVS  
> website
> and installed it.
> I looked at the documentation and I created User biosql which has been  
> given
> all the permissions it needs.  I also renamed the files as stated in  
> the
> steps below. In the t directory of bioperl-db $ cd t $ cp
> DBHarness.conf.example DBHarness.biosql.conf $ cp  
> DBHarness.conf.example
> DBHarness.markerdb.conf

You do not need to create DBHarness.markerdb.conf anymore. It's not  
used.

>
> I also put a copy of those file in the bioperl-db in the home directory
> since that was documented for the newest version Of bioperl-db.

Not sure where you found that. The only place where this file needs to  
reside is in the t/ directory.

> I did a make test in the bioperl-db directory and go the following  
> results.
> Most of the tests seem to fail. I am not sure why.

Generally speaking, just read the error message. It often says why, and  
so does it here.

>
> [root@microarray bioperl-db]# maket test
> PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
> "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
> t/cluster.......install_driver(mysql) failed: Can't load
> '/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/ 
> mysql/mys
> ql.so' for module DBD::mysql:
> /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/auto/DBD/ 
> mysql/mysq
> l.so: undefined symbol: mysql_ssl_set at
> /usr/lib/perl5/5.8.0/i386-linux-thread-multi/DynaLoader.pm line 229.   
> at
> (eval 4) line 3 Compilation failed in require at (eval 4) line 3.  
> Perhaps a
> required shared library or dll isn't installed where expected  at
> t/DBTestHarness.pm line 211

This says that your DBI driver could not be loaded. It has nothing to  
do with bioperl-db. Either you have not installed the mysql DBI driver  
(or the installation did not succeed), or you have installed it at a  
non-standard location, or you have installed it under another version  
of perl.

Make sure the tests for the DBD::mysql module pass before trying to use  
the driver.

Obviously, if the DBI driver can't be loaded, none of the tests will  
succeed, as then no database connection can be opened.
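
A quick way to check this outside of the bioperl-db test suite is to try
loading the driver directly and look at the error. The sketch below uses
only core Perl; try_load() is a hypothetical helper written for this
example, and the module list is just the set mentioned in this thread.

```perl
use strict;
use warnings;

# Try to load a module by name.
# Returns (1, '') on success or (0, $error_text) on failure.
sub try_load {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    my $ok = eval { require $file; 1 };
    return $ok ? (1, '') : (0, $@);
}

for my $module (qw(DBI DBD::mysql IO::String)) {
    my ($ok, $err) = try_load($module);
    my $status = $ok ? 'loaded' : "FAILED: $err";
    print "$module: $status\n";
}
```

Because require runs inside eval, a failure to load the shared library
(as in the mysql.so message above) is reported as text instead of
aborting the script.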

>
> 3) Previously when I did a make test for the Bioperl 1.4 installation  
> most
> of the tests passed 97% I'm not sure whether the errors are expected  
> or not
>

Generally, *all* tests of a stable bioperl distribution (which 1.4 is)  
are supposed to pass. If one or more don't, then chances are high that  
something is wrong.

> Here are the results of the make test.  I only cut out the beginning  
> of the
> test and the summary at the end. Installation of bioperl
>
> ------------- EXCEPTION  -------------
> MSG: Failed to load module Bio::SeqIO::game. Can't locate IO/String.pm

The message pretty much says it all. Bioperl does depend on IO::String  
in a lot of places, so I'd strongly recommend you go ahead and  
install it.

>
> 4) Also, hopefully when I get this all running I would like to know  
> what is
> the best order for loading the database. I know you mentionned that  
> the GO
> database information should be loaded before the locuslink  
> information. Here
> is the list of proposed order of entering information into the  
> database.
> Can you use load_seqdatabase.pl for loading unigene information?

Yes you can. Make sure you read the POD of load_seqdatabase.pl to see  
how.

> 1.  load NCBI taxonomy database with load_ncbi_taxonomy.pl
> 2.  GO information

The only things for which order matters are those which are referenced,  
but provided only in an incomplete manner, by annotated data sources.  
Hence, species information and any ontology that your data source uses  
for annotation should be loaded in advance so that upon loading of the  
annotated sequences the referenced entities are found by look-up.

> 3.  load locuslink database information
> 4.  unigene information which I also had problems with loading  
> information
> in
> [root@ bioperl-1.4]#perl  
> /root/bioperl-db/scripts/biosql/load_seqdatabase.pl
> --dbuser=root --dbpass=ms22 --dbname bioseqdb
> --namespace "Unigene" -format unigene  
> /root/bioperl--1.4/unigenedata/Hs.data
> Loading /root/bioperl-1.4/unigenedata/Hs.data ...
> Bio::SeqIO: unigene cannot be found
> Exception
> ------------- EXCEPTION  -------------
> MSG: Failed to load module Bio::SeqIO::unigene. Can't locate
> Bio/SeqIO/unigene.pm in @INC (@INC contains:

The message pretty much says it. The indicated module, which is the  
bioperl unigene parser, fails to load. The reason is most likely that  
you didn't install bioperl, or installed it in a location that is not  
in Perl's default search path. If the latter is the case, you need to  
set up the PERL5LIB environment variable prior to running any code that  
uses those modules.
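
To see whether Perl can find the module at all, one can walk @INC (which
includes any directories listed in PERL5LIB) and look for the file. A
minimal core-Perl sketch; locate_module() is a hypothetical helper for
this example, not part of bioperl.

```perl
use strict;
use warnings;
use File::Spec;

# Return the first path under @INC that holds the module file,
# or undef if it is not found in any directory.
sub locate_module {
    my ($module) = @_;
    my @parts = split /::/, $module;
    $parts[-1] .= '.pm';
    for my $dir (grep { !ref } @INC) {    # skip code-ref hooks in @INC
        my $path = File::Spec->catfile($dir, @parts);
        return $path if -f $path;
    }
    return;
}

my $module = 'Bio::SeqIO::unigene';
my $path   = locate_module($module);
if (defined $path) {
    print "$module found at $path\n";
} else {
    print "$module not found in:\n", map { "  $_\n" } grep { !ref } @INC;
}
```

If the module is reported as not found, adding the bioperl install
directory to PERL5LIB (or a "use lib" line in the script) and re-running
the check should show whether the search path was the problem.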

	-hilmar

-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From brian_osborne at cognia.com  Fri Feb 20 08:04:52 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Fri Feb 20 08:11:08 2004
Subject: [Bioperl-l] Version 1.4 for Windows
Message-ID: <GAEDKMGOKFBLJPKCLKCCOEJPDHAA.brian_osborne@cognia.com>

O|B|F - Open Bioinformatics Foundation Update: Bioperl 1.4 for Windows 

                           February 20, 2004


------------------------------------------------------------------------

http://news.open-bio.org/archives/2004_02.html#000068


------------------------------------------------------------------------

Bioperl version 1.4 for Windows is available. Thanks once again
to Nigam Shah for creating and testing the PPM and PPD files.

-- 


From vince.forgetta at staff.mcgill.ca  Fri Feb 20 10:14:40 2004
From: vince.forgetta at staff.mcgill.ca (Vince Forgetta)
Date: Fri Feb 20 10:25:10 2004
Subject: [Bioperl-l] MSG: acc does not exist,
	but acc is OK and bioperl version is 1.2.2
Message-ID: <40362460.8050004@staff.mcgill.ca>

Hi all,

I am trying to use bioperl to retrieve a refseq accession e.g. 
NM_003000, but it throws the following error:

------------- EXCEPTION  -------------
MSG: acc does not exist
STACK Bio::DB::WebDBSeqI::get_Seq_by_acc 
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:177

My code is:

    use Bio::DB::RefSeq;
    $gb = new Bio::DB::RefSeq;
    my $seq;
    $seq = $gb->get_Seq_by_acc('$accession');

I have version 1.2.2 so it's not a problem of changing "protein" to 
"nucleotide" in GenBank.pm.

When I change get_Seq_by_acc in WebDBSeqI.pm like so (added "$seqid" and 
"ARRAY" to distinguish between the error messages):


sub get_Seq_by_acc {
   my ($self,$seqid) = @_;
   $self->_sleep;
   my $seqio = $self->get_Stream_by_acc($seqid);
   $self->throw("acc $seqid does not exist") if( ! defined $seqio );
   my @seqs;
   while( my $seq = $seqio->next_seq() ) { push @seqs, $seq; }
   $self->throw("ARRAY acc does not exist") unless @seqs;
   if( wantarray ) { return @seqs } else { return shift @seqs }
}

I get:

------------- EXCEPTION  -------------
MSG: ARRAY acc does not exist
STACK Bio::DB::WebDBSeqI::get_Seq_by_acc 
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:177

So is the problem that the retrieval of the sequence fails? Very odd. 
Thanks for the help.

Vince

-- 
Vincenzo Forgetta

 Bioinformatics
  McGill University and Genome Quebec Innovation Centre
  740 Dr. Penfield Avenue Room 7211
  Montreal, Quebec Canada, H3A 1A4

  Tel: 514-398-3311 00476 
  Email: vince.forgetta@staff.mcgill.ca


From ak at ebi.ac.uk  Fri Feb 20 10:39:03 2004
From: ak at ebi.ac.uk (Andreas Kahari)
Date: Fri Feb 20 10:45:19 2004
Subject: [Bioperl-l] MSG: acc does not exist,
	but acc is OK and bioperl version is 1.2.2
In-Reply-To: <40362460.8050004@staff.mcgill.ca>
References: <40362460.8050004@staff.mcgill.ca>
Message-ID: <20040220153903.GA18842@ebi.ac.uk>

On Fri, Feb 20, 2004 at 10:14:40AM -0500, Vince Forgetta wrote:
[cut]
> ------------- EXCEPTION  -------------
> MSG: acc does not exist
> STACK Bio::DB::WebDBSeqI::get_Seq_by_acc 
> /usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:177
> 
> My code is:
> 
>    use Bio::DB::RefSeq;
>    $gb = new Bio::DB::RefSeq;
>    my $seq;
>    $seq = $gb->get_Seq_by_acc('$accession');

Perl does not interpolate variables within single quotes.
You probably want to say

    $seq = $gb->get_Seq_by_acc($accession);
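
A small core-Perl illustration of the difference, using the accession from
the original post as sample data (the variable names are only for this
example):

```perl
use strict;
use warnings;

my $accession = 'NM_003000';

my $single = '$accession';   # single quotes: the literal text $accession
my $double = "$accession";   # double quotes: interpolated to the value
my $bare   = $accession;     # no quotes needed to pass the value along

print "single-quoted: $single\n";
print "double-quoted: $double\n";
print "bare variable: $bare\n";
```

With the single-quoted form, the database is asked for an entry literally
named $accession, which cannot exist.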


Cheers,
Andreas

-- 
| {} |      Andreas Kähäri                                |()()|
|{}{}|      EMBL, European Bioinformatics Institute       | () |
| {} |      Wellcome Trust Genome Campus, Hinxton         |()()|
|{}{}|      Cambridge, CB10 1SD                           | () |
| {} |      United Kingdom                                |()()|
From vince.forgetta at staff.mcgill.ca  Fri Feb 20 10:40:48 2004
From: vince.forgetta at staff.mcgill.ca (Vince Forgetta)
Date: Fri Feb 20 10:54:16 2004
Subject: [Bioperl-l] MSG: acc does not exist, but acc is OK and bioperl
	version is 1.2.2
In-Reply-To: <20040220153903.GA18842@ebi.ac.uk>
References: <40362460.8050004@staff.mcgill.ca>
	<20040220153903.GA18842@ebi.ac.uk>
Message-ID: <40362A80.8000001@staff.mcgill.ca>

I had originally put $accession without single quotes and got the same 
error. I tried the single quotes as a debugging step. I still have the 
same problem when I remove them.

The problem seems to be sporadic. Some days I can get accessions and 
other days I run into problems. Could this be the problem:

http://bioperl.org/Core/Latest/faq.html#Q2.3

Thanks.

Andreas Kahari wrote:

>On Fri, Feb 20, 2004 at 10:14:40AM -0500, Vince Forgetta wrote:
>[cut]
>  
>
>>------------- EXCEPTION  -------------
>>MSG: acc does not exist
>>STACK Bio::DB::WebDBSeqI::get_Seq_by_acc 
>>/usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:177
>>
>>My code is:
>>
>>   use Bio::DB::RefSeq;
>>   $gb = new Bio::DB::RefSeq;
>>   my $seq;
>>   $seq = $gb->get_Seq_by_acc('$accession');
>>    
>>
>
>Perl does not interpolate variables in within single quotes.
>You probably want to say
>
>    $seq = $gb->get_Seq_by_acc($accession);
>
>
>Cheers,
>Andreas
>
>  
>


-- 
Vincenzo Forgetta

 Computational Biology
  McGill University and Genome Quebec Innovation Centre
  740 Dr. Penfield Avenue Room 7211
  Montreal, Quebec Canada, H3A 1A4

  Tel: 514-398-3311 00476 
  Email: vince.forgetta@staff.mcgill.ca


From jason at cgt.duhs.duke.edu  Fri Feb 20 11:14:32 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Fri Feb 20 11:21:04 2004
Subject: [Bioperl-l] MSG: acc does not exist, but acc is OK and bioperl
	version is 1.2.2
In-Reply-To: <40362A80.8000001@staff.mcgill.ca>
References: <40362460.8050004@staff.mcgill.ca>
	<20040220153903.GA18842@ebi.ac.uk>
	<40362A80.8000001@staff.mcgill.ca>
Message-ID: <Pine.LNX.4.50.0402201112120.24456-100000@tenero.duhs.duke.edu>

You can see if it is the case by going to
http://www.ebi.ac.uk/cgi-bin/dbfetch
and plugging in your accession.

RefSeq is available for download from the NCBI site; I expect you will
find this faster and more reliable than most web-based sequence servers.

-jason
On Fri, 20 Feb 2004, Vince Forgetta wrote:

> I had originally put $accession without single quotes and got the same
> error. I tried the single quotes as a debugging step. I still have the
> same problem when I remove them.
>
> The problem seems to be sporadic. some days I can get accessions and
> other days I run into problems. Could this be the problem:
>
> http://bioperl.org/Core/Latest/faq.html#Q2.3
>
> Thanks.
>
> Andreas Kahari wrote:
>
> >On Fri, Feb 20, 2004 at 10:14:40AM -0500, Vince Forgetta wrote:
> >[cut]
> >
> >
> >>------------- EXCEPTION  -------------
> >>MSG: acc does not exist
> >>STACK Bio::DB::WebDBSeqI::get_Seq_by_acc
> >>/usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:177
> >>
> >>My code is:
> >>
> >>   use Bio::DB::RefSeq;
> >>   $gb = new Bio::DB::RefSeq;
> >>   my $seq;
> >>   $seq = $gb->get_Seq_by_acc('$accession');
> >>
> >>
> >
> >Perl does not interpolate variables in within single quotes.
> >You probably want to say
> >
> >    $seq = $gb->get_Seq_by_acc($accession);
> >
> >
> >Cheers,
> >Andreas
> >
> >
> >
>
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From vince.forgetta at staff.mcgill.ca  Fri Feb 20 11:18:27 2004
From: vince.forgetta at staff.mcgill.ca (Vince Forgetta)
Date: Fri Feb 20 11:29:22 2004
Subject: [Bioperl-l] MSG: acc does not exist, but acc is OK and bioperl
	version is 1.2.2
In-Reply-To: <Pine.LNX.4.50.0402201112120.24456-100000@tenero.duhs.duke.edu>
References: <40362460.8050004@staff.mcgill.ca>
	<20040220153903.GA18842@ebi.ac.uk>
	<40362A80.8000001@staff.mcgill.ca>
	<Pine.LNX.4.50.0402201112120.24456-100000@tenero.duhs.duke.edu>
Message-ID: <40363353.6080303@staff.mcgill.ca>

Thanks a bunch! Turns out that EBI doesn't have NM_003000, but NCBI 
does. I thought they were the same thing! I'll just download RefSeq 
from NCBI.

Vince

Jason Stajich wrote:

>You can see if it is the case by going to
>http://www.ebi.ac.uk/cgi-bin/dbfetch
>and plugging in your accession.
>
>refseq is available for download from ncbi site - you will find this
>faster and more reliable than most webbased sequence server I expect.
>
>-jason
>On Fri, 20 Feb 2004, Vince Forgetta wrote:
>
>  
>
>>I had originally put $accession without single quotes and got the same
>>error. I tried the single quotes as a debugging step. I still have the
>>same problem when I remove them.
>>
>>The problem seems to be sporadic. some days I can get accessions and
>>other days I run into problems. Could this be the problem:
>>
>>http://bioperl.org/Core/Latest/faq.html#Q2.3
>>
>>Thanks.
>>
>>Andreas Kahari wrote:
>>
>>    
>>
>>>On Fri, Feb 20, 2004 at 10:14:40AM -0500, Vince Forgetta wrote:
>>>[cut]
>>>
>>>
>>>      
>>>
>>>>------------- EXCEPTION  -------------
>>>>MSG: acc does not exist
>>>>STACK Bio::DB::WebDBSeqI::get_Seq_by_acc
>>>>/usr/lib/perl5/site_perl/5.8.0/Bio/DB/WebDBSeqI.pm:177
>>>>
>>>>My code is:
>>>>
>>>>  use Bio::DB::RefSeq;
>>>>  $gb = new Bio::DB::RefSeq;
>>>>  my $seq;
>>>>  $seq = $gb->get_Seq_by_acc('$accession');
>>>>
>>>>
>>>>        
>>>>
>>>Perl does not interpolate variables in within single quotes.
>>>You probably want to say
>>>
>>>   $seq = $gb->get_Seq_by_acc($accession);
>>>
>>>
>>>Cheers,
>>>Andreas
>>>
>>>
>>>
>>>      
>>>
>>
>>    
>>
>
>--
>Jason Stajich
>Duke University
>jason at cgt.mc.duke.edu
>
>  
>


-- 
Vincenzo Forgetta

 Computational Biology
  McGill University and Genome Quebec Innovation Centre
  740 Dr. Penfield Avenue Room 7211
  Montreal, Quebec Canada, H3A 1A4

  Tel: 514-398-3311 00476 
  Email: vince.forgetta@staff.mcgill.ca


From cjfields at uiuc.edu  Fri Feb 20 13:39:03 2004
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri Feb 20 13:45:21 2004
Subject: [Bioperl-l] Version 1.4 for Windows
In-Reply-To: <GAEDKMGOKFBLJPKCLKCCOEJPDHAA.brian_osborne@cognia.com>
References: <GAEDKMGOKFBLJPKCLKCCOEJPDHAA.brian_osborne@cognia.com>
Message-ID: <6.0.0.22.2.20040220122738.01bd1480@express.cites.uiuc.edu>

There are two additional dependencies listed (HTML-Entities and IO-Scalar) 
that PPM 3.1 can't locate, although they are part of HTML-Parser and 
IO-stringy, listed as separate dependencies.  I think this is confusing PPM.

When using PPM, typing "install bioperl" only installs ver. 1.2.  Asking it 
to "install Bioperl-1.4" fails b/c it can't find the two dependencies 
listed above.  Any workarounds?

Chris

At 07:04 AM 2/20/2004, you wrote:
>O|B|F - Open Bioinformatics Foundation Update: Bioperl 1.4 for Windows
>
>                            February 20, 2004
>
>
>------------------------------------------------------------------------
>
>http://news.open-bio.org/archives/2004_02.html#000068
>
>
>
>------------------------------------------------------------------------
>
>Bioperl version 1.4 for Windows is available. Thanks once again
>to Nigam Shah for creating and testing the PPM and PPD files.
>
>--
>
>
>
>_______________________________________________
>Bioperl-l mailing list
>Bioperl-l@portal.open-bio.org
>http://portal.open-bio.org/mailman/listinfo/bioperl-l

__________________________________

Chris Fields - Postdoctoral Researcher
Lab of Dr. Robert Switzer

Address:

University of Illinois at Urbana-Champaign
Dept. of Biochemistry - 323 RAL
600 S. Mathews Ave.
Urbana, IL 61801

Phone : (217) 333-7098
Fax : (217) 244-5858 

From sjmiller at email.arizona.edu  Fri Feb 20 14:45:01 2004
From: sjmiller at email.arizona.edu (Susan J. Miller)
Date: Fri Feb 20 14:51:13 2004
Subject: [Bioperl-l] Problem with Bio::Factor::EMBOSS
Message-ID: <403663BD.8070301@email.arizona.edu>

We just installed bioperl-run-1.4 (on a sun4u sparc SUNW,Ultra-4 running 
Solaris8), and I am not able to pass the value zero as a parameter to 
the EMBOSS tools - when I do I get an error message saying "MISSING 
MANDATORY ATTRIBUTE".  I've tried a couple different EMBOSS programs, 
and the result is the same. I've also tried various forms of zero (0, 
'0', a variable containing zero)...

My code:
==========================================================================
use Bio::Factory::EMBOSS;

@files = glob("*.fasta");
foreach $f (@files) {
     $emb_fac = Bio::Factory::EMBOSS->new(-verbose => 1);

     $rev = $emb_fac->program('revseq');
     $rev->run({'-sequence' => "$f", -outseq => "SE1",
	      -sbegin1 => '0', -send1 => '100'}); # '0' gives error!

     $mut = $emb_fac->program('msbar');
     $mut->run({'-sequence' => "$f", '-count' => '100',
	       '-point' => '1', '-block' => '1',
	       '-codon' => 0,
	       '-outseq' => "$f.mut"});
}
==========================================================================
Both revseq and msbar work if I pass a non-zero value.  With 0, here is 
the verbose output:
==========================================================================
$VAR1 = {
           '-outseq' => 'SE1',
           '-send1' => '100',
           '-sbegin1' => '0',
           '-sequence' => 'Seq1.cgi'
         };
Input attr: outseq => SE1
Input attr: send1 => 100
Input attr: sbegin1 => 0
Input attr: sequence => Seq1.cgi
Command line: revseq  -outseq SE1 -send1 100 -sbegin1 -sequence Seq1.cgi 
-auto
Died: value required for -sbegin1

$VAR1 = {
           '-outseq' => 'Seq1.cgi.mut',
           '-count' => '100',
           '-codon' => 0,
           '-sequence' => 'Seq1.cgi',
           '-block' => '1',
           '-point' => '1'
         };
Input attr: outseq => Seq1.cgi.mut
Input attr: count => 100
Input attr: codon => 0
Input attr: sequence => Seq1.cgi
Input attr: block => 1
Input attr: point => 1
--------------------------------------
MISSING MANDATORY ATTRIBUTE: -codon
--------------------------------------
$VAR1 = {
           'category' => 'mandatory',
           'values' => '0(None)1(Any of the 
following)2(Insertions)3(Deletions)4(Changes)5(Duplications)6(Moves)',
           'descr' => 'Types of codon mutations to perform. These are 
only done if the sequence is nucleic.',
           'unnamed' => 0,
           'default' => '0'
         };

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Program msbar needs attribute [-codon]!

STACK: Error::throw
STACK: Bio::Root::Root::throw 
/usr/local/lib/perl5/site_perl/5.6.0/Bio/Root/Root.pm:342
STACK: Bio::Tools::Run::EMBOSSApplication::run 
/usr/local/lib/perl5/site_perl/5.6.0/Bio/Tools/Run/EMBOSSApplication.pm:229
STACK: ./ex4b.pl:28
-----------------------------------------------------------
==========================================================================


Is this a bug?  Is there a work-around?

-- 
Thanks,
-susan

Susan J. Miller
Biotechnology Computing Facility
Arizona Research Laboratories
Bio West 228
University of Arizona
Tucson, AZ  85721
(520) 626-2597

From jason at cgt.duhs.duke.edu  Fri Feb 20 14:53:58 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Fri Feb 20 15:00:14 2004
Subject: [Bioperl-l] Problem with Bio::Factor::EMBOSS
In-Reply-To: <403663BD.8070301@email.arizona.edu>
References: <403663BD.8070301@email.arizona.edu>
Message-ID: <Pine.LNX.4.50.0402201451190.24456-100000@tenero.duhs.duke.edu>

Presumably changing line 226 to
+	    unless (defined $input->{$attr}) {
from
-	    unless ( $input->{$attr}) {

will fix it. Can you let us know if it does?
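
For anyone wondering why the one-word change matters: 0 and '0' are false
in Perl, so a plain truth test cannot tell "zero was passed" from "nothing
was passed", while defined() can. A core-Perl sketch with a hypothetical
attribute hash modelled on the one in the report (the helper names are
made up for this example):

```perl
use strict;
use warnings;

# Hypothetical attribute hash, modelled on the verbose output in the report.
my %input = ('-codon' => 0, '-sbegin1' => '0', '-count' => '100');

# Truth test: treats 0 and '0' as if the attribute were missing.
sub missing_by_truth {
    my ($in, $attr) = @_;
    return $in->{$attr} ? 0 : 1;
}

# defined() test: only an undefined or absent value counts as missing.
sub missing_by_defined {
    my ($in, $attr) = @_;
    return defined($in->{$attr}) ? 0 : 1;
}

for my $attr ('-codon', '-sbegin1', '-count', '-outseq') {
    printf "%-9s truth-test missing=%d  defined-test missing=%d\n",
        $attr,
        missing_by_truth(\%input, $attr),
        missing_by_defined(\%input, $attr);
}
```

The revseq output ("-sbegin1" with no value on the command line) suggests
the zero may also be dropped where the command line is built, which might
need the same kind of change; that is a guess from the verbose output and
has not been checked against the module source.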

-jason
On Fri, 20 Feb 2004, Susan J. Miller wrote:

> We just installed bioperl-run-1.4 (on a sun4u sparc SUNW,Ultra-4 running
> Solaris8), and I am not able to pass the value zero as a parameter to
> the EMBOSS tools - when I do I get an error message saying "MISSING
> MANDATORY ATTRIBUTE".  I've tried a couple different EMBOSS programs,
> and the result is the same. I've also tried various forms of zero (0,
> '0', a variable contining zero)...
>
> My code:
> ==========================================================================
> use Bio::Factory::EMBOSS;
>
> @files = glob("*.fasta");
> foreach $f (@files) {
>      $emb_fac = Bio::Factory::EMBOSS->new(-verbose => 1);
>
>      $rev = $emb_fac->program('revseq');
>      $rev->run({'-sequence' => "$f", -outseq => "SE1",
> 	      -sbegin1 => '0', -send1 => '100'}); # '0' gives error!
>
>      $mut = $emb_fac->program('msbar');
>      $mut->run({'-sequence' => "$f", '-count' => '100',
> 	       '-point' => '1', '-block' => '1',
> 	       '-codon' => 0,
> 	       '-outseq' => "$f.mut"});
> }
> ==========================================================================
> Both revseq and msbar work if I pass a non-zero value.  With 0, here is
> the verbose output:
> ==========================================================================
> $VAR1 = {
>            '-outseq' => 'SE1',
>            '-send1' => '100',
>            '-sbegin1' => '0',
>            '-sequence' => 'Seq1.cgi'
>          };
> Input attr: outseq => SE1
> Input attr: send1 => 100
> Input attr: sbegin1 => 0
> Input attr: sequence => Seq1.cgi
> Command line: revseq  -outseq SE1 -send1 100 -sbegin1 -sequence Seq1.cgi
> -auto
> Died: value required for -sbegin1
>
> $VAR1 = {
>            '-outseq' => 'Seq1.cgi.mut',
>            '-count' => '100',
>            '-codon' => 0,
>            '-sequence' => 'Seq1.cgi',
>            '-block' => '1',
>            '-point' => '1'
>          };
> Input attr: outseq => Seq1.cgi.mut
> Input attr: count => 100
> Input attr: codon => 0
> Input attr: sequence => Seq1.cgi
> Input attr: block => 1
> Input attr: point => 1
> --------------------------------------
> MISSING MANDATORY ATTRIBUTE: -codon
> --------------------------------------
> $VAR1 = {
>            'category' => 'mandatory',
>            'values' => '0(None)1(Any of the
> following)2(Insertions)3(Deletions)4(Changes)5(Duplications)6(Moves)',
>            'descr' => 'Types of codon mutations to perform. These are
> only done if the sequence is nucleic.',
>            'unnamed' => 0,
>            'default' => '0'
>          };
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: Program msbar needs attribute [-codon]!
>
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /usr/local/lib/perl5/site_perl/5.6.0/Bio/Root/Root.pm:342
> STACK: Bio::Tools::Run::EMBOSSApplication::run
> /usr/local/lib/perl5/site_perl/5.6.0/Bio/Tools/Run/EMBOSSApplication.pm:229
> STACK: ./ex4b.pl:28
> -----------------------------------------------------------
> ==========================================================================
>
>
> Is this a bug?  Is there a work-around?
>
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From shawnh at stanford.edu  Fri Feb 20 14:58:21 2004
From: shawnh at stanford.edu (Shawn Hoon)
Date: Fri Feb 20 15:04:32 2004
Subject: [Bioperl-l] quick question
Message-ID: <22541E64-63DF-11D8-A0E8-000A95783436@stanford.edu>

Anybody have a quick way of parsing concatenated clustalw files?
I could split the files up but wonder if anybody had a quick solution.
I don't think the AlignIO parser handles this.

shawn

From jason at cgt.duhs.duke.edu  Fri Feb 20 15:10:25 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Fri Feb 20 15:16:39 2004
Subject: [Bioperl-l] quick question
In-Reply-To: <22541E64-63DF-11D8-A0E8-000A95783436@stanford.edu>
References: <22541E64-63DF-11D8-A0E8-000A95783436@stanford.edu>
Message-ID: <Pine.LNX.4.50.0402201509070.24456-100000@tenero.duhs.duke.edu>

If there is a ClustalW header separating the concatenated alignments,
AlignIO is supposed to handle it.

We might want to change the code below in AlignIO::clustalw to ignore
blank lines...

    my $first_line;
    if( defined ($first_line  = $self->_readline )
        && $first_line !~ /CLUSTAL/ ) {
        $self->warn("trying to parse a file which does not start with a CLUSTAL header");
    }
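
In the meantime, a possible workaround is to split the concatenated text on
the header lines before parsing. The following is a minimal core-Perl
sketch, not a change to AlignIO: split_clustalw() is a hypothetical helper,
the sample data is invented, and it assumes every alignment begins with a
line starting with "CLUSTAL".

```perl
use strict;
use warnings;

# Split concatenated ClustalW text into one string per alignment.
# Assumes each alignment starts with a line beginning "CLUSTAL"; anything
# before the first header (for example blank lines) is discarded.
sub split_clustalw {
    my ($text) = @_;
    my @chunks;
    for my $line (split /^/m, $text) {    # keep the line endings
        push @chunks, '' if $line =~ /^CLUSTAL/;   # start a new alignment
        next unless @chunks;              # skip lines before the first header
        $chunks[-1] .= $line;
    }
    return @chunks;
}

# Invented sample input: two tiny alignments with a leading blank line.
my $text = <<'END';

CLUSTAL W (1.83) multiple sequence alignment

seq1  ACGT
seq2  ACGA

CLUSTAL W (1.83) multiple sequence alignment

seq3  TTGA
seq4  TTGC
END

my @alignments = split_clustalw($text);
printf "Found %d alignments\n", scalar @alignments;
```

Each chunk could then be handed to Bio::AlignIO through a string
filehandle (for instance via IO::String and the -fh argument), though I
have not tried that combination here.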

On Fri, 20 Feb 2004, Shawn Hoon wrote:

> Anybody have a quick way of parsing concatenated clustalw files?
> I could split the files up but wonder if anybody had a quick solution.
> I don't think the AlignIO parse seems to handle this.
>
> shawn
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From shawnh at stanford.edu  Fri Feb 20 18:00:31 2004
From: shawnh at stanford.edu (Shawn Hoon)
Date: Fri Feb 20 18:06:42 2004
Subject: [Bioperl-l] quick question
In-Reply-To: <Pine.LNX.4.50.0402201509070.24456-100000@tenero.duhs.duke.edu>
References: <22541E64-63DF-11D8-A0E8-000A95783436@stanford.edu>
	<Pine.LNX.4.50.0402201509070.24456-100000@tenero.duhs.duke.edu>
Message-ID: <954193CE-63F8-11D8-A0E8-000A95783436@stanford.edu>


On Feb 20, 2004, at 12:10 PM, Jason Stajich wrote:

> if there is a clustalw header separating the concatenated alignments
> AlignIO is supposed to handle it.
>

Maybe I'm wrong, but there is nothing in the clustalw module that breaks
out when it sees a header, so it keeps going.
In any case, I have committed a fix that seems to work.
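
The idea of breaking at each header can be sketched in plain Perl as follows. This is not the committed fix; split_clustalw is a hypothetical helper that assumes every alignment starts with a line beginning with "CLUSTAL".

```perl
use strict;
use warnings;

# Hypothetical helper (not the committed fix): split the text of several
# concatenated ClustalW alignments into one string per alignment, starting
# a new chunk whenever a line begins with "CLUSTAL".
sub split_clustalw {
    my ($text) = @_;
    my @chunks;
    for my $line ( split /^/, $text ) {
        if ( $line =~ /^CLUSTAL/ ) {
            push @chunks, '';    # header seen: start a new alignment
        }
        next unless @chunks;     # ignore anything before the first header
        $chunks[-1] .= $line;
    }
    return @chunks;
}

my $concatenated = join '',
    "CLUSTAL W (1.83) multiple sequence alignment\n\n",
    "a  ACGT\nb  ACGA\n\n",
    "CLUSTAL W (1.83) multiple sequence alignment\n\n",
    "c  TTGA\nd  TTGC\n";

my @alignments = split_clustalw($concatenated);
printf "%d alignments found\n", scalar @alignments;
```

Each chunk could then be parsed on its own, for instance through a filehandle handed to Bio::AlignIO with -fh.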

thanks

shawn

> We might want to change the code below in AlignIO::clustalw to ignore
> blank lines...
>
>     my $first_line;
>     if( defined ($first_line  = $self->_readline )
>         && $first_line !~ /CLUSTAL/ ) {
>         $self->warn("trying to parse a file which does not start with a CLUSTAL header");
>     }
>
> On Fri, 20 Feb 2004, Shawn Hoon wrote:
>
>> Anybody have a quick way of parsing concatenated clustalw files?
>> I could split the files up but wonder if anybody had a quick solution.
>> I don't think the AlignIO parse seems to handle this.
>>
>> shawn
>>
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu

From kvddrift at earthlink.net  Sat Feb 21 08:09:14 2004
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Sat Feb 21 08:15:30 2004
Subject: [Bioperl-l] bioperl on Mac OS X
Message-ID: <25A0D726-646F-11D8-B78C-003065A5FDCC@earthlink.net>

Hi,

I am pleased to announce that the current version of bioperl (1.4) is 
now available for Mac OS X users who use fink. Currently it is only 
available for 10.3, and the unstable tree of fink needs to be enabled.

I have added as many dependencies as possible, so you should be able to 
use almost all features of bioperl. The only modules that I left out 
are SVG::Graph and AceDB, which are not available with fink. I also did 
not include mysql support, which is a big package and only used in a 
few cases. However, you can always install dbd-mysql-pm through fink 
first, and then bioperl, if you need this feature.

I would appreciate any comments (positive or negative).


thanks,


- Koen.

From birney at ebi.ac.uk  Sat Feb 21 14:37:22 2004
From: birney at ebi.ac.uk (Ewan Birney)
Date: Sat Feb 21 14:43:28 2004
Subject: [Bioperl-l] bioperl on Mac OS X
In-Reply-To: <25A0D726-646F-11D8-B78C-003065A5FDCC@earthlink.net>
Message-ID: <Pine.OSX.4.44.0402211937110.1477-100000@ewan-birneys-computer.local>


Cool. Many thanks Koen.


From sbassi at asalup.org  Sat Feb 21 22:13:32 2004
From: sbassi at asalup.org (Sebastian Bassi)
Date: Sat Feb 21 22:20:15 2004
Subject: [Bioperl-l] Tm calculation
In-Reply-To: <4036870F.5070402@genetics.utah.edu>
References: <4035700A.3050302@asalup.org> <4036870F.5070402@genetics.utah.edu>
Message-ID: <40381E5C.5040900@asalup.org>

Barry Moore wrote:

> Sebastian,
> I would say that your first oligo (AAACCCTAGGGTTT) is complementary, but 

I've been reading the paper and searching the net for an implementation 
of the SantaLucia formulae. I found two interesting things:
1- SantaLucia's lab has a webpage with a Tm calculator server 
(http://ozone2.chem.wayne.edu/Hyther/hytherm1main.html). The code can't 
be accessed since it's a server-side CGI script.
2- I found another web page that returns ALMOST the same results as the 
SantaLucia page (I think the very small difference is just due to 
rounding errors). This one has its code in JS, so the code is 
available.
Take a look here:
http://www.promega.com/biomath/calc11.htm
The code is here:
http://www.promega.com/biomath/oligotm.js
If this implementation is OK, we could just translate it to perl/python.
I am working on that right now.
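
Nearest-neighbor Tm calculations generally treat self-complementary oligos differently from other duplexes, so a Perl translation would probably need a check like the one sketched here. This is an assumption about what the port needs, not code taken from either web page; revcom and is_self_complementary are made-up helper names.

```perl
use strict;
use warnings;

# Reverse complement of a DNA string (A/C/G/T only, case preserved).
sub revcom {
    my ($seq) = @_;
    ( my $rc = scalar reverse $seq ) =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# An oligo is self-complementary when it equals its own reverse complement.
sub is_self_complementary {
    my ($seq) = @_;
    return uc($seq) eq uc( revcom($seq) ) ? 1 : 0;
}

for my $oligo (qw(AAACCCTAGGGTTT AAACCCTAGGGTTA)) {
    printf "%s: %s\n", $oligo,
        is_self_complementary($oligo)
        ? 'self-complementary'
        : 'not self-complementary';
}
```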

-- 
Best regards,

//=\ Sebastian Bassi - Diplomado en Ciencia y Tecnologia, UNQ   //=\
\=// IT Manager Advanta Seeds - Balcarce Research Center -      \=//
//=\ Pro secretario ASALUP - www.asalup.org - PGP key available //=\
\=// E-mail: sbassi@genesdigitales.com - ICQ UIN: 3356556 -     \=//

                 http://Bioinformatica.info

From parvesh at pacific.net.sg  Sun Feb 22 04:00:44 2004
From: parvesh at pacific.net.sg (Parvesh)
Date: Sun Feb 22 13:54:32 2004
Subject: [Bioperl-l] help with Bioperl
Message-ID: <001301c3f922$5b812360$f55018d2@yourr64slkwmas>

Hi All,
Is there a method in Bioperl to map the amino acid exon structure to the genomic sequence?

Could you help me locate it and explain how to use it? Thanks very much for your help.

Best wishes
parvesh
From jason at cgt.duhs.duke.edu  Sun Feb 22 14:52:10 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Sun Feb 22 14:58:24 2004
Subject: [Bioperl-l] aa -> dna alignment (was: help with Bioperl)
In-Reply-To: <001301c3f922$5b812360$f55018d2@yourr64slkwmas>
References: <001301c3f922$5b812360$f55018d2@yourr64slkwmas>
Message-ID: <Pine.LNX.4.50.0402221353440.16927-100000@tenero.duhs.duke.edu>

Not directly. You can use genewise or Guy Slater's exonerate with the
protein2dna model and then use Bioperl parsers to parse these reports.

You can also use BLAT, TBLASTN, or TFASTY, but you may have to clean up
these alignments somewhat.

Genewise should give the most accurate alignments, but it can be slow if
you don't already know roughly where your gene should land in the genomic
sequence.

-jason
On Sun, 22 Feb 2004, Parvesh wrote:

> Hi All, IS there a method in Bioperl to map the amino acid exon
> structure to the genomic sequence?
>
> Could you help me to locate this and help me to explain how to use it ? Thanks very much for your help.
>
> Best wishes
> parvesh
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From bazin at univ-montp2.fr  Mon Feb 23 05:20:31 2004
From: bazin at univ-montp2.fr (Eric Bazin)
Date: Mon Feb 23 05:26:09 2004
Subject: [Bioperl-l] BioQuery failure
Message-ID: <4039D3EF.9000100@univ-montp2.fr>

Hi,

I discovered bioperl-db a few days ago and I'm very enthusiastic about this
tool, but I've got a problem using BioQuery. I would be grateful if
anybody could give me an answer.

This is a piece of my code:

my $db = Bio::DB::BioDB->new(-database => "biosql",
				 -host     => $host,
				 -dbname   => $dbname,
				 -driver   => $driver,
				 -user     => $dbuser,
				 -pass     => $dbpass,
				 -verbose  => 10,
				 );			
my $query = Bio::DB::Query::BioQuery->new(
	-datacollections=>["Bio::SeqI seq",
		"Bio::Annotation::Reference ref",
		"Bio::Annotation::Reference<=>Bio::SeqI"
		],
	-select=>["ref.authors"],
	-where=>["and", "seq.accession_number='AJ311144'",
		"seq.display_id='AAG311144'"]
	);

$query->flag("distinct", 1);

my $adaptor = $db->get_object_adaptor("Bio::Annotation::Reference");
my @tab = $adaptor->find_by_query($query);

I receive this error message:

------------- EXCEPTION  -------------
MSG: slot 'accession' not mapped to column for table bioentry
STACK Bio::DB::Query::BioQuery::_map_slot_to_col
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:487
STACK Bio::DB::Query::BioQuery::_map_constraint_slots_to_columns
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:369
STACK Bio::DB::Query::BioQuery::_map_constraint_slots_to_columns
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:356
STACK Bio::DB::Query::BioQuery::translate_query
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:305
STACK Bio::DB::BioSQL::BaseDriver::translate_query
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BaseDriver.pm:1182
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_query
/usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1198
STACK (eval) /var/www/cgi-bin/getentry.pl:97
STACK toplevel /var/www/cgi-bin/getentry.pl:67

-- 
Eric Bazin
Laboratoire "Génome Populations Interactions Adaptation"
UM2 - IFREMER - CNRS  UMR 5171
Université de Montpellier 2
C.C. 63,  bâtiment 24 ;34095 Montpellier Cedex 5
Tel:(0)4-67-14-39-13
Tel perso:(0)6-20-91-49-62
Fax:(0)4-67-14-45-54
http://www.univ-montp2.fr/~genetix
Seminaires internes: http://162.38.181.25/seminaire.html


From Xiaoying.Lin at celera.com  Mon Feb 23 09:09:53 2004
From: Xiaoying.Lin at celera.com (Lin, Xiaoying)
Date: Mon Feb 23 09:15:59 2004
Subject: [Bioperl-l] help with Bioperl
Message-ID: <B97FA25EDA418049A146320ADFE65506180F57@celmrkv2.rkv.ad.celera.com>

There is a non-Bioperl solution, a package called AAT by Huang et al.
You can access his server at 

http://deepc2.zool.iastate.edu/aat/aat/aat.html

I do not think there is a Bioperl parser for it yet.

In my hands it does a better job of getting the exon structure right than
other similar tools, especially for genes with repetitive sequence and
tandem gene clusters. It is generally several-fold faster, but
this should be taken with a grain of salt, since the default parameters
used by the different programs are not the same.

-Xiaoying 

> -----Original Message-----
> From: Parvesh [mailto:parvesh@pacific.net.sg] 
> Sent: Sunday, February 22, 2004 4:01 AM
> To: bioperl-l@bioperl.org
> Subject: [Bioperl-l] help with Bioperl
> 
> 
> Hi All,
> IS there a method in Bioperl to map the amino acid exon 
> structure to the genomic sequence?
> 
> Could you help me to locate this and help me to explain how 
> to use it ? Thanks very much for your help.
> 
> Best wishes
> parvesh
> 

From MEC at Stowers-Institute.org  Mon Feb 23 12:51:04 2004
From: MEC at Stowers-Institute.org (Cook, Malcolm)
Date: Mon Feb 23 12:57:13 2004
Subject: [Bioperl-l] Bio::Tools::GFF  use of seqname
Message-ID: <CED81D34E37D5043A1211565277A51E501491987@exchkc02.stowers-institute.org>

Dear Matthew, Ewan, et al.,

I see in three places in Bio::Tools::GFF the following:
 
  if( $feat->can('seqname') ) {
       $name = $feat->seq_id();
       $name ||= 'SEQ';
   } else {
       $name = 'SEQ';
   }

However, in Bio::SeqFeature::Generic we learn that

	$self->warn("-seqname is deprecated. Please use -seq_id instead.");

So, should we rewrite those fragments in Bio::Tools::GFF as:
 $name = $feat->seq_id() || 'SEQ';

??

Thanks,

Malcolm Cook
Database Applications Manager, Bioinformatics
Stowers Institute for Medical Research 

From pst at ksu.edu  Mon Feb 23 12:58:36 2004
From: pst at ksu.edu (Paul St. Amand)
Date: Mon Feb 23 13:06:25 2004
Subject: [Bioperl-l] Help with reversing a sequence
Message-ID: <E6F964B7-6629-11D8-B2C9-0003938893E4@ksu.edu>

Hi,

I am using the following script to get a subsequence and reverse it. 
Note that I do NOT want the "reverse complement" of the sequence here, 
just the actual reverse. BioPerl has a method to get the revcom of a 
seq, such as:

print $outputfh "Reverse complemented sequence 5 to 10  is ",$seqobj->trunc(5,10)->revcom->seq, "  \n";

Does BioPerl have a similar/better way to get the reverse (not revcom) 
of a sequence?

This is how I am doing it and it is slow. Is there a way that is faster 
or "better" using BioPerl???


use strict;
use warnings;
use Bio::SeqIO;
my $outputfh = *STDOUT;

     my ($infile, $in, $out, $seqobj);
     $infile = shift or die;

     $in  = Bio::SeqIO->new('-file' => $infile ,
                            '-format' => 'Fasta');
     $seqobj = $in->next_seq();

     $out = Bio::SeqIO->newFh('-format'   => 'fasta',
			     '-noclose'  => 1,
			     '-fh'       => $outputfh);

print $outputfh ">MyReversedSeq29856-29862\n",scalar 
reverse($seqobj->subseq(29856,29862)),"\n";


Thanks,
Paul

From pst at ksu.edu  Mon Feb 23 13:07:36 2004
From: pst at ksu.edu (Paul St. Amand)
Date: Mon Feb 23 13:15:24 2004
Subject: [Bioperl-l] bioperl on Mac OS X
Message-ID: <28EF51BF-662B-11D8-B2C9-0003938893E4@ksu.edu>

BioPerl on MacOSX has been great for me. I am just starting out and do 
not know perl at all, but with fink and your porting work on BioPerl, I 
can do some really useful stuff.

Many thanks!
Paul

From hlapp at gnf.org  Mon Feb 23 13:13:34 2004
From: hlapp at gnf.org (Hilmar Lapp)
Date: Mon Feb 23 13:19:34 2004
Subject: [Bioperl-l] BioQuery failure
In-Reply-To: <4039D3EF.9000100@univ-montp2.fr>
Message-ID: <FE9E1DF1-662B-11D8-A55D-000A959EB4C4@gnf.org>

First off, I have no clue where the code is taking the column accession  
from, since you give the correct attribute name accession_number.

For the rest see below.

On Monday, February 23, 2004, at 02:20  AM, Eric Bazin wrote:

> Hi,
>
> I discovered bioperl-db few days ago and i'm very enthusiatic using  
> this
> tool but i've got a problem using BioQuery. I would be grateful if
> anybody can give me an answer about that.
>
> This a piece of my code:
>
> my $db = Bio::DB::BioDB->new(-database => "biosql",
> 				 -host     => $host,
> 				 -dbname   => $dbname,
> 				 -driver   => $driver,
> 				 -user     => $dbuser,
> 				 -pass     => $dbpass,
> 				 -verbose  => 10,
> 				 );			
> my $query = Bio::DB::Query::BioQuery->new(
> 	-datacollections=>["Bio::SeqI seq",
> 		"Bio::Annotation::Reference ref",
> 		"Bio::Annotation::Reference<=>Bio::SeqI"
> 		],
> 	-select=>["ref.authors"],

Note that the -select parameter or setting will be ignored, since the  
adaptors need to have control over the select list in order to be able  
to build objects.


> 	-where=>["and", "seq.accession_number='AJ311144'",
> 		"seq.display_id='AAG311144'"]
> 	);
>
> $query->flag("distinct", 1);
>
> my $adaptor = $db->get_object_adaptor("Bio::Annotation::Reference");
> my @tab = $adaptor->find_by_query($query);

Note that find_by_query() returns an object to you (a  
Bio::DB::Query::QueryResultI-compliant instance), which is basically an  
iterator over the result set (call $query_result->next_object() until  
it returns undef).
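
The iteration pattern looks roughly like this. The stub class below only stands in for the query result so that the loop runs without a database; next_object() is the method named above, while My::StubResult and its constructor are hypothetical.

```perl
use strict;
use warnings;

# Stand-in for the query result so the loop can run without a database.
# Only next_object() comes from the message above; the class name and
# constructor are hypothetical.
package My::StubResult;
sub new { my ( $class, @objects ) = @_; return bless { queue => [@objects] }, $class }
sub next_object { my ($self) = @_; return shift @{ $self->{queue} } }

package main;

my $query_result = My::StubResult->new( 'ref 1', 'ref 2', 'ref 3' );

# Keep asking for the next object until the iterator returns undef.
my @refs;
while ( defined( my $obj = $query_result->next_object ) ) {
    push @refs, $obj;
}
print scalar(@refs), " objects collected\n";
```

With bioperl-db, $query_result would instead be whatever $adaptor->find_by_query($query) returns.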

>
> I receive this error message:
>
> ------------- EXCEPTION  -------------
> MSG: slot 'accession' not mapped to column for table bioentry

As I said, I have no clue how you might get here. First off, to exclude  
the obvious: you did obtain the latest revision from CVS, right? Also,  
did the test suite that comes with bioperl-db pass all tests?

If your answer is yes to both of the questions above, we need to get  
more verbose debugging output. Insert the following statement after you  
obtain the $db handle:

	$db->verbose(1);

Then run the code again, capture the output in a file, and send it to  
me.

	-hilmar

> STACK Bio::DB::Query::BioQuery::_map_slot_to_col
> /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:487
> STACK Bio::DB::Query::BioQuery::_map_constraint_slots_to_columns
> /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:369
> STACK Bio::DB::Query::BioQuery::_map_constraint_slots_to_columns
> /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:356
> STACK Bio::DB::Query::BioQuery::translate_query
> /usr/lib/perl5/site_perl/5.8.0/Bio/DB/Query/BioQuery.pm:305
> STACK Bio::DB::BioSQL::BaseDriver::translate_query
> /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BaseDriver.pm:1182
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_query
> /usr/lib/perl5/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1198
> STACK (eval) /var/www/cgi-bin/getentry.pl:97
> STACK toplevel /var/www/cgi-bin/getentry.pl:67
>
> -- 
> Eric Bazin
> Laboratoire "Génome Populations Interactions Adaptation"
> UM2 - IFREMER - CNRS  UMR 5171
> Université de Montpellier 2
> C.C. 63,  bâtiment 24 ;34095 Montpellier Cedex 5
> Tel:(0)4-67-14-39-13
> Tel perso:(0)6-20-91-49-62
> Fax:(0)4-67-14-45-54
> http://www.univ-montp2.fr/~genetix
> Seminaires internes: http://162.38.181.25/seminaire.html
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From hlapp at gmx.net  Mon Feb 23 13:19:28 2004
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon Feb 23 13:25:25 2004
Subject: [Bioperl-l] Bio::Tools::GFF  use of seqname
In-Reply-To: <CED81D34E37D5043A1211565277A51E501491987@exchkc02.stowers-institute.org>
Message-ID: <D15BED4C-662C-11D8-A55D-000A959EB4C4@gmx.net>

You mean replace can('seqname') by can('seq_id')?

Actually, $feat->can('seq_id') must be true at all times iff 
$feat->isa("Bio::SeqFeatureI"), so it's kind of superfluous to test for 
it.

	-hilmar

On Monday, February 23, 2004, at 09:51  AM, Cook, Malcolm wrote:

> Dear Matthew, Ewan, et al1
>
> I see in three places in  Bio::Tools::GFF the following:
>
>   if( $feat->can('seqname') ) {
>        $name = $feat->seq_id();
>        $name ||= 'SEQ';
>    } else {
>        $name = 'SEQ';
>    }
>
> However, in Bio::SeqFeature::Generic we learn that
>
> 	$self->warn("-seqname is deprecated. Please use -seq_id
> instead.");
>
> So, should we rewrite those fragments in Bio::Tools:GFF as:
>  $name = $feat->seq_id() || 'SEQ'
>
> ??
>
> Thanks,
>
> Malcolm Cook
> Database Applications Manager, Bioinformatics
> Stowers Institute for Medical Research
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From jason at cgt.duhs.duke.edu  Mon Feb 23 13:53:38 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Mon Feb 23 13:59:45 2004
Subject: [Bioperl-l] Help with reversing a sequence
In-Reply-To: <E6F964B7-6629-11D8-B2C9-0003938893E4@ksu.edu>
References: <E6F964B7-6629-11D8-B2C9-0003938893E4@ksu.edu>
Message-ID: <Pine.LNX.4.50.0402231349390.23639-100000@tenero.duhs.duke.edu>

The Perl reverse function is what you want; it does exactly that: it
reverses a string.  What is slow specifically?  Did you benchmark
something?
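
For comparison, here is plain reversal next to a reverse complement, using core Perl only. The seven-base string is a made-up stand-in for what $seqobj->subseq(29856,29862) would return.

```perl
use strict;
use warnings;

my $subseq = 'ATGCCGT';    # stand-in for $seqobj->subseq(29856,29862)

# Plain reversal: scalar context makes reverse() flip the characters.
my $reversed = scalar reverse $subseq;

# Reverse complement, for contrast: reverse, then swap complementary bases.
( my $revcom = scalar reverse $subseq ) =~ tr/ACGT/TGCA/;

print "original: $subseq\n";
print "reversed: $reversed\n";
print "revcom:   $revcom\n";
```

For a subsequence this short the reversal itself is unlikely to be the slow step; timing the file parsing separately (for example with the core Benchmark module) would show where the time goes.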

-jason
On Mon, 23 Feb 2004, Paul St. Amand wrote:

> Hi,
>
> I am using the following script to get a subsequence and reverse it.
> Note that I do NOT want the "reverse complement" of the sequence here,
> just the actual reverse. BioPerl has a method to get the revcom of a
> seq, such as:
>
> print $outputfh "Reverse complemented sequence 5 to 10  is
> ",$seqobj->trunc(5,10)->revcom->seq, "  \n";
>
> Does BioPerl have a similar/better way to get the reverse (not revcom)
> of a sequence?
>
> This is how I am doing it and it is slow. Is there a way that is faster
> or "better" using BioPerl???
>
>
> use strict;
> use warnings;
> use Bio::SeqIO;
> my $outputfh = *STDOUT;
>
>      my ($infile, $in, $out, $seqobj);
>      $infile = shift or die;
>
>      $in  = Bio::SeqIO->new('-file' => $infile ,
>                             '-format' => 'Fasta');
>      $seqobj = $in->next_seq();
>
>      $out = Bio::SeqIO->newFh('-format'   => 'fasta',
> 			     '-noclose'  => 1,
> 			     '-fh'       => $outputfh);
>
> print $outputfh ">MyReversedSeq29856-29862\n",scalar
> reverse($seqobj->subseq(29856,29862)),"\n";
>
>
>
>
>
> Thanks,
> Paul
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From MEC at Stowers-Institute.org  Mon Feb 23 14:53:03 2004
From: MEC at Stowers-Institute.org (Cook, Malcolm)
Date: Mon Feb 23 14:59:11 2004
Subject: [Bioperl-l] Bio::Tools::GFF  use of seqname
Message-ID: <CED81D34E37D5043A1211565277A51E501491993@exchkc02.stowers-institute.org>

Actually, I mean replace the 6 lines with the single line:

 $name = $feat->seq_id() || 'SEQ'
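
A quick way to see that the two forms agree is to run both against a stub feature. My::StubFeature is a hypothetical stand-in, not a BioPerl class, and the long form is written here with can('seq_id') instead of can('seqname').

```perl
use strict;
use warnings;

# Minimal stand-in for a feature object; only seq_id() matters here.
package My::StubFeature;
sub new    { my ( $class, $id ) = @_; return bless { seq_id => $id }, $class }
sub seq_id { return $_[0]{seq_id} }

package main;

# The six-line form, adapted to test can('seq_id').
sub name_long {
    my ($feat) = @_;
    my $name;
    if ( $feat->can('seq_id') ) {
        $name = $feat->seq_id();
        $name ||= 'SEQ';
    }
    else {
        $name = 'SEQ';
    }
    return $name;
}

# The proposed single-line form.
sub name_short {
    my ($feat) = @_;
    return $feat->seq_id() || 'SEQ';
}

for my $id ( 'chrI', undef, '' ) {
    my $feat = My::StubFeature->new($id);
    printf "%-7s long=%s short=%s\n", defined $id ? "'$id'" : 'undef',
        name_long($feat), name_short($feat);
}
```

Both forms also replace a seq_id of '0' with 'SEQ', since || tests truth rather than definedness.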

>-----Original Message-----
>From: Hilmar Lapp [mailto:hlapp@gmx.net] 
>Sent: Monday, February 23, 2004 12:19 PM
>To: Cook, Malcolm
>Cc: Bioperl; birney@sanger.ac.uk; mrp@sanger.ac.uk
>Subject: Re: [Bioperl-l] Bio::Tools::GFF use of seqname
>
>
>You mean replace can('seqname') by can('seq_id')?
>
>Actually, $feat->can('seq_id') must be true at all times iff 
>$feat->isa("Bio::SeqFeatureI"), so it's kind of superfluous to 
>test for 
>it.
>
>	-hilmar
>
>On Monday, February 23, 2004, at 09:51  AM, Cook, Malcolm wrote:
>
>> Dear Matthew, Ewan, et al1
>>
>> I see in three places in  Bio::Tools::GFF the following:
>>
>>   if( $feat->can('seqname') ) {
>>        $name = $feat->seq_id();
>>        $name ||= 'SEQ';
>>    } else {
>>        $name = 'SEQ';
>>    }
>>
>> However, in Bio::SeqFeature::Generic we learn that
>>
>> 	$self->warn("-seqname is deprecated. Please use -seq_id
>> instead.");
>>
>> So, should we rewrite those fragments in Bio::Tools:GFF as:
>>  $name = $feat->seq_id() || 'SEQ'
>>
>> ??
>>
>> Thanks,
>>
>> Malcolm Cook
>> Database Applications Manager, Bioinformatics
>> Stowers Institute for Medical Research
>>
>-- 
>-------------------------------------------------------------
>Hilmar Lapp                            email: lapp at gnf.org
>GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
>-------------------------------------------------------------
>
>
>

From MEC at Stowers-Institute.org  Mon Feb 23 14:56:36 2004
From: MEC at Stowers-Institute.org (Cook, Malcolm)
Date: Mon Feb 23 15:02:43 2004
Subject: [Bioperl-l] inferring exon features in
	SeqFeature::Tools::Unflattener
Message-ID: <CED81D34E37D5043A1211565277A51E501491994@exchkc02.stowers-institute.org>

Dear Chris and fellow Bioperlers,

I have made the following patch on my local version of this module in
order to provide values for the seq_id and the /locus_tag and /gene tags
of inferred exons.

Please advise if this is in the spirit of the module, and whether this
patch can be incorporated in the live version.

Thanks!
Malcolm Cook
Database Applications Manager, Bioinformatics
Stowers Institute for Medical Research
(816) 926-4449 


Index: Unflattener.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-live/Bio/SeqFeature/Tools/Unflattener.pm,v
retrieving revision 1.19
diff -c -r1.19 Unflattener.pm
*** Unflattener.pm 2003/12/16 22:31:16 1.19
--- Unflattener.pm 2004/02/23 19:45:32
***************
*** 2408,2413 ****
--- 2408,2424 ----
 
-primary_tag=>'exon');
        my $locstr = 'exon::'.$self->_locstr($subsf);
  
+       ## Provide seq_id to new feature:
+       $subsf->seq_id($sf->seq_id);
+       ## Transfer /locus_tag and /gene tag values to inferred
+       ## features.  TODO: Perhaps? this should not be done
+       ## indiscriminantly but rather by virtue of the setting
+       ## of group_tag.
+       foreach my $tag (grep /gene|locus_tag/, $sf->get_all_tags) {
+         my @vals = $sf->get_tag_values($tag);
+         $subsf->add_tag_value($tag, @vals);
+       }
+ 
        # re-use feature if type and location the same
        if ($loc_h{$locstr}) {
     $subsf = $loc_h{$locstr};
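
The tag-copying step can be tried out on plain hashes, as in the sketch below. The hashes and tag values are made up for illustration and stand in for the feature objects. One thing the sketch highlights: the unanchored pattern /gene|locus_tag/ in the patch also matches tag names such as gene_synonym, whereas an anchored pattern copies only the two tags named. Which behaviour is wanted is for the module's authors to decide.

```perl
use strict;
use warnings;

# Hypothetical plain-hash stand-ins for the parent feature's tags and the
# inferred exon's tags: tag name => array ref of values.
my %parent_tags = (
    gene         => ['abc-1'],
    locus_tag    => ['XY_0001'],
    gene_synonym => ['abc1'],
    note         => ['example only'],
);
my %exon_tags;

# Anchored pattern: copy only tags named exactly "gene" or "locus_tag".
for my $tag ( grep { /^(?:gene|locus_tag)$/ } sort keys %parent_tags ) {
    push @{ $exon_tags{$tag} }, @{ $parent_tags{$tag} };
}

# For comparison, the tag names the unanchored pattern would select.
my @unanchored = grep { /gene|locus_tag/ } sort keys %parent_tags;

print 'anchored:   ', join( ', ', sort keys %exon_tags ), "\n";
print 'unanchored: ', join( ', ', @unanchored ), "\n";
```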

From amackey at pcbi.upenn.edu  Mon Feb 23 16:10:49 2004
From: amackey at pcbi.upenn.edu (Aaron J. Mackey)
Date: Mon Feb 23 16:16:59 2004
Subject: [Bioperl-l] StandAloneBlast.pm,
	bl2seq() and tempfiles on Win32/cygwin
Message-ID: <C165E4B8-6644-11D8-9A28-000A958C5008@pcbi.upenn.edu>


A colleague of mine is frustrated by attempting to use 
Bio::Tools::Run::StandAloneBlast to run bl2seq (Perl 5.8.2, bioperl 
1.4, windows XP, CygWin, etc.):

# synopsis:
$seq1 = $seqio->next_seq;
$seq2 = $seqio->next_seq;
$factory->bl2seq($seq1, $seq2);

StandAloneBlast successfully writes two temp files in /tmp, which have 
the sequence data and can be read by "less" or "cat" in another open 
window (with the main program suspended in debugger); however, if 
either the program code or I at the command line attempt to run bl2seq, 
it dies with "Cannot open file /tmp/7aasd78asd".  If I "cp" the temp 
files into new files, it runs fine.  Or, if I call 
$factory->bl2seq($file1, $file2) with filenames instead of seq objects, 
it also works fine.  I have tried various incarnations of closing the 
filehandles and Bio::SeqIO objects that StandAloneBlast.pm is using to 
generate these temp files, but to no avail (and of course, the 
tempfiles disappear upon program completion).
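
Since passing filenames works in this setup, one stopgap is to write the FASTA files yourself and hand over the paths. The sketch below uses only the core File::Temp module; write_fasta_file, the IDs, and the sequences are made up for illustration, and it does not explain why the module's own temp files cannot be opened.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir tempfile);

# Hypothetical helper: write one sequence to its own FASTA file, close the
# handle so nothing is left open, and return the path.
sub write_fasta_file {
    my ( $dir, $id, $seq ) = @_;
    my ( $fh, $path ) = tempfile( 'seqXXXXXX', DIR => $dir, SUFFIX => '.fa' );
    print {$fh} ">$id\n$seq\n";
    close $fh or die "cannot close $path: $!";
    return $path;
}

my $dir   = tempdir( CLEANUP => 1 );    # removed when the program exits
my $file1 = write_fasta_file( $dir, 'seq1', 'MKVLAAGIVGLLLAQ' );
my $file2 = write_fasta_file( $dir, 'seq2', 'MKVLSAGIVGLMLAQ' );

# With BioPerl installed, these paths would take the place of the sequence
# objects, as in the message above: $factory->bl2seq($file1, $file2);
print "$file1\n$file2\n";
```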

This is not the "failed to open tempfile; too many files open" error 
seen previously, and I also expect a fair number of "works for me" 
responses - please save your breath.

Thanks for any input,

-Aaron

From pm66 at nyu.edu  Mon Feb 23 16:06:50 2004
From: pm66 at nyu.edu (Philip MacMenamin)
Date: Mon Feb 23 16:17:32 2004
Subject: [Bioperl-l] Bio::DB::GFF::Aggregator problem, new wormbase models.
In-Reply-To: <BC57ED88.BB42%todd.harris@cshl.org>
References: <BC57ED88.BB42%todd.harris@cshl.org>
Message-ID: <200402232111.i1NLBOsD003757@mx5.nyu.edu>

This worked for me:

my $db = new Bio::DB::GFF(-adaptor=>'dbi::mysqlopt',
			  -dsn=>'dbi:mysql:WS118;host=localhost',
			  -user=>$user,
			  -pass=>$pass,
#			  -aggregator => [qw(wormabse_cds{coding_exon,5_UTR,3_UTR/CDS})], # Not working
			  -aggregator => [qw(wormabse_cds{coding_exon:curated})],
			 ) or die();
my $panelSeg = $db->segment(CDS=>$CDS);
if(!$panelSeg)
{
#do something 
}
else
{
  my @features = $panelSeg->features();
  my @UTRs = $panelSeg->features('UTR');
  my @all_transcripts = $panelSeg->features('wormabse_cds');
  $all_transcripts[0]{subfeatures}{UTR} = \@UTRs; ###<<<<<THIS WORKS>>>>>
}

It's not really a nice way to do it, but it does the job with the new models.

Thanks for the advice, 

Philip.

On Tuesday 17 February 2004 05:10 pm, you wrote:
> Hi Phillip -
>
> You need to aggregate the separate parts of the CDS.  Create a wormbase_cds
> (or whatever you wish to call it), aggregating the following features using
> the CDS group: coding_exon,5_UTR,3_UTR.
>
> The following stanza should do the trick.
>
> $dbgff = (-adaptor => 'dbi::mysql',
>           -dsn     => 'dbi:mysql:database=your_database;host=your_host',
>           -aggregators => [qw(wormabse_cds{coding_exon,5_UTR,3_UTR/CDS})],
>           -user    => 'your_username',
>           -pass    => 'your_dbgff_pass');
>
> This should do the trick for properly aggregating genes under the new
> WormBase CDS class.
>
> Todd Harris
>
From cjm at fruitfly.org  Mon Feb 23 17:38:05 2004
From: cjm at fruitfly.org (Chris Mungall)
Date: Mon Feb 23 17:44:42 2004
Subject: [Bioperl-l] Re: inferring exon features in
 SeqFeature::Tools::Unflattener - an improvement?
In-Reply-To: <CED81D34E37D5043A1211565277A51E501491992@exchkc02.stowers-institute.org>
Message-ID: <Pine.LNX.4.33.0402231423370.1743-100000@heartbroken.lbl.gov>

Hi Malcolm

Thanks for the patch! This is indeed in keeping with the spirit of the
module, I have incorporated it with one minor modification

this
             $subsf->seq_id($sf->seq_id);
to
             $subsf->seq_id($sf->seq_id) if $sf->seq_id;

This saves unnecessary null accessors; as far as I can tell, parsing
genbank/embl will populate the seq_id accessor in the underlying location
object, but this is not propagated up to the feature seq_id (either by
explicitly copying the data or by transitivity/delegation). All very
confusing. If you want this field populated in the newly created exon
location objects you may need to add extra code to do this.

Now in cvs

Cheers
Chris

On Mon, 23 Feb 2004, Cook, Malcolm wrote:

> Dear Chris and fellow Bioperlers,
>
> I have made the following patch on my local version of this module in
> order to provide values for the seq_id and the /locus_tag and /gene tags
> of inferred exons.
>
> Please advise if this is in the spirit of the module, and whether this
> patch can be incorporated in the live version.
>
> Thanks!
> Malcolm Cook
> Database Applications Manager, Bioinformatics
> Stowers Institute for Medical Research
> (816) 926-4449
>
>
>
> Index: Unflattener.pm
> ===================================================================
> RCS file:
> /home/repository/bioperl/bioperl-live/Bio/SeqFeature/Tools/Unflattener.p
> m,v
> retrieving revision 1.19
> diff -c -r1.19 Unflattener.pm
> *** Unflattener.pm 2003/12/16 22:31:16 1.19
> --- Unflattener.pm 2004/02/23 19:45:32
> ***************
> *** 2408,2413 ****
> --- 2408,2424 ----
>
> -primary_tag=>'exon');
>         my $locstr = 'exon::'.$self->_locstr($subsf);
>
> +       ## Provide seq_id to new feature:
> +       $subsf->seq_id($sf->seq_id);
> +       ## Transfer /locus_tag and /gene tag values to inferred
> +       ## features.  TODO: Perhaps? this should not be done
> +       ## indiscriminantly but rather by virtue of the setting
> +       ## of group_tag.
> +       foreach my $tag (grep /gene|locus_tag/, $sf->get_all_tags) {
> +         my @vals = $sf->get_tag_values($tag);
> +         $subsf->add_tag_value($tag, @vals);
> +       }
> +
>         # re-use feature if type and location the same
>         if ($loc_h{$locstr}) {
>      $subsf = $loc_h{$locstr};
>
>

From hlapp at gmx.net  Mon Feb 23 19:51:41 2004
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon Feb 23 19:57:53 2004
Subject: [Bioperl-l] Bio::Tools::GFF  use of seqname
In-Reply-To: <CED81D34E37D5043A1211565277A51E501491993@exchkc02.stowers-institute.org>
Message-ID: <9C021A6C-6663-11D8-B3BB-000A959EB4C4@gmx.net>

Looks like the way to go. -hilmar
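
A minimal sketch of what the one-line form does, using a stand-in
class (My::Feature and gff_name are made up for illustration and are not
part of BioPerl). Note that a seq_id of '0' is false in Perl and so
also falls back to 'SEQ'; the original six-line version behaves the
same way because it uses ||= as well.

```perl
use strict;
use warnings;

# Minimal stand-in for a feature object; NOT the real Bio::SeqFeatureI.
package My::Feature;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub seq_id {
    my $self = shift;
    $self->{seq_id} = shift if @_;   # act as a setter when given a value
    return $self->{seq_id};
}

package main;

# The single-line replacement discussed in this thread.
sub gff_name {
    my ($feat) = @_;
    return $feat->seq_id() || 'SEQ';
}

print gff_name(My::Feature->new(seq_id => 'chrI')), "\n";   # seq_id is set
print gff_name(My::Feature->new()), "\n";                   # seq_id is undef
print gff_name(My::Feature->new(seq_id => '0')), "\n";      # '0' is false too
```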

On Monday, February 23, 2004, at 11:53  AM, Cook, Malcolm wrote:

> Actually, I mean replace the 6 lines with the single line:
>
>  $name = $feat->seq_id() || 'SEQ'
>
>> -----Original Message-----
>> From: Hilmar Lapp [mailto:hlapp@gmx.net]
>> Sent: Monday, February 23, 2004 12:19 PM
>> To: Cook, Malcolm
>> Cc: Bioperl; birney@sanger.ac.uk; mrp@sanger.ac.uk
>> Subject: Re: [Bioperl-l] Bio::Tools::GFF use of seqname
>>
>>
>> You mean replace can('seqname') by can('seq_id')?
>>
>> Actually, $feat->can('seq_id') must be true at all times iff
>> $feat->isa("Bio::SeqFeatureI"), so it's kind of superfluous to
>> test for
>> it.
>>
>> 	-hilmar
>>
>> On Monday, February 23, 2004, at 09:51  AM, Cook, Malcolm wrote:
>>
>>> Dear Matthew, Ewan, et al.
>>>
>>> I see in three places in  Bio::Tools::GFF the following:
>>>
>>>   if( $feat->can('seqname') ) {
>>>        $name = $feat->seq_id();
>>>        $name ||= 'SEQ';
>>>    } else {
>>>        $name = 'SEQ';
>>>    }
>>>
>>> However, in Bio::SeqFeature::Generic we learn that
>>>
>>> 	$self->warn("-seqname is deprecated. Please use -seq_id
>>> instead.");
>>>
>>> So, should we rewrite those fragments in Bio::Tools::GFF as:
>>>  $name = $feat->seq_id() || 'SEQ'
>>>
>>> ??
>>>
>>> Thanks,
>>>
>>> Malcolm Cook
>>> Database Applications Manager, Bioinformatics
>>> Stowers Institute for Medical Research
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l@portal.open-bio.org
>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>> -- 
>> -------------------------------------------------------------
>> Hilmar Lapp                            email: lapp at gnf.org
>> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
>> -------------------------------------------------------------
>>
>>
>>
>
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From m_conte at hotmail.com  Tue Feb 24 04:35:48 2004
From: m_conte at hotmail.com (matthieu CONTE)
Date: Tue Feb 24 04:41:55 2004
Subject: [Bioperl-l] <AUTHOR_LIST> missing
Message-ID: <BAY12-F47rwojQgSkn40000952f@hotmail.com>

Hello,
I have a new problem loading the whole rice genome from TIGR into my BioSQL db.
I have downloaded the parser $tigrxml.dtd.......
Thanks.


perl load_seqdatabase.pl --host biopipe --dbname biopipe --namespace biopipe 
--format tigr  /home/conte/pipeline_orthologues/data/orysa_tigr.txt
Loading /home/conte/pipeline_orthologues/data/orysa_tigr.txt ...

------------- EXCEPTION  -------------
MSG: [19]Required <AUTHOR_LIST> missing
STACK Bio::SeqIO::tigr::throw 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:1338
STACK Bio::SeqIO::tigr::_process_header 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:700
STACK Bio::SeqIO::tigr::_process_assembly 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:535
STACK Bio::SeqIO::tigr::_process_tigr 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:453
STACK Bio::SeqIO::tigr::_process 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:420
STACK Bio::SeqIO::tigr::_initialize 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:90
STACK Bio::SeqIO::new 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:358
STACK Bio::SeqIO::new 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:378
STACK toplevel load_seqdatabase.pl:436


Matthieu CONTE

_________________________________________________________________
MSN Messenger : discutez en direct avec vos amis ! 
http://www.msn.fr/msger/default.asp

From dhoworth at mrc-lmb.cam.ac.uk  Tue Feb 24 04:55:33 2004
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Tue Feb 24 05:01:52 2004
Subject: use of seq_id. was: [Bioperl-l] Bio::Tools::GFF  use of seqname
In-Reply-To: <D15BED4C-662C-11D8-A55D-000A959EB4C4@gmx.net>
References: <D15BED4C-662C-11D8-A55D-000A959EB4C4@gmx.net>
Message-ID: <403B1F95.20504@mrc-lmb.cam.ac.uk>

Hilmar Lapp wrote:
> Actually, $feat->can('seq_id') must be true at all times iff 
> $feat->isa("Bio::SeqFeatureI"), so it's kind of superfluous to test for it.

Where does this come from, please?  In the SeqFeatureI documentation it 
says seq_id 'is an attribute such that you *can* store the ID' (my 
emphasis).  You seem to be saying that if I'm creating a bunch of (sub) 
features just so I can use Bio::Graphics, I must attach a seq_id to each 
and every one.

I have an inverse question that I haven't managed to find an answer to 
yet. If I'm displaying these sub-features as segments, how can I attach 
some text to the feature that will be displayed alongside each 
individual segment?

Thanks, Dave

From ew9 at york.ac.uk  Tue Feb 24 05:30:02 2004
From: ew9 at york.ac.uk (Elizabeth Williams)
Date: Tue Feb 24 05:36:11 2004
Subject: [Bioperl-l] neighbor.pm
Message-ID: <6.0.1.1.0.20040224102434.025311c0@ew9.imap.york.ac.uk>

Hello,
I have a problem running the bit of code below.  I get this message:

"Can't call method "names" on an undefined value at 
/biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm 
line 470."

but not all the time - it mostly works but on some alignments it comes up 
with this error.
Any ideas of what the problem is or how to fix it?


                         #align sequences
                         my @params_align = ('ktuple' => 2, 'matrix' => 
'BLOSUM', 'QUIET' => 1);
                         my $factory = 
Bio::Tools::Run::Alignment::Clustalw->new(@params_align);
                         my $seq_array_ref = \@seq_array; # where 
@seq_array is an array of Bio::Seq objects
                         my $aln = $factory->align($seq_array_ref);
                         my @params_protdist = ('MODEL' => 'PAM', 'QUIET' 
=> 1);

                         my $protdist_factory = 
Bio::Tools::Run::Phylo::Phylip::ProtDist->new(@params_protdist);

                         $protdist_factory->version('3.6');

                         my $matrix = $protdist_factory->run($aln);

                         my @params_neighbor = ('type'=>'NJ', 'QUIET' => 1);

                         my $neighborfactory = 
Bio::Tools::Run::Phylo::Phylip::Neighbor->new(@params_neighbor);

                         $neighborfactory->version('3.6');

                         my (@trees) = $neighborfactory->run($matrix);

                         my $outtree = new Bio::TreeIO(-file => 
">>geneorigin_results2.xls");

                         foreach my $tree (@trees) {

                                 $outtree->write_tree($tree);

                         }

Elizabeth J.B. Williams
CNAP
Department of Biology
University of York
York
YO10 5YW
mobile: 07813149274
work: 01904 328757
Fax: 01904 328762

From jason at cgt.duhs.duke.edu  Tue Feb 24 08:10:34 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Tue Feb 24 08:16:52 2004
Subject: [Bioperl-l] neighbor.pm
In-Reply-To: <6.0.1.1.0.20040224102434.025311c0@ew9.imap.york.ac.uk>
References: <6.0.1.1.0.20040224102434.025311c0@ew9.imap.york.ac.uk>
Message-ID: <Pine.LNX.4.50.0402240809400.31103-100000@tenero.duhs.duke.edu>

can you track it down to a specific dataset which causes the problem?  I
would first guess that neighbor is failing and we're not detecting that
very well.  you're getting an empty matrix so that is why names is
failing.

-jason


--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From ew9 at york.ac.uk  Tue Feb 24 08:32:26 2004
From: ew9 at york.ac.uk (Elizabeth Williams)
Date: Tue Feb 24 08:38:51 2004
Subject: [Bioperl-l] neighbor.pm
In-Reply-To: <Pine.LNX.4.50.0402240809400.31103-100000@tenero.duhs.duke. edu>
References: <6.0.1.1.0.20040224102434.025311c0@ew9.imap.york.ac.uk>
	<Pine.LNX.4.50.0402240809400.31103-100000@tenero.duhs.duke.edu>
Message-ID: <6.0.1.1.0.20040224132932.0252f470@ew9.imap.york.ac.uk>

I am pulling down the set of sequences using    eval {$seq 
=$gb->get_Seq_by_id($id);}
from a list of gi identifiers.
The list which stopped Neighbor.pm was:
2522394
10727920
13124364
633631
10579820
25317156
15887696
17934261
17988084
15964148
13474446
15892324
15604167
16124268
19914310
23051232
20906191
37520602
35211596
33862201
33634419
33238784
39933589
27376035
17132771
23041817
1652903
23129777
22295967
33632137
7287834
33862352
22406149
14324888

Could this be a problem for my script and, if so, what is the best way of 
catching the error?
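
One way to catch such a failure, sketched with stand-in code: wrap
each step in eval and check that it returned something before passing
the result on. The helper name run_step is made up, and the closures
below only simulate the real calls. Whether ProtDist's run() dies or
returns undef on a bad alignment is not established in this thread, so
the sketch handles both cases.

```perl
use strict;
use warnings;

# Run one pipeline step inside eval. Returns the step's result, or undef
# (after a warning) if the step died or produced nothing.
sub run_step {
    my ($label, $code) = @_;
    my $result;
    my $ok = eval { $result = $code->(); 1 };
    if (!$ok) {
        warn "$label died: $@";
        return undef;
    }
    warn "$label returned nothing\n" unless defined $result;
    return $result;
}

# Stand-ins for the real calls, which would look like
#   sub { $protdist_factory->run($aln) }   and
#   sub { $neighborfactory->run($matrix) }
my $matrix = run_step('protdist', sub { return undef });   # simulated failure

if (defined $matrix) {
    my $tree = run_step('neighbor', sub { return "tree for $matrix" });
    print "got $tree\n" if defined $tree;
}
else {
    print "skipping neighbor: no distance matrix for this alignment\n";
}
```

In a loop over many alignments, the else branch would typically log
the offending identifiers and move on to the next set.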


Elizabeth J.B. Williams
CNAP
Department of Biology
University of York
York
YO10 5YW
mobile: 07813149274
work: 01904 328757
Fax: 01904 328762

From brian_osborne at cognia.com  Tue Feb 24 10:22:13 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Tue Feb 24 10:28:20 2004
Subject: [Bioperl-l] StandAloneBlast.pm,
	bl2seq() and tempfiles on Win32/cygwin
In-Reply-To: <C165E4B8-6644-11D8-9A28-000A958C5008@pcbi.upenn.edu>
Message-ID: <GAEDKMGOKFBLJPKCLKCCIELNDHAA.brian_osborne@cognia.com>

Aaron,

Because he's using the BLAST Win binaries which don't understand Cygwin
paths?

Meaning these work:

blastall -i test.fa -d testdb.fa -p blastn

blastall -i e:/cygwin/home/bosborne/test.fa -d test.fa -p blastn


But this doesn't:

blastall -i /home/bosborne/test.fa -d test.fa -p blastn

?


Brian O.



From sdavis2 at mail.nih.gov  Tue Feb 24 10:35:37 2004
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Tue Feb 24 10:36:48 2004
Subject: [Bioperl-l] Re: [BioC] Questions about multtest
In-Reply-To: <20B7EB075F2D4542AFFAF813E98ACD93028226EF@cl-exsrv1.irad.bbsrc.ac.uk>
Message-ID: <BC60D979.4DC0%sdavis2@mail.nih.gov>

> OK, I using the multtest package to analyse my data, following the
> instructions in multtest.pdf.
> 
> I run:
> 
>> t <- mt.teststat(data[,6:12], c(0,0,0,1,1,1,1), test="t")
> 
> which calculates the t statistic for my data.  The t statistic for my first
> gene comes up as:
> 
>> t[1]
> [1] 40.60158
> 
> Presumably, this is equivalent to me running t.test:
> 
>> t.test(data[1,9:12], data[1,6:8], var.equal=FALSE, alternative="two.sided")
> 
>       Welch Two Sample t-test
> 
> data:  data[1, 9:12] and data[1, 6:8]
> t = 40.6016, df = 2, p-value = 0.0006061
> alternative hypothesis: true difference in means is not equal to 0
> 95 percent confidence interval:
> 1.713804 2.120092
> sample estimates:
>   mean of x     mean of y
> -1.596190e-15 -1.916948e+00
> 
> So I want to know how I can get p-values for the t statistics I have just
> calculated using mt.teststat.
> 
> This is where I get confused - multtest.pdf says I should "compute raw nominal
> two-sided p-values for the 3,051 test statistics using the standard Gaussian
> distribution":
> 
>> rawp0 <- 2 * (1 - pnorm(abs(t)))

Keep in mind what this is doing:  computing a p-value based on the normal
approximation of the t-distribution with infinite degrees of freedom.  In
your case, this approximation does not hold (probably) because of the
smaller than infinite (or, in practice 50 or so) number of degrees of
freedom.  

The above is asking for the probability of seeing a z-score of 40, which is
nearly equivalent to 0 (and, to the number of significant digits here, IS
0).  

What you probably want is:
rawp0 <- 2 * (1 - pt(abs(t),df=2))

Like so:

> 2*(1-pt(40.60158,df=2))
[1] 0.000606065

Which agrees with your t-test value.  Then, you can soldier on.
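
For readers following along in Perl rather than R, here is a small
sketch of the same calculation. It assumes exactly 2 degrees of
freedom, where Student's t has the closed-form CDF
F(t) = 1/2 + t / (2 * sqrt(t^2 + 2)); the function name is made up and
the formula does not generalise to other df values.

```perl
use strict;
use warnings;

# Two-sided p-value for Student's t with exactly 2 degrees of freedom.
# With F(t) = 1/2 + t / (2 * sqrt(t**2 + 2)), the two-sided p-value
# 2 * (1 - F(|t|)) simplifies to 1 - |t| / sqrt(t**2 + 2).
sub two_sided_p_df2 {
    my ($t) = @_;
    my $abs = abs $t;
    return 1 - $abs / sqrt($abs**2 + 2);
}

my $p = two_sided_p_df2(40.60158);
printf "two-sided p (df=2) = %.6g\n", $p;
```

The result should land very close to the 0.000606 reported by t.test
above, whereas the normal approximation gives a value indistinguishable
from 0.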

> Soldiering on, I want to calculate adjusted p-values accoridng to Benjamini
> and Hochberg:
> 
>> res <- mt.rawp2adjp(rawp0, "BH")
>> adjp <- res$adjp[order(res$index), ]
>> adjp[1]
> [1] 0


Sean
-- 
Sean Davis, M.D., Ph.D.

Clinical Fellow
National Institutes of Health
National Cancer Institute
National Human Genome Research Institute

Clinical Fellow, Johns Hopkins
Department of Pediatric Oncology
-- 


From jaymoore at plantkind.com  Tue Feb 24 11:12:45 2004
From: jaymoore at plantkind.com (Jay Moore)
Date: Tue Feb 24 11:17:02 2004
Subject: [Bioperl-l] Re: Bio ::seqIO ::tigr
Message-ID: <403b77fd74b002.61830072@businessserve.co.uk>

Matthieu CONTE reported this:  MSG: [19]Required <AUTHOR_LIST> missing
(See his message below)

I get the same result, and when I looked into tigr.pm, the problem is not actually with the <AUTHOR_LIST> tag, despite the error message, nor I think is the TIGR XML file at fault.  The problem is happening further up the tigr.pm module, in the _process_header method, with the <KEYWORDS> line.  Not sure why (my regex skills are not so hot) but the method does not spot valid <KEYWORDS> lines in the XML.  I found that when I changed the regex in the KEYWORDS line from [^<] to ([^<]*) it would recognise the KEYWORDS tag OK, and progress on, past the <AUTHOR_LIST> as well.  Don't know exactly why, I just copied code from one of the other <HEADER> tags.
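
A likely explanation for why that change helps, shown as a sketch. The
patterns and the sample line below are hypothetical (the actual regex
in tigr.pm is not quoted in this message); only the difference between
[^<] and ([^<]*) is taken from the report above. A bare character
class matches exactly one character, so it cannot span a multi-character
keyword string, and without parentheses nothing is captured.

```perl
use strict;
use warnings;

my $line = '<KEYWORDS>rice genome</KEYWORDS>';   # made-up sample line

# A character class without a quantifier matches exactly one character,
# so this succeeds only when the element holds a single character.
my $narrow = $line =~ m{<KEYWORDS>[^<]</KEYWORDS>} ? 1 : 0;

# With * the class may repeat (zero or more times), and the parentheses
# capture the element's content.
my ($keywords) = $line =~ m{<KEYWORDS>([^<]*)</KEYWORDS>};

print "narrow pattern matched: $narrow\n";
print "captured keywords: $keywords\n";
```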

So far so good.

For me it bugs out later now - there is no _process_tiling_path method, which there should be.  I reported this one via bugzilla.  To get past this one I chopped the whole <TILING_PATH> object out of the TIGR XML file.

I now get another error later on - 
[79]Required <EXON> Missing.  Still looking at why this one happens.


Matthieu CONTE's original message:

I am currently trying to use the Bio::SeqIO::tigr module.
My objective is to download the whole rice genome from TIGR (address 
below) and to integrate it into my BioSQL DB.
For this I am trying to convert the tigr format to swiss format with the 
script below:

use Bio::SeqIO;

my $in = Bio::SeqIO->new(-file 
=>'</home/conte/pipeline_orthologues/data/orysa_tigr.txt', -format 
=>'tigr');

my $out = Bio::SeqIO->new(-file => 
'>/home/conte/pipeline_orthologues/data/orysa_swiss.txt' , 
-format=>'swiss');

# Read each sequence object and write it out in the new format.
while (my $seq = $in->next_seq) {
    $out->write_seq($seq);
}

I obtain:

------------ EXCEPTION  -------------
MSG: [19]Required <AUTHOR_LIST> missing
STACK Bio::SeqIO::tigr::throw 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:1338
STACK Bio::SeqIO::tigr::_process_header 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:700
STACK Bio::SeqIO::tigr::_process_assembly 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:535
STACK Bio::SeqIO::tigr::_process_tigr 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:453
STACK Bio::SeqIO::tigr::_process 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:420
STACK Bio::SeqIO::tigr::_initialize 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO/tigr.pm:90
STACK Bio::SeqIO::new 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:358
STACK Bio::SeqIO::new 
/usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/SeqIO.pm:378
STACK toplevel get_bioseq_tigr.pl:8

Could you please tell me if there is a problem with the parser or with the 
input data format of Tigr?

Thanks in advance


Matthieu CONTE
m_conte at hotmail.com


From amackey at pcbi.upenn.edu  Tue Feb 24 11:19:34 2004
From: amackey at pcbi.upenn.edu (Aaron J. Mackey)
Date: Tue Feb 24 11:25:44 2004
Subject: [Bioperl-l] StandAloneBlast.pm,
	bl2seq() and tempfiles on Win32/cygwin
In-Reply-To: <GAEDKMGOKFBLJPKCLKCCEEMADHAA.brian_osborne@cognia.com>
References: <GAEDKMGOKFBLJPKCLKCCEEMADHAA.brian_osborne@cognia.com>
Message-ID: <3BAE2AC8-66E5-11D8-9A28-000A958C5008@pcbi.upenn.edu>

It's not StandAloneBlast's fault: it's using Bio::Root::IO::tempfile() 
which uses some convoluted logic that I can't follow (and, if I 
remember right, is a copy of an older File::Temp incarnation).

So, folks, how can we best inform BioPerl where we want it to make 
temporary files?

-Aaron

On Feb 24, 2004, at 11:11 AM, Brian Osborne wrote:

> Aaron,
>
> That could work. Unfortunately that would mess up the path for those
> applications that DO use Unix-style paths.
>
> Well, hold on, let me try....
>
> No, neither worked:
>
> MSG: Could not open /tmp/Av0MhgqzIJ: No such file or directory
>
> StandAloneBlast seems not to care about either env's. That's not nice.
>
>
> Brian O.
>
> -----Original Message-----
> From: Aaron J. Mackey [mailto:amackey@pcbi.upenn.edu]
> Sent: Tuesday, February 24, 2004 10:58 AM
> To: Brian Osborne
> Subject: Re: [Bioperl-l] StandAloneBlast.pm, bl2seq() and tempfiles on
> Win32/cygwin
>
> Ahh, right; perhaps the $TEMPDIR (or $TEMP?) environment variable would
> do the trick?
>
> -Aaron
>
> On Feb 24, 2004, at 10:46 AM, Brian Osborne wrote:
>
>> Aaron,
>>
>> Or ask the author for a way to set the tempdir, in my case it would be
>> "e:/cygwin/tmp". I couldn't see such a thing in the documentation,
>> perhaps I
>> missed it though.
>>
>> BIO
>>
>> -----Original Message-----
>> From: Aaron J. Mackey [mailto:amackey@pcbi.upenn.edu]
>> Sent: Tuesday, February 24, 2004 10:32 AM
>> To: Brian Osborne
>> Subject: Re: [Bioperl-l] StandAloneBlast.pm, bl2seq() and tempfiles on
>> Win32/cygwin
>>
>> Ooo, that's probably it; what's the solution?
>>
>> -Aaron
>>
>

From Marc.Logghe at devgen.com  Tue Feb 24 11:32:55 2004
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Tue Feb 24 11:39:31 2004
Subject: [Bioperl-l] StandAloneBlast.pm,
	bl2seq() and tempfiles on Win32/cygwin
Message-ID: <BEE28BF86078B6429D6C780635718E21904BC5@morelia.be.devgen.com>


> -----Original Message-----
> From: Aaron J. Mackey [mailto:amackey@pcbi.upenn.edu]
> Sent: dinsdag 24 februari 2004 17:20
> To: Brian Osborne
> Cc: Bioperl list
> Subject: Re: [Bioperl-l] StandAloneBlast.pm,bl2seq() and tempfiles on
> Win32/cygwin

Setting the environment variable TMPDIR should work. It is used by File::Spec::Unix and File::Spec::Win32
HTH,
Marc
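
A minimal sketch of the TMPDIR approach, assuming it is set from inside
the script. The directory name blast_tmp is hypothetical; on Cygwin
with the Windows BLAST binaries it would be a Windows-style path such
as e:/cygwin/tmp. The variable is set before File::Spec is first asked
for the temp directory, since File::Spec may cache its first answer.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical directory for temporary files, created under the
# current working directory if it does not exist yet.
my $tmp = File::Spec->rel2abs('blast_tmp');
mkdir $tmp unless -d $tmp;

# Set TMPDIR before anything calls File::Spec->tmpdir.
$ENV{TMPDIR} = $tmp;

print "temporary files will be created in: ", File::Spec->tmpdir, "\n";
```

File::Spec only accepts the directory if it exists and is writable;
otherwise it falls back to its other candidates.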

From hz5 at njit.edu  Tue Feb 24 11:37:13 2004
From: hz5 at njit.edu (hz5@njit.edu)
Date: Tue Feb 24 11:43:17 2004
Subject: [Bioperl-l] Bioperl graphics
Message-ID: <1077640633.403b7db94e285@webmail.njit.edu>

Dear all,

I am trying to render a CDS using bioperl. I want the arrow ruler on top to 
display coordinates from 18058059 to 18068032, but it seems that this is too 
big: the image just won't render.

Any suggestions?
        #
        #$s = 18058059 and $t = 18068032
        #
        my $whole_seq = Bio::SeqFeature::Generic->new(
                                        -start=>$s,
                                        -end=>$t,
                                        );
        $panel->add_track($whole_seq,
                    -glyph => 'arrow',
                    -fgcolor => 'black',
                        -bump => 0,
                    -bgcolor => 'red',
                    -double=>1,
                    -tick => 2);

Thanks, waiting online!

=========================================================
Haibo Zhang, PhD student
Computational Biology, NJIT & Rutgers University
Center for Applied Genomics, PHRI
http://afs13.njit.edu/~hz5
From brian_osborne at cognia.com  Tue Feb 24 11:41:27 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Tue Feb 24 11:47:37 2004
Subject: [Bioperl-l] StandAloneBlast.pm,
	bl2seq() and tempfiles on Win32/cygwin
In-Reply-To: <BEE28BF86078B6429D6C780635718E21904BC5@morelia.be.devgen.com>
Message-ID: <GAEDKMGOKFBLJPKCLKCCGEMBDHAA.brian_osborne@cognia.com>

Marc,

That worked!

So this is the fix for Cygwin, provided some other application, driven by
Perl and using those modules, expects to see Unix-style paths. I'll note in
INSTALL.WIN that this is the workaround but that since other apps in Bioperl
and Cygwin may suffer, one might want to set this in a script.
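
A minimal sketch of what setting this in a script could look like. It assumes the override has to be in place before any module first asks File::Spec for the temporary directory, which is why it sits in a BEGIN block; the path '/tmp' is only an example value and does not come from this thread.

```perl
use strict;
use warnings;

# Set TMPDIR before other modules are loaded, so that any of them that
# consult File::Spec->tmpdir at load time already see the override.
# '/tmp' is an example value; use whatever directory suits your setup.
BEGIN { $ENV{TMPDIR} = '/tmp' unless defined $ENV{TMPDIR}; }

use File::Spec;

# File::Spec->tmpdir only returns a candidate that exists as a directory
# and is writable, so it is worth printing what was actually chosen.
my $tmp = File::Spec->tmpdir;
print "Temporary files will go to: $tmp\n";
```

Keeping the assignment inside the script, rather than in the login environment, limits the change to that one program.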

Brian O.

-----Original Message-----
From: Marc Logghe [mailto:Marc.Logghe@devgen.com]
Sent: Tuesday, February 24, 2004 11:33 AM
To: Aaron J. Mackey; Brian Osborne
Cc: Bioperl list
Subject: RE: [Bioperl-l] StandAloneBlast.pm,bl2seq() and tempfiles on
Win32/cygwin


> -----Original Message-----
> From: Aaron J. Mackey [mailto:amackey@pcbi.upenn.edu]
> Sent: dinsdag 24 februari 2004 17:20
> To: Brian Osborne
> Cc: Bioperl list
> Subject: Re: [Bioperl-l] StandAloneBlast.pm,bl2seq() and tempfiles on
> Win32/cygwin

Setting the environment variable TMPDIR should work. It is used by
File::Spec::Unix and File::Spec::Win32
HTH,
Marc


From jason at cgt.duhs.duke.edu  Tue Feb 24 14:43:27 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Tue Feb 24 14:49:42 2004
Subject: [Bioperl-l] neighbor.pm
In-Reply-To: <6.0.1.1.0.20040224145032.0255bab8@ew9.imap.york.ac.uk>
References: <6.0.1.1.0.20040224102434.025311c0@ew9.imap.york.ac.uk>
	<Pine.LNX.4.50.0402240809400.31103-100000@tenero.duhs.duke.edu>
	<6.0.1.1.0.20040224132932.0252f470@ew9.imap.york.ac.uk>
	<Pine.LNX.4.50.0402240904530.31459-100000@tenero.duhs.duke.edu>
	<6.0.1.1.0.20040224145032.0255bab8@ew9.imap.york.ac.uk>
Message-ID: <Pine.LNX.4.50.0402241434310.669-100000@tenero.duhs.duke.edu>

There is a 'bad' amino acid in your data.

I get this when I run phylip protdist by hand on your data:
 WARNING -- BAD AMINO ACID:U AT POSITION 1206 OF SPECIES  32
offending base is a 'U'
[jason@sonogno jason]$ seqret -sbegin 1205 -send 1207 out.aln.fasta:AAF44872/1-3 stdout
Reads and writes (returns) sequences
>AAF44872/1-3
-U-

So you'll need to prune these out of the data I guess.
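
One way to catch such residues before the alignment reaches protdist is sketched below in plain Perl. The accepted alphabet is an assumption (the 20 standard amino acids, the ambiguity codes B, Z and X, and '-' for gaps), and the function names are hypothetical; masking with 'X' is just one option alongside dropping the sequence.

```perl
use strict;
use warnings;

# Assumed alphabet: the 20 standard amino acids, the ambiguity codes
# B, Z and X, and '-' for alignment gaps. Adjust to what your tools accept.
my $BAD_RESIDUE = qr/[^ACDEFGHIKLMNPQRSTVWYBZX-]/i;

# Return [position, character] pairs (1-based positions) for every symbol
# outside the assumed alphabet.
sub find_bad_residues {
    my ($seq) = @_;
    my @bad;
    while ($seq =~ /($BAD_RESIDUE)/g) {
        push @bad, [ pos($seq), $1 ];
    }
    return @bad;
}

# Return a copy with every unexpected symbol replaced by 'X' (unknown).
sub mask_bad_residues {
    my ($seq) = @_;
    (my $clean = $seq) =~ s/$BAD_RESIDUE/X/g;
    return $clean;
}

my $aligned = 'MK-UAC';
for my $hit (find_bad_residues($aligned)) {
    printf "unexpected '%s' at position %d\n", $hit->[1], $hit->[0];
}
print mask_bad_residues($aligned), "\n";
```

With BioPerl sequence objects the same check could be run on the string returned by each object's seq method before the alignment is passed on.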

-jason

On Tue, 24 Feb 2004, Elizabeth Williams wrote:

> Here is the alignment.
>
> At 14:06 24/02/2004, you wrote:
> >Any chance you can save the multiple sequence alignment and send that out
> >instead?
> >-jason
> >
> >On Tue, 24 Feb 2004, Elizabeth Williams wrote:
> >
> > > I am pulling down the set of sequences using    eval {$seq
> > > =$gb->get_Seq_by_id($id);}
> > > from a list of gi identifiers.
> > > The list which stopped Neighbor.pm was:
> > > 2522394
> > > 10727920
> > > 13124364
> > > 633631
> > > 10579820
> > > 25317156
> > > 15887696
> > > 17934261
> > > 17988084
> > > 15964148
> > > 13474446
> > > 15892324
> > > 15604167
> > > 16124268
> > > 19914310
> > > 23051232
> > > 20906191
> > > 37520602
> > > 35211596
> > > 33862201
> > > 33634419
> > > 33238784
> > > 39933589
> > > 27376035
> > > 17132771
> > > 23041817
> > > 1652903
> > > 23129777
> > > 22295967
> > > 33632137
> > > 7287834
> > > 33862352
> > > 22406149
> > > 14324888
> > >
> > > could this be a problem for my script and if so how is the best way of
> > > catching the error.
> > >
> > > At 13:10 24/02/2004, you wrote:
> > > >can you track it down to a specific dataset which causes the problem?  I
> > > >would first guess that neighbor is failing and we're not detecting that
> > > >very well.  you're getting an empty matrix so that is why names is
> > > >failing.
> > > >
> > > >-jason
> > > >
> > > >On Tue, 24 Feb 2004, Elizabeth Williams wrote:
> > > >
> > > > > Hello,
> > > > > I have a problem running the bit of code below.  I get this message:
> > > > >
> > > > > "Can't call method "names" on an undefined value at
> > > > >
> > > >
> > /biol/programs/perl580/lib/site_perl/5.8.0/Bio/Tools/Run/Phylo/Phylip/Neighbor.pm
> > > > > line 470."
> > > > >
> > > > > but not all the time - it mostly works but on some alignments it
> > comes up
> > > > > with this error.
> > > > > Any ideas of what the problem is or how to fix it?
> > > > >
> > > > >
> > > > >                          #align sequences
> > > > >                          my @params_align = ('ktuple' => 2, 'matrix' =>
> > > > > 'BLOSUM', 'QUIET' => 1);
> > > > >                          my $factory =
> > > > > Bio::Tools::Run::Alignment::Clustalw->new(@params_align);
> > > > >                          my $seq_array_ref = \@seq_array; # where
> > > > > @seq_array is an array of Bio::Seq objects
> > > > >                          my $aln = $factory->align($seq_array_ref);
> > > > >                          my @params_protdist = ('MODEL' => 'PAM',
> > 'QUIET'
> > > > > => 1);
> > > > >
> > > > >                          my $protdist_factory =
> > > > > Bio::Tools::Run::Phylo::Phylip::ProtDist->new(@params_protdist);
> > > > >
> > > > >                          $protdist_factory->version('3.6');
> > > > >
> > > > >                          my $matrix = $protdist_factory->run($aln);
> > > > >
> > > > >                          my @params_neighbor = ('type'=>'NJ', 'QUIET'
> > > > => 1);
> > > > >
> > > > >                          my $neighborfactory =
> > > > > Bio::Tools::Run::Phylo::Phylip::Neighbor->new(@params_neighbor);
> > > > >
> > > > >                          $neighborfactory->version('3.6');
> > > > >
> > > > >                          my (@trees) = $neighborfactory->run($matrix);
> > > > >
> > > > >                          my $outtree = new Bio::TreeIO(-file =>
> > > > > ">>geneorigin_results2.xls");
> > > > >
> > > > >                          foreach my $tree (@trees) {
> > > > >
> > > > >                                  $outtree->write_tree($tree);
> > > > >
> > > > >                          }
> > > > >
> > > > > Elizabeth J.B. Williams
> > > > > CNAP
> > > > > Department of Biology
> > > > > University of York
> > > > > York
> > > > > YO10 5YW
> > > > > mobile: 07813149274
> > > > > work: 01904 328757
> > > > > Fax: 01904 328762
> > > > >
> > > > >
> > > >
> > > >--
> > > >Jason Stajich
> > > >Duke University
> > > >jason at cgt.mc.duke.edu
> > >
> > > Elizabeth J.B. Williams
> > > CNAP
> > > Department of Biology
> > > University of York
> > > York
> > > YO10 5YW
> > > mobile: 07813149274
> > > work: 01904 328757
> > > Fax: 01904 328762
> > >
> >
> >--
> >Jason Stajich
> >Duke University
> >jason at cgt.mc.duke.edu
>
> Elizabeth J.B. Williams
> CNAP
> Department of Biology
> University of York
> York
> YO10 5YW
> mobile: 07813149274
> work: 01904 328757
> Fax: 01904 328762
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From barry.moore at genetics.utah.edu  Tue Feb 24 13:21:39 2004
From: barry.moore at genetics.utah.edu (Barry Moore)
Date: Tue Feb 24 14:51:37 2004
Subject: [Bioperl-l] Re: Fwd: nucleic acid melting temperature
In-Reply-To: <Pine.LNX.4.44.0402241347040.31251-100000@hennig.ebi.ac.uk>
References: <Pine.LNX.4.44.0402241347040.31251-100000@hennig.ebi.ac.uk>
Message-ID: <403B9633.6070902@genetics.utah.edu>

Nicolas,

There is a module (primer.pm) that will allow you to generate a primer 
object.  This object has a Tm method to return the melting temperature 
of that primer.  About a week ago that method was updated to use the 
nearest-neighbor thermodynamic approach to calculating Tm, and there has 
been a discussion going on since then about that.  Your program exceeds 
the capabilities of that method in a variety of ways.  The current 
method calculates the enthalpy and entropy for all dinucleotide pairs, 
and adjusts those for duplex initiation.  It calculates Tm based on 
those values, the oligo concentration and salt concentration as per 
Allawi et al., Biochemistry 1997 36:10581-10594 (however the salt 
adjustment was taken from http://biotools.idtdna.com/analyzer/).  The 
primer.pm module containing that code can be found at: 
http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/Bio/SeqFeature/Primer.pm?cvsroot=bioperl.  
I believe that Rob Edwards is the current maintainer of that module.  
What the current method does not do that your program does is account 
for the possibility of mismatches and dangling ends.  I think the 
current primer object would need some redesigning to allow for those.  
You may also be using a more accurate adjustment for salt concentration.
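
For orientation, the last step of such a two-state nearest-neighbor calculation can be sketched as below. It assumes the summed enthalpy and entropy have already been obtained from a published parameter table, uses the common form for a non-self-complementary duplex with both strands at equal concentration, and leaves out any salt correction. The function name is hypothetical and this is not the code in Primer.pm.

```perl
use strict;
use warnings;

# Two-state melting temperature in degrees Celsius.
#   $dH_kcal      summed enthalpy, kcal/mol (negative for duplex formation)
#   $dS_cal       summed entropy, cal/(K*mol), initiation terms included
#   $strand_molar total strand concentration in mol/L
sub tm_two_state {
    my ($dH_kcal, $dS_cal, $strand_molar) = @_;
    my $R = 1.987;    # gas constant, cal/(K*mol)
    # Tm = dH / (dS + R * ln(C/4)), with dH converted from kcal to cal.
    my $kelvin = ($dH_kcal * 1000) / ($dS_cal + $R * log($strand_molar / 4));
    return $kelvin - 273.15;
}

# Made-up illustrative inputs, not values for any real oligo.
printf "Tm = %.1f C\n", tm_two_state(-60, -160, 1e-6);
```

Mismatches and dangling ends, mentioned above as missing, would enter as extra enthalpy and entropy terms before this step.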

Your Melting program looks like it would be a great addition to 
bioperl.  I'm fairly new to bioperl, and don't know the overall object 
structure well enough yet to comment from a developer's point of view, 
but I wonder if your algorithm would be better placed somewhere with a 
broader scope than as a method of the SeqFeature::Primer object, perhaps 
as a method available to all sequence objects.  I believe Rob made a 
similar comment in his original documentation of the Tm method.  Perhaps 
some of the seasoned Bioperl developers can discuss where a module with 
the capabilities of Melting should live.  Also as a new user, I would 
suggest that porting Melting to perl and integrating it into Bioperl is 
preferable to simply writing a wrapper (from the user's point of view, 
not the developers', of course).  To casual and new users of Bioperl, long 
lists of dependencies can be very daunting.

Barry Moore

Nicolas Le Novere wrote:

>On Tue, 24 Feb 2004, Jason Stajich wrote:
>
>  
>
>>Would absolutely love your help/contribution here if you've got some
>>working code, I've CC-ed the interested parties.
>>    
>>
>
>I do not have BioPerl working code regarding that issue. However I
>have a C program that is much better and much more plastic than
>EMBOSS DAN, or the GCG equivalent (well, last time I looked at those
>programs. Maybe they evolved). 
>
>http://www.ebi.ac.uk/~lenov/meltinghome.html
>
>What is the current state of melting temp calculation in BioPerl? Do
>you already have a module?
>
>If not, we could envision to write a wrapper to MELTING, like in WWW
>interface of the program, or in OligoDB, SOL or SEPON, that use
>MELTING for the elementary Tm.
> 
>Or one can rewrite MELTING as a BioPerl module and then take advantage
>of the reimplementation to improve the program, for instance to add
>correction for Mg2+ ions.
>
>Just tell me how I can help.
>
>  
>

-- 
Barry Moore
Dept. of Human Genetics
University of Utah
Salt Lake City, UT

From pst at ksu.edu  Mon Feb 23 14:21:16 2004
From: pst at ksu.edu (Paul St. Amand)
Date: Tue Feb 24 14:52:20 2004
Subject: [Bioperl-l] Help with reversing a sequence
Message-ID: <73BFE030-6635-11D8-B2C9-0003938893E4@ksu.edu>

The perl reverse command is what you want; it does exactly that --
reverse a string.  What is slow specifically?  Did you benchmark
something?
-jason

No, I did not benchmark it and speed is really secondary to the 
question of "what is the best way to do this using BioPerl". I was just 
wondering if BioPerl had its own function or method for reversing like 
it has for revcom.

So, the reverse that I am using below is the best way?

 > print $outputfh ">MyReversedSeq29856-29862\n",scalar
 > reverse($seqobj->subseq(29856,29862)),"\n";
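
A plain-Perl sketch of the same idea, without BioPerl, to show why the scalar is needed. The helper name is hypothetical; it mimics 1-based inclusive coordinates like those passed to subseq above.

```perl
use strict;
use warnings;

# Reverse (not reverse-complement) a 1-based, inclusive slice of a string.
sub reversed_subseq {
    my ($seq, $start, $end) = @_;
    my $sub = substr($seq, $start - 1, $end - $start + 1);
    # In list context reverse() reverses the order of its arguments, so a
    # single string would come back unchanged; scalar forces the
    # character-by-character reversal.
    return scalar reverse $sub;
}

print reversed_subseq('ACGTTGCA', 2, 5), "\n";    # slice CGTT, prints TTGC
```

The built-in works on the string itself, so nothing beyond extracting the subsequence needs a sequence object.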

Thanks!
Paul
PS, I am not subscribed to BioPerl-l, so I could not post a reply 
on-line.


On Mon, 23 Feb 2004, Paul St. Amand wrote:

 > Hi,
 >
 > I am using the following script to get a subsequence and reverse it.
 > Note that I do NOT want the "reverse complement" of the sequence here,
 > just the actual reverse. BioPerl has a method to get the revcom of a
 > seq, such as:
 >
 > print $outputfh "Reverse complemented sequence 5 to 10  is
 > ",$seqobj->trunc(5,10)->revcom->seq, "  \n";
 >
 > Does BioPerl have a similar/better way to get the reverse (not revcom)
 > of a sequence?
 >
 > This is how I am doing it and it is slow. Is there a way that is faster
 > or "better" using BioPerl???
 >
 >
 > use strict;
 > use warnings;
 > use Bio::SeqIO;
 > my $outputfh = *STDOUT;
 >
 >      my ($infile, $in, $out, $seqobj);
 >      $infile = shift or die;
 >
 >      $in  = Bio::SeqIO->new('-file' => $infile ,
 >                             '-format' => 'Fasta');
 >      $seqobj = $in->next_seq();
 >
 >      $out = Bio::SeqIO->newFh('-format'   => 'fasta',
 >                            '-noclose'  => 1,
 >                            '-fh'       => $outputfh);
 >
 > print $outputfh ">MyReversedSeq29856-29862\n",scalar
 > reverse($seqobj->subseq(29856,29862)),"\n";
 >
 >
 >
 >
 >
 > Thanks,
 > Paul
 >
 >

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From xiang.deng at duke.edu  Mon Feb 23 10:37:50 2004
From: xiang.deng at duke.edu (Xiang Deng)
Date: Tue Feb 24 14:53:15 2004
Subject: [Bioperl-l] Command-line Psiblast using NCBI blastpgp
Message-ID: <OFC2FF23F6.F3562326-ON85256E43.0054A7B6-85256E43.0055ED74@notes.duke.edu>

Hi Everybody,

I have a question about how to run PSI-BLAST using NCBI blastpgp. What I 
want to do is use a PSSM generated from a multiple alignment of our 
internal data to search the NCBI nr database. I followed the 
instructions from the BLAST tutorial, as follows:

blastpgp -i seq1.txt -B align.msf -e 5000 -F F -j 2 -v 10 -d nr -o 
test_out.txt -C pssm.txt

I do not know why I have to specify a single sequence in seq1.txt from the 
aligned sequences in align.msf. I want to search with the PSSM created from the 
multiple alignment in align.msf instead of with only one sequence. The 
result looks as if only the single sequence was used for the search, and I could 
not see any sign that the PSSM calculated from the multiple alignment was used. 
I am concerned about that result. Does anyone have the same experience and 
know what is going on? Or did the command line above do 
exactly what I want, and I am just too suspicious? 

Does anyone have a better way to do this kind of PSI-BLAST search via the command line?

thanks a lot,

Xiang

Department of Pharmacology and Cancer Biology
Duke University Medical Center
Durham, NC 27710

From amackey at pcbi.upenn.edu  Tue Feb 24 15:01:23 2004
From: amackey at pcbi.upenn.edu (Aaron J. Mackey)
Date: Tue Feb 24 15:07:32 2004
Subject: [Bioperl-l] Command-line Psiblast using NCBI blastpgp
In-Reply-To: <OFC2FF23F6.F3562326-ON85256E43.0054A7B6-85256E43.0055ED74@notes.duke.edu>
References: <OFC2FF23F6.F3562326-ON85256E43.0054A7B6-85256E43.0055ED74@notes.duke.edu>
Message-ID: <38DAACA4-6704-11D8-9A28-000A958C5008@pcbi.upenn.edu>

You must include at least one sequence from the MSA as a query; this 
sequence defines the "columns" of the PSSM (i.e. any columns in the MSA 
that include gaps in this sequence will not be a part of the final 
PSSM).  blastpgp reads the MSA and builds a PSSM after determining the 
relative uniqueness of each sequence in the profile, and weighting the 
contribution of each sequence to the PSSM by its uniqueness (imagine 
the extreme: an MSA that consisted of the same protein repeated 10 
times; searching with this MSA would be no different than searching 
with the single protein).
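
The column rule can be illustrated with a small plain-Perl sketch. It only shows how the query decides which alignment columns survive and tallies raw residue counts per kept column; it does not attempt the sequence weighting described above, and it is not blastpgp's actual algorithm. The function name is hypothetical.

```perl
use strict;
use warnings;

# Given a gapped query and the other aligned sequences (all the same
# length), return one hash of residue counts per query-anchored column.
sub query_anchored_counts {
    my ($query, @others) = @_;
    my @columns;
    for my $i (0 .. length($query) - 1) {
        # Columns where the query has a gap are dropped entirely.
        next if substr($query, $i, 1) eq '-';
        my %count;
        for my $seq ($query, @others) {
            my $residue = substr($seq, $i, 1);
            $count{$residue}++ unless $residue eq '-';
        }
        push @columns, \%count;
    }
    return @columns;
}

my @cols = query_anchored_counts('AC-GT', 'ACAGT', 'A-AG-');
printf "%d columns kept out of 5\n", scalar @cols;
```

Choosing a different query from the same alignment can therefore change which columns end up in the profile.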

How many sequences are in your MSA?  If fewer than 10, you won't see 
very much change between using the PSSM and just the query sequence 
alone.  If you have 50, but they're all practically the same 
(redundant) sequence, you'll also see little change in the results.

To sum up: don't be so suspicious, I expect it's working as well as it 
can, given your input sequences.

-Aaron

On Feb 23, 2004, at 10:37 AM, Xiang Deng wrote:

> Hi Everyboday,
>
> I got a question about how to do psiblast using NCBI blastpgp. The 
> thing I
> want to do is to use a PSSM generated from a multiple alignment of our
> internal data to blast against NCBI nr database. I followed the
> instruction from blast tutorial as follows,
>
> blastpgp -i seq1.txt -B align.msf -e 5000 -F F -j 2 -v 10 -d nr -o
> test_out.txt -C pssm.txt
>
> I do not know why I have to specify a single sequence in seq1.txt from 
> the
> aligned sequences in align.msf. I want to use the pssm created from the
> multiple alignment in align.msf to blast instead of only one sequence. 
> And
> the result looks like using the single sequence only for blast and I 
> could
> not see any sign of using the PSSP calculated from the multiple 
> alignment.
> I am concerned about that result, does anyone have the same experience 
> and
> know what is going on there? whether or not the command-line above did
> exactly what I want and Iam just too suspicious?
>
> And anyone has a better way to do this kind of psiblast via 
> command-line?
>
> thanks a lot,
>
> Xiang
>
> Department of Pharmacology and Cancer Biology
> Duke University Medical Center
> Durham, NC 27710
>

From laurichj at bioinfo.ucr.edu  Tue Feb 24 17:47:30 2004
From: laurichj at bioinfo.ucr.edu (Josh Lauricha)
Date: Tue Feb 24 19:24:09 2004
Subject: [Bioperl-l] Re: Bio ::seqIO ::tig
In-Reply-To: <403b77fd74b002.61830072@businessserve.co.uk>
References: <403b77fd74b002.61830072@businessserve.co.uk>
Message-ID: <20040224224730.GB18113@batch107a>

On Tue 02/24/04 16:12, Jay Moore wrote:
<snipped>

I've committed the fixes for the 1.4 version of tigr.pm to CVS; this
should fix both the tiling path and keyword problems. Testing would
help, as I don't use the files that cause this problem (or at least, I
don't run into it).

However, I'm putting the final touches on an XML::SAX-based parser, which
I hope to have out fairly soon.

-- 

------------------------------------------------------
| Josh Lauricha            | Ford, your turning into |
| laurichj@bioinfo.ucr.edu | a penguin. Stop it.     |
| Bioinformatics, UCR      |                         |
|----------------------------------------------------|
| OpenPG:                                            |
|  5A0D 92D3 D093 79DE F724 1137 6DF1 B5EB D9CE AAA8 |
|----------------------------------------------------|
From redwards at utmem.edu  Tue Feb 24 19:27:40 2004
From: redwards at utmem.edu (Rob Edwards)
Date: Tue Feb 24 19:33:51 2004
Subject: [Bioperl-l] Re: Fwd: nucleic acid melting temperature
In-Reply-To: <403B9633.6070902@genetics.utah.edu>
References: <Pine.LNX.4.44.0402241347040.31251-100000@hennig.ebi.ac.uk>
	<403B9633.6070902@genetics.utah.edu>
Message-ID: <6BCD6BCC-6729-11D8-8A23-000A959E1622@utmem.edu>

This is a pretty good summary of the situation. I initially wrote  
Bio::SeqFeature::Primer to hold the Primer object.

At the time I was mainly using it with Bio::Tools::Run::Primer3 to  
design some primers for PCR amplification. The Tm calculation routine  
was included because it makes sense for a primer module (*). However,  
there are many people on the list that know a lot more about Tm  
calculations than I do, and I have updated the module when others  
propose better calculations.  I do think that Tm calculations should be  
in a separate module (probably either their own or  
Bio::Tools::SeqStats) as Tm calculations could be appropriate in a  
variety of different experiments, but I am happy to cede that this may  
not be desirable because of the large number of modules!

There actually is already a bioperl wrapper for running Melting ....  
Bio::Tools::Run::PiseApplication::melting (note that you'll need to  
install Bio::Tools::Run separately) that works via the Pise website.

We could duplicate the effort and rewrite Melting in Perl, we could  
write a separate Wrapper for Bio::Tools::Run, or we could direct people  
to the Pise implementation.

Rob

(*) The Bio::SeqFeature::Primer and other modules were re-written by me  
based on the work of Chad Matsalla for which I am grateful. I expect  
that Chad had a Tm calculator too (possibly the same one), though I  
can't find an old copy of his modules to check this.

On Feb 24, 2004, at 12:21 PM, Barry Moore wrote:

> Nicolas,
>
> There is a module (primer.pm) that will allow you to generate a primer  
> object.  This object has a Tm method to return the melting temperature  
> of that primer.  About a week ago that method was updated to use the  
> nearest-neighbor thermodynamic approach to calculating Tm, and there  
> has been a discussion going on since then about that.  Your program  
> exceeds the capabilities of that method in a variety of ways.  The  
> current method calculates the enthalpy and entropy for all  
> dinucleotide pairs, and adjusts those for duplex initiation.  It  
> calculates Tm based on those values, the oligo concentration and salt  
> concentration as per Allawi et. al Biochemistry 1997 36:10581-10594  
> (however the salt adjustment was taken from  
> http://biotools.idtdna.com/analyzer/).  The primer.pm module  
> containing that code can be found at:  
> http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-live/Bio/ 
> SeqFeature/Primer.pm?cvsroot=bioperl.  I believe that Rob Edwards is  
> the current maintainer of that module.  What the current method does  
> not do that your program does is account for the possibility of  
> mismatches and dangling ends.  I think the current primer object would  
> need some redesigning to allow for those.  You may also be using a  
> more accurate adjustments for salt concentration.
>
> Your Melting program looks like it would be a great addition to  
> bioperl.  I'm farily new to bioperl, and don't know the overall object  
> structure well enough yet to comment from a developers point of view,  
> but I wonder if your algorithm would be better placed somewhere with a  
> boarder scope than as a method of the SeqFeature::Primer object,  
> perhaps as a method available to all sequence objects.  I beleive Rob  
> made a similar comment in his original documentation of the Tm method.  
>  Perhaps some of the seasoned Bioperl developers can discuss where a  
> module with the capabilities of Melting should live.  Also as a new  
> user, I would suggest that porting Melting to perl and integrating it  
> into Bioperl is preferable to simply writing a wrapper (from the users  
> point of view, not the developers of course).  To casual and new users  
> of Bioperl, long lists of dependencies can be very daunting.
>
> Barry Moore
>

From hlapp at gnf.org  Tue Feb 24 22:28:48 2004
From: hlapp at gnf.org (Hilmar Lapp)
Date: Tue Feb 24 22:34:56 2004
Subject: use of seq_id. was: [Bioperl-l] Bio::Tools::GFF  use of seqname
In-Reply-To: <403B1F95.20504@mrc-lmb.cam.ac.uk>
Message-ID: <B94A4B7C-6742-11D8-983C-000A959EB4C4@gnf.org>


On Tuesday, February 24, 2004, at 01:55  AM, Dave Howorth wrote:

> Hilmar Lapp wrote:
>> Actually, $feat->can('seq_id') must be true at all times iff 
>> $feat->isa("Bio::SeqFeatureI"), so it's kind of superfluous to test 
>> for it.
>
> Where does this come from, please?  In the SeqFeatureI documentation 
> it says seq_id 'is an attribute such that you *can* store the ID' (my 
> emphasis).  You seem to be saying that if I'm creating a bunch of 
> (sub) features just so I can use Bio::Graphics, I must attach a seq_id 
> to each and every one.

I'm not saying anything about the value of the attribute. 
$feat->can('seq_id') will be true if you can call $feat->seq_id(), 
which you will always be able to since it's defined in 
Bio::SeqFeatureI. Whether that method returns garbage or something 
useful is another story. I'm not sure whether you have to set seq_id() 
to something meaningful in order to remain compatible with 
Bio::Graphics, but I'd guess you do. Lincoln?
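
The distinction can be shown with a toy class pair in plain Perl. My::FeatureI and My::Feature are hypothetical stand-ins, not BioPerl classes; the point is only that can() reports whether the method exists, not whether a value has been set.

```perl
use strict;
use warnings;

{
    package My::FeatureI;    # hypothetical interface-like base class
    # Combined getter/setter, defined once for every subclass.
    sub seq_id {
        my $self = shift;
        $self->{seq_id} = shift if @_;
        return $self->{seq_id};
    }
}
{
    package My::Feature;     # hypothetical concrete class
    our @ISA = ('My::FeatureI');
    sub new { return bless {}, shift }
}

my $feat = My::Feature->new;

# True because the method is inherited, even though nothing was stored.
print "can seq_id: ", ($feat->can('seq_id') ? "yes" : "no"), "\n";
print "value set:  ", (defined $feat->seq_id ? "yes" : "no"), "\n";

$feat->seq_id('chr1');
print "after set:  ", $feat->seq_id, "\n";
```

A test on defined $feat->seq_id is what tells you whether an ID was actually attached.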

>
> I have an inverse question that I haven't managed to find an answer to 
> yet. If I'm displaying these sub-features as segments, how can I 
> attach some text to the feature that will be displayed alongside each 
> individual segment?
>

This one is for Lincoln ...

	-hilmar

> Thanks, Dave
>
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From sdavis2 at mail.nih.gov  Wed Feb 25 07:22:35 2004
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed Feb 25 07:28:40 2004
Subject: [Bioperl-l] Gbrowse DAS support
Message-ID: <BC61FDBB.4E40%sdavis2@mail.nih.gov>

I have the following situation (and I imagine that I am not unique, here):
I have a number of different oligo and cDNA microarray platforms, each with
probes that have (human) sequence associated with them.  For many of them,
we construct our own annotation by blasting against various databases, the
result being blast results for each probe to genbank sequences, ests, refseq
genes, ensembl genes, or the reference genome.  While this level of
knowledge is adequate for most applications, we are now finding
(particularly with oligo probes) that we need to know with a fair amount of
detail what these sequences look like in genomic context down to the
basepair level.  Therefore, I would like to build a browser that
incorporates my local information including blast hits of my sequence
against various reference sequences.  While I could certainly build a local
database to hold all of the possible references and their assembly, I would
like to use Bio::Das to fetch annotations.  I could see fetching most
annotations from the DAS server, but including tracks for my local data,
also.  And, while I could build a browser, I would like to start with
something done already, and Gbrowse seems a likely candidate for me.

I currently have bioperl-1.4, bio::das, and gbrowse running.  The
installation file for gbrowse says that a feature wish list includes "better
DAS support" and the ability to "configure data sources on a track-by-track"
basis.  Has anyone accomplished this?  Are there other options at which I
should look (including "go do it yourself")?

Sean
-- 
Sean Davis, M.D., Ph.D.

Postdoctoral Research Fellow
NHGRI, NIH

Clinical Fellow
NCI, NIH

Clinical Fellow, Johns Hopkins
Department of Pediatric Oncology
-- 


From dhoworth at mrc-lmb.cam.ac.uk  Wed Feb 25 09:05:58 2004
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Wed Feb 25 09:12:02 2004
Subject: [Bioperl-l] CPAN install
Message-ID: <403CABC6.4090007@mrc-lmb.cam.ac.uk>

I'm trying to upgrade my bioperl installation to 1.4. I want to do it 
from CPAN, using perl -MCPAN because that's the way I install Perl 
software. I'm having trouble, so now I have an extra reason: to see if 
the install process needs fixing :)  I have a couple of questions:

(1) On <http://www.bioperl.org/Core/Latest/> it says
"Information on how to use CPAN.pm to automatically download BioPerl and 
various CPAN module dependencies is described on our INSTALL file." but 
the install file <http://www.bioperl.org/Core/Latest/INSTALL> does not 
contain this information as far as I can see? (it describes how to 
install Bundle::Bioperl that way, but not Bioperl itself)

(2) What is the name of the distribution? The link takes me to 
bioperl-1.4 but I get an error when I try to install it:

perl -MCPAN -e shell
cpan> install bioperl-1.4
Warning: Cannot install bioperl-1.4, don't know what it is.
Try the command

     i /bioperl-1.4/

to find objects with matching identifiers.

The suggested search returns a long list of individual modules. I get 
similar errors if I try 'bioperl' or 'Bioperl'.

(3) There is a module called Bioperl, which says it is part of the 
bioperl-1.4 distribution, but describes itself as "Bioperl 1.3 - Perl 
Modules for Biology"

Any ideas?

Thanks, Dave
-- 
Dave Howorth
MRC Centre for Protein Engineering
Hills Road, Cambridge, CB2 2QH
01223 252960

From dhoworth at mrc-lmb.cam.ac.uk  Wed Feb 25 09:32:50 2004
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Wed Feb 25 09:38:53 2004
Subject: [Bioperl-l] Re: CPAN install
In-Reply-To: <403CABC6.4090007@mrc-lmb.cam.ac.uk>
References: <403CABC6.4090007@mrc-lmb.cam.ac.uk>
Message-ID: <403CB212.4000601@mrc-lmb.cam.ac.uk>

I wrote:
> I'm trying to upgrade my bioperl installation to 1.4.
> 
> (2) What is the name of the distribution?

I got a little further. It is necessary to type:

  install B/BI/BIRNEY/bioperl-1.4.tar.gz

It might be worth documenting this in the install instructions or fixing 
the distribution so it's not necessary.

But now I get test failures:

Failed Test       Stat Wstat Total Fail  Failed  List of Failed
-------------------------------------------------------------------------------
t/DB.t                          78    8  10.26%  55 79-85
t/Perl.t                        14    1   7.14%  13
t/RestrictionIO.t               14    1   7.14%  10
121 subtests skipped.
Failed 3/179 test scripts, 98.32% okay. -4/8268 subtests failed, 100.05% 
okay.
make: *** [test_dynamic] Error 11
   /usr/bin/make test -- NOT OK
Running make install
   make test had returned bad status, won't install without force

My system is Debian Woody with a bunch of backports. Perl 5.6.1.

Any thoughts on why these tests are failing?

Thanks, Dave
-- 
Dave Howorth
MRC Centre for Protein Engineering
Hills Road, Cambridge, CB2 2QH
01223 252960

From brian_osborne at cognia.com  Wed Feb 25 09:35:11 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Wed Feb 25 09:41:24 2004
Subject: [Bioperl-l] CPAN install
In-Reply-To: <403CABC6.4090007@mrc-lmb.cam.ac.uk>
Message-ID: <GAEDKMGOKFBLJPKCLKCCAEMNDHAA.brian_osborne@cognia.com>

Dave,

(1) You're right, there should be instructions in the INSTALL file, I'll fix
that.

(2) That long list of modules from "i/bioperl-1.4/" is the list of modules
in 1.4. If you take a look at the top of that list you'll see "Distribution
id = B/BI/BIRNEY/bioperl-1.4.tar.gz", that's the name you should use, i.e.
"install B/BI/BIRNEY/bioperl-1.4.tar.gz".

However, you'll want to install the Bundle first, "d/BioPerl/" should give
you the exact name.
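
The same install can also be driven from a small script through CPAN.pm's programmer's interface, as sketched here. The distribution id is the one quoted above; the '--run' switch is a hypothetical convenience of this sketch so that nothing is installed by accident, and the Bundle is left out because its exact name should be looked up first as described.

```perl
use strict;
use warnings;

# Distribution id as quoted in this thread.
my $dist = 'B/BI/BIRNEY/bioperl-1.4.tar.gz';

if (@ARGV && $ARGV[0] eq '--run') {
    # Load CPAN.pm only when an install is really requested.
    require CPAN;
    CPAN::Shell->install($dist);
}
else {
    # Dry run: show the equivalent one-liner instead of installing.
    print qq{perl -MCPAN -e 'CPAN::Shell->install("$dist")'\n};
}
```

Run without arguments it only prints the command; with '--run' it behaves like typing the install line at the cpan> prompt.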


Brian O.


-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Dave Howorth
Sent: Wednesday, February 25, 2004 9:06 AM
To: bioperl-l@bioperl.org
Subject: [Bioperl-l] CPAN install

I'm trying to upgrade my bioperl installation to 1.4. I want to do it
from CPAN, using perl -MCPAN because that's the way I install Perl
software. I'm having trouble, so now I have an extra reason: to see if
the install process needs fixing :)  I have a couple of questions:

(1) On <http://www.bioperl.org/Core/Latest/> it says
"Information on how to use CPAN.pm to automatically download BioPerl and
various CPAN module dependencies is described on our INSTALL file." but
the install file <http://www.bioperl.org/Core/Latest/INSTALL> does not
contain this information as far as I can see? (it describes how to
install Bundle::Bioperl that way, but not Bioperl itself)

(2) What is the name of the distribution? The link takes me to
bioperl-1.4 but I get an error when I try to install it:

perl -MCPAN -e shell
cpan> install bioperl-1.4
Warning: Cannot install bioperl-1.4, don't know what it is.
Try the command

     i /bioperl-1.4/

to find objects with matching identifiers.

The suggested search returns a long list of individual modules. I get
similar errors if I try 'bioperl' or 'Bioperl'.

(3) There is a module called Bioperl, which says it is part of the
bioperl-1.4 distribution, but describes itself as "Bioperl 1.3 - Perl
Modules for Biology"

Any ideas?

Thanks, Dave
--
Dave Howorth
MRC Centre for Protein Engineering
Hills Road, Cambridge, CB2 2QH
01223 252960

_______________________________________________
Bioperl-l mailing list
Bioperl-l@portal.open-bio.org
http://portal.open-bio.org/mailman/listinfo/bioperl-l


From dhoworth at mrc-lmb.cam.ac.uk  Wed Feb 25 09:45:13 2004
From: dhoworth at mrc-lmb.cam.ac.uk (Dave Howorth)
Date: Wed Feb 25 09:51:16 2004
Subject: [Bioperl-l] CPAN install
In-Reply-To: <GAEDKMGOKFBLJPKCLKCCAEMNDHAA.brian_osborne@cognia.com>
References: <GAEDKMGOKFBLJPKCLKCCAEMNDHAA.brian_osborne@cognia.com>
Message-ID: <403CB4F9.9050304@mrc-lmb.cam.ac.uk>

Brian Osborne wrote:
> "install B/BI/BIRNEY/bioperl-1.4.tar.gz".

Thanks.

> However, you'll want to install the Bundle first; "d /BioPerl/" should give
> you the exact name.

I'd already upgraded that. 'install Bundle::BioPerl' works exactly as I 
would hope. And before that, upgrading libgd from source worked exactly 
as it should too :)

Thanks, Dave
-- 
Dave Howorth
MRC Centre for Protein Engineering
Hills Road, Cambridge, CB2 2QH
01223 252960

From hz5 at njit.edu  Wed Feb 25 09:49:54 2004
From: hz5 at njit.edu (hz5@njit.edu)
Date: Wed Feb 25 09:55:58 2004
Subject: [Bioperl-l] bioperl graphics
Message-ID: <1077720594.403cb61235be7@webmail.njit.edu>

Dear all,

Is there any way to render two Bio::Graphics::Panel objects into one PNG image? I
want two different arrows with differently labeled coordinates on the same image,
aligned to the left, but one Panel can have only one coordinate system.

Thanks!

=========================================================
Haibo Zhang, PhD student
Computational Biology, NJIT & Rutgers University
Center for Applied Genomics, PHRI
http://afs13.njit.edu/~hz5
From brian_osborne at cognia.com  Wed Feb 25 09:57:06 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Wed Feb 25 10:03:11 2004
Subject: [Bioperl-l] CPAN install
In-Reply-To: <403CB4F9.9050304@mrc-lmb.cam.ac.uk>
Message-ID: <GAEDKMGOKFBLJPKCLKCCEEMODHAA.brian_osborne@cognia.com>

Dave,

I forgot to mention that you may still get errors in your "make test",
despite the fact that you installed the Bundle. The question, perhaps, is
whether these failures have anything to do with your intended use of Bioperl.
I've never done an install of Bioperl that passed all of the tests myself.
Most of us who like the CPAN approach just do the "force install" at that
point.

Brian O.


From brian_osborne at cognia.com  Wed Feb 25 10:42:47 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Wed Feb 25 10:48:53 2004
Subject: [Bioperl-l] Re: CPAN install
In-Reply-To: <403CB212.4000601@mrc-lmb.cam.ac.uk>
Message-ID: <GAEDKMGOKFBLJPKCLKCCKENADHAA.brian_osborne@cognia.com>

Dave,

No idea, though when I do "perl t/DB.t" I also get failures. You can always
do the "force install" and thoroughly investigate after installation. The
Perl.t failure is odd, I'm guessing it's the RefSeq retrieval, but I'm not
sure. I don't think it has to do with your local installation. Same thing
for the RestrictionIO.t failure.

>It might be worth documenting this in the install instructions or fixing
>the distribution so it's not necessary.

Yes.

Brian O.


From crabtree at tigr.org  Wed Feb 25 11:46:29 2004
From: crabtree at tigr.org (Jonathan Crabtree)
Date: Wed Feb 25 11:54:01 2004
Subject: [Bioperl-l] bioperl graphics
In-Reply-To: <1077720594.403cb61235be7@webmail.njit.edu>
References: <1077720594.403cb61235be7@webmail.njit.edu>
Message-ID: <403CD165.4090309@tigr.org>


Haibo-

hz5@njit.edu wrote:

>Is there any way to render 2 Bio::Graphics::Panel into one png image? because I 
>want 2 different arrows with different labeled coordinates on the same image 
>and align to the left, but one Panel can only have one coordinates system.
>  
>

The answer is yes, with a couple of caveats.  The first is that you will 
have to take care of the layout of the individual Panel-generated 
images.  If you're left-justifying everything then this should be easy 
enough.  The second is that I would recommend making a one-line change 
to Bio/Graphics/Panel.pm, to prevent the package from trying to allocate 
the same set of colors twice (when you reuse the same GD object to draw 
the two different parts of the image.)  Search for the following piece 
of code in Panel.pm (at line 411 in bioperl-1.4, I think):

  for my $name ('white','black',keys %COLORS) {
    my $idx = $gd->colorAllocate(@{$COLORS{$name}});
    $translation_table{$name} = $idx;
  }

Change "colorAllocate" to "colorResolve"; this should have no effect on 
any existing Bio::Graphics code (AFAIK) and will allow you to do your 
two (or three or four)-Panel trick.  (As an aside, I'd like to lobby for 
this one-line change to be made in a future version of 
Bio::Graphics::Panel, for precisely this reason.)  In any case, once 
you've made that change and reinstalled your copy of Bioperl, here is a 
rough outline of what you need to do:

1. Set up your individual Bio::Graphics::Panel objects (e.g. $p1, $p2, 
$p3, etc.) as desired to draw your images, but do *not* call the gd 
method on any of them yet.

2. Create a GD::Image object big enough to hold the images that will be 
drawn by $p1, $p2, $p3, etc.:
    my $gdImg = GD::Image->new($fullWidth, $fullHeight);
(Note: use $p1->width(), $p1->height(), etc., to determine what 
$fullWidth and $fullHeight should be, based on your desired Panel layout 
algorithm.)

3. Use a "dummy" Bio::Graphics::Panel object to allocate all your colors 
(this is an optional step; I do this because my code does some drawing 
that isn't handled by Bio::Graphics::Panel, and want to make sure that 
the palette has been allocated before I start):

    my $dummyPanel = Bio::Graphics::Panel->new(-length => 100,
                                               -offset => 0,
                                               -width  => $fullWidth);
    $dummyPanel->gd($gdImg); # forces color allocation

4. Draw the individual panels and generate your png image:

    $p1->gd($gdImg);
    $p2->gd($gdImg);
    my $pngData = $gdImg->png();

I've glossed over some of the details here, for example the fact that 
you may need to know the value of $p1->height() before you can 
initialize $p2, but that's the basic idea.  I've been using this method 
to generate some comparative sequence displays and while it's definitely 
a bit of a hack, it works well in practice.  You can also do the same 
thing with a GD::SVG::Image if you'd like to generate SVG output.  Good 
luck,

Jonathan


From cjfields at uiuc.edu  Wed Feb 25 12:26:45 2004
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed Feb 25 18:05:57 2004
Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows
Message-ID: <6.0.0.22.2.20040225104734.01c13840@express.cites.uiuc.edu>

An HTML attachment was scrubbed...
URL: http://portal.open-bio.org/pipermail/bioperl-l/attachments/20040225/aa08846c/attachment.htm
From brian_osborne at cognia.com  Thu Feb 26 08:22:29 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Thu Feb 26 08:28:35 2004
Subject: [Bioperl-l] DBFetch problem
Message-ID: <GAEDKMGOKFBLJPKCLKCCKENKDHAA.brian_osborne@cognia.com>

>Dear Brian,

>I found out that the RNA entries are missing from our RefSeq at the moment,
>that is why the example entry is not found with dbfetch:

>LOCUS       NM_006732               3775 bp    mRNA    linear   PRI 20-DEC-2003

>This will be fixed today, so by tomorrow you should find the same
>entries as from NCBI website. NCBI has changed their RefSeq
>distribution files and most of the RNA entries were accidentally
>left outside the distribution.


From sdavis2 at mail.nih.gov  Thu Feb 26 16:01:53 2004
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu Feb 26 16:08:02 2004
Subject: [Bioperl-l] Mysql database connection from Gbrowse
Message-ID: <BC63C8F1.4EC3%sdavis2@mail.nih.gov>

Lincoln and others,

With regard to my last e-mail, I simply used bp_load_gff.pl and was able to
load the human gff files from gmod.org within an hour or so.

In any case, I have a working mysql database called human that I can access
via mysql command line.  I also have a working version of gbrowse that
connects to the yeast_chr1 memory database.  (I also connected to the DAS
server at ucsc--WAY COOL.)

I am now trying to connect to my mysql human database and get the following
returned to the browser.

http://localhost/cgi-bin/gbrowse/human

-----------------
An internal error has occurred

Could not open database.
install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC
contains: /System/Library/Perl/5.8.1/darwin-thread-multi-2level
/System/Library/Perl/5.8.1 /Library/Perl/5.8.1/darwin-thread-multi-2level
/Library/Perl/5.8.1 /Library/Perl
/Network/Library/Perl/5.8.1/darwin-thread-multi-2level
/Network/Library/Perl/5.8.1 /Network/Library/Perl .) at (eval 18) line 3.
Perhaps the DBD::mysql perl module hasn't been fully installed,
or perhaps the capitalisation of 'mysql' isn't right.
Available drivers: ExampleP, Multiplex, Proxy, Sponge.
 at /Library/Perl/5.8.1/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm line 139


 Please contact this site's maintainer ([no address given]) for assistance.

 For the source code for this browser, see the  Generic Model Organism
Database Project. For other questions, send mail to lstein@cshl.org.
$Id: yeast_chr1.conf,v 1.6 2004/02/04 15:03:43 marclogghe Exp $

 Note: This page uses cookie to save and restore preference information. No
information is shared.
Generic genome browser version Bio::Graphics::Browser=HASH(0x801294)
------------------

However, this works:
perl -e "use Bio::DB::GFF::Adaptor::dbi::mysql"
As does
perl -e "use DBD::mysql"


From the human.conf file:
[GENERAL]
description   = human
db_adaptor    = Bio::DB::GFF
db_args       = -adaptor dbi::mysql
                -dsn     human

Any ideas?  

Thanks,
Sean

From lstein at cshl.edu  Fri Feb 27 09:38:31 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Fri Feb 27 09:45:00 2004
Subject: [Bioperl-l] bioperl graphics
In-Reply-To: <403CD165.4090309@tigr.org>
References: <1077720594.403cb61235be7@webmail.njit.edu>
	<403CD165.4090309@tigr.org>
Message-ID: <200402271638.37134.lstein@cshl.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sorry, but if you change colorAllocate() to colorResolve(), you will 
break the ability to generate publication-quality images with 
GD::SVG.  Perhaps Todd Harris will add colorResolve() to a future 
version of GD::SVG, in which case I will make the suggested change to 
Bio::Graphics.

I would recommend instead making two Bio::Graphics::Panel objects, and 
generating a pair of GD objects (using the Panel->gd() method).  Then 
you can combine them onto a third GD object in whatever geometry you 
want by using GD->copy()
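
A minimal sketch of that copy-based approach, assuming GD and Bio::Graphics
are installed and that the panels should be stacked vertically and aligned
to the left. The helper names stack_layout and combine_panels are invented
for this example and are not part of Bio::Graphics.

```perl
use strict;
use warnings;

# Hypothetical helper: given a list of [width, height] pairs, compute the
# canvas size and the y offset of each image when the images are stacked
# vertically and left-aligned.
sub stack_layout {
    my @sizes = @_;
    my ($full_width, $full_height) = (0, 0);
    my @y_offsets;
    for my $size (@sizes) {
        my ($w, $h) = @$size;
        push @y_offsets, $full_height;     # this image starts where the last one ended
        $full_width = $w if $w > $full_width;
        $full_height += $h;
    }
    return ($full_width, $full_height, \@y_offsets);
}

# Hypothetical helper: render each panel to its own GD image, then copy
# the images onto one canvas. GD is loaded lazily so that stack_layout()
# can be tried without it.
sub combine_panels {
    my @panels = @_;
    require GD;
    my @images = map { $_->gd } @panels;             # one GD::Image per panel
    my @sizes  = map { [ $_->getBounds ] } @images;  # (width, height) of each
    my ($w, $h, $y) = stack_layout(@sizes);

    my $canvas = GD::Image->new($w, $h);
    $canvas->colorAllocate(255, 255, 255);           # first allocated color is the background
    for my $i (0 .. $#images) {
        my ($iw, $ih) = @{ $sizes[$i] };
        # copy(source, dstX, dstY, srcX, srcY, width, height)
        $canvas->copy($images[$i], 0, $y->[$i], 0, 0, $iw, $ih);
    }
    return $canvas;
}

# Layout only; GD is not needed for this part.
my ($w, $h, $y) = stack_layout([800, 120], [600, 90]);
print "canvas ${w}x${h}, offsets @$y\n";   # prints "canvas 800x210, offsets 0 120"
```

With two panels $p1 and $p2 already set up, something like
"binmode STDOUT; print combine_panels($p1, $p2)->png;" would write the
combined image.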

Lincoln

On Wednesday 25 February 2004 06:46 pm, Jonathan Crabtree wrote:
> Haibo-
>
> hz5@njit.edu wrote:
> >Is there any way to render 2 Bio::Graphics::Panel into one png
> > image? because I want 2 different arrows with different labeled
> > coordinates on the same image and align to the left, but one
> > Panel can only have one coordinates system.
>
> The answer is yes, with a couple of caveats.  The first is that you
> will have to take care of the layout of the individual
> Panel-generated images.  If you're left-justifying everything then
> this should be easy enough.  The second is that I would recommend
> making a one-line change to Bio/Graphics/Panel.pm, to prevent the
> package from trying to allocate the same set of colors twice (when
> you reuse the same GD object to draw the two different parts of the
> image.)  Search for the following piece of code in Panel.pm (at
> line 411 in bioperl-1.4, I think):
>
>   for my $name ('white','black',keys %COLORS) {
>     my $idx = $gd->colorAllocate(@{$COLORS{$name}});
>     $translation_table{$name} = $idx;
>   }
>
> Change "colorAllocate" to "colorResolve"; this should have no
> effect on any existing Bio::Graphics code (AFAIK) and will allow
> you to do your two (or three or four)-Panel trick.  (As an aside,
> I'd like to lobby for this one-line change to be made in a future
> version of
> Bio::Graphics::Panel, for precisely this reason.)  In any case,
> once you've made that change and reinstalled your copy of Bioperl,
> here is a rough outline of what you need to do:
>
> 1. Set up your individual Bio::Graphics::Panel objects (e.g. $p1,
> $p2, $p3, etc.) as desired to draw your images, but do *not* call
> the gd method on any of them yet.
>
> 2. Create a GD::Image object big enough to hold the images that
> will be drawn by $p1, $p2, $p3, etc.:
>     my $gdImg = GD::Image->new($fullWidth, $fullHeight);
> (Note: use $p1->width(), $p1->height(), etc., to determine what
> $fullWidth and $fullHeight should be, based on your desired Panel
> layout algorithm.)
>
> 3. Use a "dummy" Bio::Graphics::Panel object to allocate all your
> colors (this is an optional step; I do this because my code does
> some drawing that isn't handled by Bio::Graphics::Panel, and want
> to make sure that the palette has been allocated before I start):
>
>     my $dummyPanel = Bio::Graphics::Panel->new(-length => 100,
>                                                -offset => 0,
>                                                -width  => $fullWidth);
>     $dummyPanel->gd($gdImg); # forces color allocation
>
> 4. Draw the individual panels and generate your png image:
>
>     $p1->gd($gdImg);
>     $p2->gd($gdImg);
>     my $pngData = $gdImg->png();
>
> I've glossed over some of the details here, for example the fact
> that you may need to know the value of $p1->height() before you can
> initialize $p2, but that's the basic idea.  I've been using this
> method to generate some comparative sequence displays and while
> it's definitely a bit of a hack, it works well in practice.  You
> can also do the same thing with a GD::SVG::Image if you'd like to
> generate SVG output.  Good luck,
>
> Jonathan

- -- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFAP1Zt0CIvUP7P+AkRAreyAJ0XIcjMDeT/Bw69OBOEhD8tsznP+QCfVLWo
+RnQaijXxPlVWTbmjTkbHYw=
=lN1U
-----END PGP SIGNATURE-----
From lstein at cshl.edu  Fri Feb 27 09:45:17 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Fri Feb 27 09:52:45 2004
Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows
In-Reply-To: <6.0.0.22.2.20040225104734.01c13840@express.cites.uiuc.edu>
References: <6.0.0.22.2.20040225104734.01c13840@express.cites.uiuc.edu>
Message-ID: <200402271645.18011.lstein@cshl.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Chris,

Do you want to try the bioperl-1.4 ppm located on this repository?

	http://www.gmod.org/ggb/ppm

I put it together myself and it's the one that seems to work properly 
for me.

Lincoln

On Wednesday 25 February 2004 07:26 pm, Chris Fields wrote:
>  I was unable to get the PPM package for 1.4 working for Windows
> from http://bioperl.org/DIST and had to perform a workaround.  I
> decided to post it in case others were running into problems.
>
>  When I first tried installing Bioperl using PPM, it installs
> bioperl 1.2 first (!?!), then allows upgrading to 1.2.3.  However,
> it will not install 1.4 b/c of the additional dependencies
> (HTML-Entities and IO-Scalar).  The latter dependencies are notably
> not req'd for 1.2 or 1.2.3.  IMHO, I'm guessing that PPM can't find
> these modules b/c it is looking for specific ppm packages named
> HTML-Entities and IO-Scalar, not for the modules named
> HTML-Entities and IO-Scalar (which are included in the packages
> HTML-Parser and IO-stringy).  This problem could be linked to the
> version of PPM I'm using (3.1) on ActivePerl 5.8.3-809, both of
> which are very new, so I have no idea if this is a problem with
> older versions of PPM.
>
>  The workaround was to remove the dependencies manually.  I
> downloaded the relevant ppm tar file and corresponding ppd files
> (bioperl-1.4-ppm.tar.gz and Bioperl-1.4.ppd, respectively) to a
> local directory (C:\Perl\Bioperl).  Using a text editor, I removed
> all references to the added dependencies and saved the file.  More
> specifically, I deleted the following lines, listed twice under
> Implementations (so delete both sets!):
>
>
> <DEPENDENCY NAME="HTML-Entities" VERSION="0,0,0,0" />
> <DEPENDENCY NAME="IO-Scalar" VERSION="0,0,0,0" />
>
>
> I then entered PPM, set up a local ppd repository:
>
>  rep add local_bio "C:/Perl/Bioperl"
>
>  I then searched for and installed the modified PPM file and it
> worked.
>
>  Like I said, I don't know if this is a PPM issue or not.  However,
> I think it might be a good idea to remove those dependencies just
> in case, as they are a bit redundant (both HTML-Parser and
> IO-stringy are already listed).
>
>  My two cents...
>  __________________________________
>
>
>
> Chris Fields - Postdoctoral Researcher
>  Lab of Dr. Robert Switzer
>
>  Address:
>
>  University of Illinois at Urbana-Champaign
>  Dept. of Biochemistry - 323 RAL
>  600 S. Mathews Ave.
>  Urbana, IL 61801
>
>  Phone : (217) 333-7098
>  Fax : (217) 244-5858

- -- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFAP1f90CIvUP7P+AkRAkyOAJ9BmoqcV3DC4zJh392bIveOQ9ec6wCfVMbb
EKv61liRTU8XfEeQ1yg6EeU=
=IP7P
-----END PGP SIGNATURE-----
From Glez-Izarzugaza at lycos.es  Thu Feb 26 22:55:21 2004
From: Glez-Izarzugaza at lycos.es (=?iso-8859-1?Q?Jose_M=AA_Glez_Izarzugaza?=)
Date: Fri Feb 27 09:59:39 2004
Subject: [Bioperl-l] Breadth-First Search Algorithm - BFS
Message-ID: <009001c3fce5$8624e0e0$5be625d5@txema>

Hello everyone,

I'm working with a graph and I need to calculate the values of C and L, to do so, I need an algorithm to calculate the distance to the other elements. 

A good one is BFS algorithm. 

I tried to write the script (the algorithm itself) in Perl but I got absolutely lost. 

Can anyone help me?

Thanks in advance,
Alquemius 

PS: Please, send me only-text mails
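
For reference, a minimal sketch of breadth-first search in plain Perl,
assuming the graph is unweighted and stored as a hash of adjacency lists.
The function name bfs_distances and the example graph are made up for
illustration. It returns the distance, in edges, from one start node to
every node reachable from it.

```perl
use strict;
use warnings;

# Breadth-first search over an unweighted graph stored as a hash of
# adjacency lists. Returns a hash ref mapping each reachable node to
# its distance (number of edges) from $start.
sub bfs_distances {
    my ($graph, $start) = @_;
    my %dist  = ($start => 0);
    my @queue = ($start);
    while (@queue) {
        my $node = shift @queue;                  # FIFO: oldest node first
        for my $next (@{ $graph->{$node} || [] }) {
            next if exists $dist{$next};          # already visited
            $dist{$next} = $dist{$node} + 1;
            push @queue, $next;
        }
    }
    return \%dist;
}

# Made-up example graph (undirected, so each edge is listed both ways).
my %graph = (
    a => [qw(b c)],
    b => [qw(a d)],
    c => [qw(a d)],
    d => [qw(b c e)],
    e => [qw(d)],
);

my $dist = bfs_distances(\%graph, 'a');
print "$_: $dist->{$_}\n" for sort keys %$dist;
```

Running the search once from each node gives all pairwise distances. Nodes
that cannot be reached from the start node are simply absent from the
result.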
From lstein at cshl.edu  Fri Feb 27 09:52:37 2004
From: lstein at cshl.edu (Lincoln Stein)
Date: Fri Feb 27 09:59:48 2004
Subject: [Bioperl-l] Mysql database connection from Gbrowse
In-Reply-To: <BC63C8F1.4EC3%sdavis2@mail.nih.gov>
References: <BC63C8F1.4EC3%sdavis2@mail.nih.gov>
Message-ID: <200402271652.37417.lstein@cshl.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Sean,

I can think of two explanations for this:

	1) you have two versions of perl and the one that you call when you
		are on the command line is different from the one that the
		CGI script calls

	2) you have two installations of the Perl library files, and the
		PERL5LIB environment variable is different under the CGI
		script than when you are logged in yourself.

Does this ring any bells?
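
One way to check both possibilities is a small diagnostic script, run once
from the command line and once through the web server from the cgi-bin
directory. This is only a sketch: the function name perl_env_report is made
up, and the first line should name the same perl that the gbrowse script's
first line names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a plain-text report of which perl is running and where it
# looks for modules.
sub perl_env_report {
    my @lines = (
        "perl binary: $^X",
        "perl version: $]",
        'PERL5LIB: ' . (defined $ENV{PERL5LIB} ? $ENV{PERL5LIB} : '(not set)'),
        '@INC:',
    );
    push @lines, "  $_" for @INC;

    # Report whether DBD::mysql can be found on this @INC.
    my $found = eval { require DBD::mysql; 1 };
    push @lines, 'DBD::mysql: '
        . ($found ? "found at $INC{'DBD/mysql.pm'}" : 'NOT found');
    return join("\n", @lines) . "\n";
}

print "Content-type: text/plain\n\n";   # needed under CGI, harmless on the command line
print perl_env_report();
```

If the two reports differ in the perl binary or in @INC, that difference is
the place to look.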

Lincoln

On Thursday 26 February 2004 11:01 pm, Sean Davis wrote:
> Lincoln and others,
>
> With regard to my last e-mail, I simply used bp_load_gff.pl and was
> able to load the human gff files from gmod.org within an hour or
> so.
>
> In any case, I have a working mysql database called human that I
> can access via mysql command line.  I also have a working version
> of gbrowse that connects to the yeast_chr1 memory database.  (I
> also connected to the DAS server at ucsc--WAY COOL.)
>
> I am now trying to connect to my mysql human database and get the
> following returned to the browser.
>
> http://localhost/cgi-bin/gbrowse/human
>
> -----------------
> An internal error has occurred
>
> Could not open database.
> install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC
> (@INC contains:
> /System/Library/Perl/5.8.1/darwin-thread-multi-2level
> /System/Library/Perl/5.8.1
> /Library/Perl/5.8.1/darwin-thread-multi-2level /Library/Perl/5.8.1
> /Library/Perl
> /Network/Library/Perl/5.8.1/darwin-thread-multi-2level
> /Network/Library/Perl/5.8.1 /Network/Library/Perl .) at (eval 18)
> line 3. Perhaps the DBD::mysql perl module hasn't been fully
> installed, or perhaps the capitalisation of 'mysql' isn't right.
> Available drivers: ExampleP, Multiplex, Proxy, Sponge.
>  at /Library/Perl/5.8.1/Bio/DB/GFF/Adaptor/dbi/caching_handle.pm
> line 139
>
>
>  Please contact this site's maintainer ([no address given]) for
> assistance.
>
>  For the source code for this browser, see the  Generic Model
> Organism Database Project. For other questions, send mail to
> lstein@cshl.org. $Id: yeast_chr1.conf,v 1.6 2004/02/04 15:03:43
> marclogghe Exp $
>
>  Note: This page uses cookie to save and restore preference
> information. No information is shared.
> Generic genome browser version
> Bio::Graphics::Browser=HASH(0x801294) ------------------
>
> However, this works:
> perl -e "use Bio::DB::GFF::Adaptor::dbi::mysql"
> As does
> perl -e "use DBD::mysql"
>
>
> From the human.conf file:
> [GENERAL]
> description   = human
> db_adaptor    = Bio::DB::GFF
> db_args       = -adaptor dbi::mysql
>                 -dsn     human
>
> Any ideas?
>
> Thanks,
> Sean

- -- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQFAP1m10CIvUP7P+AkRAhYgAJ98eOkEOIBuVbOsp+p0l64BJOQVIQCfX1g6
BKpWVoq+N2ltfmVYPMazrZY=
=mYfN
-----END PGP SIGNATURE-----
From laurichj at cs.ucr.edu  Thu Feb 26 12:45:35 2004
From: laurichj at cs.ucr.edu (Josh Lauricha)
Date: Fri Feb 27 10:01:00 2004
Subject: [Bioperl-l] Re: Bio ::seqIO ::tigr
In-Reply-To: <BAY12-F52D3jzJUePld0000630b@hotmail.com>
References: <BAY12-F52D3jzJUePld0000630b@hotmail.com>
Message-ID: <94AAA922-6883-11D8-ABEB-000A95BBDAD2@cs.ucr.edu>

Does the source_term_id refer to the source_tag()?

On Feb 26, 2004, at 9:08 AM, matthieu CONTE wrote:

> OK, I managed to use load_seqdatabase!
> But there is another problem:
> there is a null field, and I think BioSQL doesn't accept this.
> Table Seqfeature, field 'source_term_id'
>
> Do you think it would be better to modify tigrxml.dtd or the
> load_seqdatabase script?
>
>
> [conte@bearn biosql]$ perl load_seqdatabase.pl --dbuser biosql --dbpass biosql --namespace orysa_tigr --format tigr /home/conte/pipeline_orthologues/data/orysa_tigr/chr07.xml
> Loading /home/conte/pipeline_orthologues/data/orysa_tigr/chr07.xml ...
>
> -------------------- WARNING ---------------------
> MSG: insert in Bio::DB::BioSQL::SeqFeatureAdaptor (driver) failed, values were ("","1") FKs (26216,37,<NULL>)
> Column 'source_term_id' cannot be null
> ---------------------------------------------------
> Could not store 8355.t01530:
> ------------- EXCEPTION  -------------
> MSG: create: object (Bio::SeqFeature::Generic) failed to insert or to be found by unique key
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:207
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253
> STACK Bio::DB::Persistent::PersistentObject::store /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:270
> STACK Bio::DB::BioSQL::SeqAdaptor::store_children /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/BioSQL/SeqAdaptor.pm:246
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:215
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253
> STACK Bio::DB::Persistent::PersistentObject::store /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/DB/Persistent/PersistentObject.pm:270
> STACK (eval) load_seqdatabase.pl:517
> STACK toplevel load_seqdatabase.pl:500
>
> -----------------------------------------------------------
> Matthieu CONTE
> M. Sc. in Bioinformatics from SIB
>
> CIRAD-Biotrop TA40/03
> Avenue Agropolis
> 34398 Montpellier Cedex 5
> FRANCE
>
> m_conte@hotmail.com
> tel: (33)04 67 61 60 21
> fax :(33) 4 67 61 56 05
>
> -----------------------------------------------------------
>
>
>
>
>
>> From: Josh Lauricha <laurichj@cs.ucr.edu>
>> To: "matthieu CONTE" <m_conte@hotmail.com>
>> Subject: Re: [Bioperl-l] Re: Bio ::seqIO ::tigr
>> Date: Wed, 25 Feb 2004 08:50:39 -0800
>>
>> Thanks for pointing out the typos (the other one is my e-mail address  
>> ;).
>>
>> However, based on the size of the file you're using (the error is at
>> line 2892), I am willing to bet they are the .coordset files. These
>> are not the Tigr XML format. Actually, they are not even valid XML...
>> If this is the case (check by the extension or, if still in doubt,
>> open them up. If there is a <TIGR> tag on the first line then it's an
>> error in my parser), Jason wrote a parser for this that I can send to
>> you.
>>
>> On Feb 25, 2004, at 8:02 AM, matthieu CONTE wrote:
>>
>>> Hi,
>>> I tried your version of tigr.pm, Mr Lauricha. There is a typing
>>> mistake on line 820.
>>>
>>> Unfortunately, it still has another problem:
>>> "MSG: [2892]Required <ASSEMBLY_SEQUENCE> missing"
>>>
>>> -----------------------------------------------------------
>>> Matthieu CONTE
>>> M. Sc. in Bioinformatics from SIB
>>>
>>> CIRAD-Biotrop TA40/03
>>> Avenue Agropolis
>>> 34398 Montpellier Cedex 5
>>> FRANCE
>>>
>>> m_conte@hotmail.com
>>> tel: (33)04 67 61 60 21
>>> fax :(33) 4 67 61 56 05
>>>
>>> -----------------------------------------------------------
>>>
>>>
>>>
>> Josh Lauricha
>> laurichj@bioinfo.ucr.edu
>> OpenPGP: 5A0D 92D3 D093 79DE F724 1137 6DF1 B5EB D9CE AAA8
>> << PGP.sig >>
>
>
>
Josh Lauricha
laurichj@bioinfo.ucr.edu
OpenPGP: 5A0D 92D3 D093 79DE F724 1137 6DF1 B5EB D9CE AAA8

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 194 bytes
Desc: This is a digitally signed message part
Url : http://portal.open-bio.org/pipermail/bioperl-l/attachments/20040226/27dfbabf/PGP.bin
From todd.harris at cshl.edu  Fri Feb 27 10:13:51 2004
From: todd.harris at cshl.edu (Todd Harris)
Date: Fri Feb 27 10:20:09 2004
Subject: [Bioperl-l] bioperl graphics
In-Reply-To: <200402271638.37134.lstein@cshl.edu>
Message-ID: <BC64BACF.C2A7%todd.harris@cshl.edu>

Hi Lincoln - 

I'll add this to GD::SVG next week and drop you a line when complete.

I'm also planning to add code that will allow one to fall back onto GD if a
method has not been mapped to the GD::SVG namespace - and if that still
doesn't work within the SVG gestalt, to die with a modicum of grace.

t

> On 2/27/04 8:38 AM, Lincoln Stein wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Sorry, but if you change colorAllocate() to colorResolve(), you will
> break the ability to generate publication-quality images with
> GD::SVG.  Perhaps Todd Harris will add colorResolve() to a future
> version of GD::SVG, in which case I will make the suggested change to
> Bio::Graphics.
> 
> I would recommend instead making two Bio::Graphics::Panel objects, and
> generating a pair of GD objects (using the Panel->gd() method).  Then
> you can combine them onto a third GD object in whatever geometry you
> want by using GD->copy()
> 
> Lincoln
> 
> On Wednesday 25 February 2004 06:46 pm, Jonathan Crabtree wrote:
>> Haibo-
>> 
>> hz5@njit.edu wrote:
>>> Is there any way to render 2 Bio::Graphics::Panel into one png
>>> image? because I want 2 different arrows with different labeled
>>> coordinates on the same image and align to the left, but one
>>> Panel can only have one coordinates system.
>> 
>> The answer is yes, with a couple of caveats.  The first is that you
>> will have to take care of the layout of the individual
>> Panel-generated images.  If you're left-justifying everything then
>> this should be easy enough.  The second is that I would recommend
>> making a one-line change to Bio/Graphics/Panel.pm, to prevent the
>> package from trying to allocate the same set of colors twice (when
>> you reuse the same GD object to draw the two different parts of the
>> image.)  Search for the following piece of code in Panel.pm (at
>> line 411 in bioperl-1.4, I think):
>> 
>>   for my $name ('white','black',keys %COLORS) {
>>     my $idx = $gd->colorAllocate(@{$COLORS{$name}});
>>     $translation_table{$name} = $idx;
>>   }
>> 
>> Change "colorAllocate" to "colorResolve"; this should have no
>> effect on any existing Bio::Graphics code (AFAIK) and will allow
>> you to do your two (or three or four)-Panel trick.  (As an aside,
>> I'd like to lobby for this one-line change to be made in a future
>> version of
>> Bio::Graphics::Panel, for precisely this reason.)  In any case,
>> once you've made that change and reinstalled your copy of Bioperl,
>> here is a rough outline of what you need to do:
>> 
>> 1. Set up your individual Bio::Graphics::Panel objects (e.g. $p1,
>> $p2, $p3, etc.) as desired to draw your images, but do *not* call
>> the gd method on any of them yet.
>> 
>> 2. Create a GD::Image object big enough to hold the images that
>> will be drawn by $p1, $p2, $p3, etc.:
>>     my $gdImg = GD::Image->new($fullWidth, $fullHeight);
>> (Note: use $p1->width(), $p1->height(), etc., to determine what
>> $fullWidth and $fullHeight should be, based on your desired Panel
>> layout algorithm.)
>> 
>> 3. Use a "dummy" Bio::Graphics::Panel object to allocate all your
>> colors (this is an optional step; I do this because my code does
>> some drawing that isn't handled by Bio::Graphics::Panel, and want
>> to make sure that the palette has been allocated before I start):
>> 
>>     my $dummyPanel = Bio::Graphics::Panel->new(-length => 100,
>>         -offset => 0, -width => $fullWidth);
>>     $dummyPanel->gd($gdImg); # forces color allocation
>> 
>> 4. Draw the individual panels and generate your png image:
>> 
>>     $p1->gd($gdImg);
>>     $p2->gd($gdImg);
>>     my $pngData = $gdImg->png();
>> 
>> I've glossed over some of the details here, for example the fact
>> that you may need to know the value of $p1->height() before you can
>> initialize $p2, but that's the basic idea.  I've been using this
>> method to generate some comparative sequence displays and while
>> it's definitely a bit of a hack, it works well in practice.  You
>> can also do the same thing with a GD::SVG::Image if you'd like to
>> generate SVG output.  Good luck,
>> 
>> Jonathan
>> 
>> 
>> 
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l@portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> 
> - -- 
> Lincoln D. Stein
> Cold Spring Harbor Laboratory
> 1 Bungtown Road
> Cold Spring Harbor, NY 11724
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.1 (GNU/Linux)
> 
> iD8DBQFAP1Zt0CIvUP7P+AkRAreyAJ0XIcjMDeT/Bw69OBOEhD8tsznP+QCfVLWo
> +RnQaijXxPlVWTbmjTkbHYw=
> =lN1U
> -----END PGP SIGNATURE-----
> 
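
Lincoln's suggestion above (generate one GD object per Panel, then
combine them onto a third with GD->copy()) can be sketched roughly as
follows. This is only an illustration, not code from Bio::Graphics:
the names stack_layout and stack_images are hypothetical, and the
left-justified vertical stacking is an assumption taken from the
original question. The only GD::Image methods relied on are
getBounds() and copy().

```perl
use strict;
use warnings;

# Compute a left-justified vertical stack for images of the given sizes.
# Takes a list of [width, height] pairs; returns the canvas width, the
# canvas height, and an array ref of y offsets (one per image).
sub stack_layout {
    my @sizes = @_;
    my ($full_width, $y, @offsets) = (0, 0);
    for my $size (@sizes) {
        my ($w, $h) = @$size;
        push @offsets, $y;                       # this image starts here
        $y += $h;                                # next image goes below it
        $full_width = $w if $w > $full_width;    # canvas fits the widest
    }
    return ($full_width, $y, \@offsets);
}

# Copy each image onto $canvas at x = 0 and its stacked y offset.
# $canvas and the images are assumed to behave like GD::Image objects.
sub stack_images {
    my ($canvas, @images) = @_;
    my (undef, undef, $offsets) =
        stack_layout(map { [ $_->getBounds ] } @images);
    for my $i (0 .. $#images) {
        my ($w, $h) = $images[$i]->getBounds;
        # GD::Image::copy(source, dstX, dstY, srcX, srcY, width, height)
        $canvas->copy($images[$i], 0, $offsets->[$i], 0, 0, $w, $h);
    }
    return $canvas;
}

# Usage sketch (assumes GD and Bio::Graphics are installed, and that $p1
# and $p2 are Bio::Graphics::Panel objects that are already set up):
#   my @images  = map { $_->gd } ($p1, $p2);
#   my ($w, $h) = stack_layout(map { [ $_->getBounds ] } @images);
#   my $canvas  = GD::Image->new($w, $h);
#   binmode STDOUT;
#   print stack_images($canvas, @images)->png;
```

Because each Panel draws into its own GD object here, Panel.pm does not
need to be patched; the price is one extra copy per panel.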

From cjfields at uiuc.edu  Fri Feb 27 10:47:18 2004
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri Feb 27 10:53:27 2004
Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows
In-Reply-To: <200402271645.18011.lstein@cshl.edu>
References: <6.0.0.22.2.20040225104734.01c13840@express.cites.uiuc.edu>
	<200402271645.18011.lstein@cshl.edu>
Message-ID: <6.0.0.22.2.20040227094211.01bfc0c8@express.cites.uiuc.edu>

I already have it installed and it seems to be fine.

I have just one question (and I don't want to start a flame war): what OS 
do you find that Bioperl works best on?  I'm using a dual-boot system with 
Windows XP and Fedora Core 1, and I've had fewer problems with Fedora (esp. 
when using GBrowse), but I don't know if this is due to the configuration 
of Bioperl, GBrowse, or Perl on either OS.  I'm considering going pure 
Linux within the year (although Mac OS X is looking very appealing).

Chris

At 08:45 AM 2/27/2004, Lincoln Stein wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>Hi Chris,
>
>Do you want to try the bioperl-1.4 ppm located on this repository?
>
>         http://www.gmod.org/ggb/ppm
>
>I put it together myself and it's the one that seems to work properly
>for me.
>
>Lincoln
>
>On Wednesday 25 February 2004 07:26 pm, Chris Fields wrote:
> >  I was unable to get the PPM package for 1.4 working for Windows
> > from http://bioperl.org/DIST and had to perform a workaround.  I
> > decided to post it in case others were running into problems.
> >
> >  When I first tried installing Bioperl using PPM, it installs
> > bioperl 1.2 first (!?!), then allows upgrading to 1.2.3.  However,
> > it will not install 1.4 b/c of the additional dependencies
> > (HTML-Entities and IO-Scalar).  The latter dependencies are notably
> > not req'd for 1.2 or 1.2.3.  IMHO, I'm guessing that PPM can't find
> > these modules b/c it is looking for specific ppm packages named
> > HTML-Entities and IO-Scalar, not for the modules named
> > HTML-Entities and IO-Scalar (which are included in the packages
> > HTML-Parser and IO-stringy).  This problem could be linked to the
> > version of PPM I'm using (3.1) on ActivePerl 5.8.3-809, both of
> > which are very new, so I have no idea if this is a problem with
> > older versions of PPM.
> >
> >  The workaround was to remove the dependencies manually.  I
> > downloaded the relevant ppm tar file and corresponding ppd files
> > (bioperl-1.4-ppm.tar.gz and Bioperl-1.4.ppd, respectively) to a
> > local directory (C:\Perl\Bioperl).  Using a text editor, I removed
> > all references to the added dependencies and saved the file.  More
> > specifically, I deleted the following lines, listed twice under
> > Implementations (so delete both sets!):
> >
> >
> > <DEPENDENCY NAME="HTML-Entities"
> > VERSION="0,0,0,0" />
> > <DEPENDENCY NAME="IO-Scalar" VERSION="0,0,0,0"
> > />
> >
> >
> > I then entered PPM, set up a local ppd repository:
> >
> >  rep add local_bio "C:/Perl/Bioperl"
> >
> >  I then searched for and installed the modified PPM file and it
> > worked.
> >
> >  Like I said, I don't know if this is a PPM issue or not.  However,
> > I think it might be a good idea to remove those dependencies just
> > in case, as they are a bit redundant (both HTML-Parser and
> > IO-stringy are already listed).
> >
> >  My two cents...
> >  __________________________________
> >
> >
> >
> > Chris Fields - Postdoctoral Researcher
> >  Lab of Dr. Robert Switzer
> >
> >  Address:
> >
> >  University of Illinois at Urbana-Champaign
> >  Dept. of Biochemistry - 323 RAL
> >  600 S. Mathews Ave.
> >  Urbana, IL 61801
> >
> >  Phone : (217) 333-7098
> >  Fax : (217) 244-5858
>
>- --
>Lincoln D. Stein
>Cold Spring Harbor Laboratory
>1 Bungtown Road
>Cold Spring Harbor, NY 11724
>-----BEGIN PGP SIGNATURE-----
>Version: GnuPG v1.2.1 (GNU/Linux)
>
>iD8DBQFAP1f90CIvUP7P+AkRAkyOAJ9BmoqcV3DC4zJh392bIveOQ9ec6wCfVMbb
>EKv61liRTU8XfEeQ1yg6EeU=
>=IP7P
>-----END PGP SIGNATURE-----
>_______________________________________________
>Bioperl-l mailing list
>Bioperl-l@portal.open-bio.org
>http://portal.open-bio.org/mailman/listinfo/bioperl-l

__________________________________

Chris Fields - Postdoctoral Researcher
Lab of Dr. Robert Switzer

Address:

University of Illinois at Urbana-Champaign
Dept. of Biochemistry - 323 RAL
600 S. Mathews Ave.
Urbana, IL 61801

Phone : (217) 333-7098
Fax : (217) 244-5858 

From brian_osborne at cognia.com  Fri Feb 27 10:57:14 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Fri Feb 27 11:03:26 2004
Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows
In-Reply-To: <6.0.0.22.2.20040227094211.01bfc0c8@express.cites.uiuc.edu>
Message-ID: <GAEDKMGOKFBLJPKCLKCCEEOHDHAA.brian_osborne@cognia.com>

Chris,

I'd always intended to have a double-boot Windows/Linux machine but I
thought I'd check out Cygwin, just for fun. I was so impressed with it
running Bioperl that I decided not to install Linux.

I must use Windows at work, so pure Unix is not an option for me. Someone
recently showed me Windows on a Linux machine running VMware, which was
impressive.

Brian O.




From sdavis2 at mail.nih.gov  Fri Feb 27 11:16:03 2004
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri Feb 27 11:22:02 2004
Subject: [Bioperl-l] Making gff files for ucsc or ncbi build
Message-ID: <BC64D773.4F0B%sdavis2@mail.nih.gov>

Does anyone know where the build files for build 34_2 are kept these days?
They used to be called "by chromosome" files, or some such thing.  I am
looking to generate gff files and corresponding fasta files for use with
Gbrowse.  If I make them, I would love to make them available to the
community (unless they are out there somewhere already).

Sean

From cjfields at uiuc.edu  Fri Feb 27 12:20:52 2004
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri Feb 27 12:26:59 2004
Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows
In-Reply-To: <GAEDKMGOKFBLJPKCLKCCEEOHDHAA.brian_osborne@cognia.com>
References: <6.0.0.22.2.20040227094211.01bfc0c8@express.cites.uiuc.edu>
	<GAEDKMGOKFBLJPKCLKCCEEOHDHAA.brian_osborne@cognia.com>
Message-ID: <6.0.0.22.2.20040227110605.01c86fa8@express.cites.uiuc.edu>

At 09:57 AM 2/27/2004, you wrote:
>Chris,
>
>I'd always intended to have a double-boot Windows/Linux machine but I
>thought I'd check out Cygwin, just for fun. I was so impressed with it
>running Bioperl that I decided not to install Linux.

I originally installed Linux as a means to an end (using software that 
isn't Windows friendly, like mfold).  It's hard to let go of the luxuries 
of Windows, though, especially when you have programs like Office, 
SigmaPlot, and others which make benchwork research so much easier (and 
require little to no programming experience).  I really like Linux from a 
number of aspects (open-source, development, etc).  However, I think that 
Apple has really hit upon something with OS X.  It is a nice combination of 
open- and closed-source (I don't mind paying for software, as long as it's 
reasonable) and isn't unreasonably priced.  I get the best of both worlds 
(closed-source software like Office, Endnotes, etc. with open-source 
software like MySQL and Apache, with a UNIX-based OS, and nice development 
tools).  Apple also is really pushing the bioinformatics angle.  The 
constant updates for both OS X and Linux make both much more appealing to me.

That's it, my next system is a G5!!!!  Now I'll just have to sell the car...

>I must use Windows at work so pure Unix is not an option for me. Someone
>recently showed me Windows on a Linux machine running VMWare, that was
>impressive.

I have managed to get a few things running under Wine (Windows emulation in 
Linux).  It works for certain things, but I haven't tried it out too much 
b/c I have a dual-boot system.  I just get tired of a number of 
Windows issues (I use Sun's Java VM and it crawls on Windows XP but flies 
on Linux).

Chris

>Brian O.

__________________________________

Chris Fields - Postdoctoral Researcher
Lab of Dr. Robert Switzer

Address:

University of Illinois at Urbana-Champaign
Dept. of Biochemistry - 323 RAL
600 S. Mathews Ave.
Urbana, IL 61801

Phone : (217) 333-7098
Fax : (217) 244-5858 

From jrs at denny.farviolet.com  Fri Feb 27 15:27:10 2004
From: jrs at denny.farviolet.com (Jeremy Semeiks)
Date: Fri Feb 27 15:33:01 2004
Subject: [Bioperl-l] OT: Breadth-First Search Algorithm - BFS
In-Reply-To: <009001c3fce5$8624e0e0$5be625d5@txema>
References: <009001c3fce5$8624e0e0$5be625d5@txema>
Message-ID: <20040227202710.GH595@64.81.242.180>

On Fri, Feb 27, 2004 at 04:55:21AM +0100, Jose M? Glez Izarzugaza wrote:
> Hello everyone,
> 
> I'm working with a graph and I need to calculate the values of C and L, to do so, I need an algorithm to calculate the distance to the other elements. 
> 
> A good one is BFS algorithm. 
> 
> I tried to write the script (the algorithm itself) in Perl but I got absolutely lost. 
> 
> Can anyone help me?

Hi Jose,

One solution is to use the Graph::Base module on CPAN:

http://search.cpan.org/~jhi/Graph-0.20101/lib/Graph/Base.pm

This module includes Dijkstra's shortest path algorithm. Dijkstra's is
a few times slower than BFS, but you shouldn't see a difference for
small graphs. (And in my experience, if you're trying to run search
algorithms on large graphs, Perl is too slow anyway -- consider
something like the C++ Boost Graph library instead.)

HTH,
Jeremy
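
For reference, a minimal sketch of the plain BFS asked about in the
original question, in core Perl with no CPAN modules. It assumes an
unweighted graph stored as a hash of adjacency lists; the function name
bfs_distances and the example graph are hypothetical. It returns the
distance, in edges, from one start node to every node reachable from it.

```perl
use strict;
use warnings;

# Breadth-first search over an unweighted graph stored as an adjacency
# list (hash of array refs). Returns a hash ref mapping each reachable
# node to its distance, in edges, from $start.
sub bfs_distances {
    my ($graph, $start) = @_;
    my %dist  = ($start => 0);
    my @queue = ($start);
    while (@queue) {
        my $node = shift @queue;                 # oldest node first (FIFO)
        for my $next (@{ $graph->{$node} || [] }) {
            next if exists $dist{$next};         # already reached earlier
            $dist{$next} = $dist{$node} + 1;
            push @queue, $next;
        }
    }
    return \%dist;
}

# Hypothetical example graph (undirected, so each edge is listed twice).
my %graph = (
    a => [qw(b c)],
    b => [qw(a d)],
    c => [qw(a d)],
    d => [qw(b c e)],
    e => [qw(d)],
);

my $dist = bfs_distances(\%graph, 'a');
print "$_: $dist->{$_}\n" for sort keys %$dist;
```

Calling bfs_distances once per node gives all pairwise distances; nodes
that cannot be reached from the start node are simply absent from the
returned hash.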
From jason at cgt.duhs.duke.edu  Fri Feb 27 15:53:16 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Fri Feb 27 15:59:16 2004
Subject: [Bioperl-l] Re: [Gmod-schema] install problems, continued (fwd)
Message-ID: <Pine.LNX.4.50.0402271553150.24497-100000@tenero.duhs.duke.edu>


--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu

---------- Forwarded message ----------
Date: Fri, 27 Feb 2004 14:30:57 -0500 (EST)
From: Don Gilbert <gilbertd@bio.indiana.edu>
To: cain@cshl.org, p.lijnzaad@med.uu.nl
Cc: gmod-schema@lists.sourceforge.net
Subject: Re: [Gmod-schema] install problems, continued


There is a problem with bioperl-1.4 related to ontology parsing.
I'm assuming people in bioperl/ontology world know about this, but
here is what I found -- Bio/Ontology/RelationshipType.pm
has been reverted to remove some of Allen Day's necessary patches
and now it won't parse SO and some of the other standard bio ontology data.

- Don

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Found unknown type of relationship: [derived_from]
Known types are: [IS_A], [PART_OF], [CONTAINS], [FOUND_IN]
STACK: Error::throw
STACK: Bio::Root::Root::throw /bio/biodb/common/perl/lib/Bio/Root/Root.pm:328
STACK: Bio::Ontology::RelationshipType::get_instance /bio/biodb/common/perl/lib/Bio/Ontology/RelationshipType.pm:143
STACK: Bio::Ontology::SimpleGOEngine::add_relationship_type /bio/biodb/common/perl/lib/Bio/Ontology/SimpleGOEngine.pm:284
STACK: Bio::OntologyIO::dagflat::_parse_flat_file /bio/biodb/common/perl/lib/Bio/OntologyIO/dagflat.pm:556
STACK: Bio::OntologyIO::dagflat::parse /bio/biodb/common/perl/lib/Bio/OntologyIO/dagflat.pm:250
STACK: Bio::OntologyIO::dagflat::next_ontology /bio/biodb/common/perl/lib/Bio/OntologyIO/dagflat.pm:284
STACK: /bio/biodb/gmod/bin/gmod_load_ontology.pl:119
-----------------------------------------------------------
Problem loading ontology /bio/biodb/gmod/data/ontologies/song/so.ontology: 65280 at bin/

-- reinstated below comment out, leading to lots of errors
-- UC versus lc terms, and derived_from, other relations not in perl module.


cricket.% diff
  [bioperl-earlier]/Bio/Ontology/RelationshipType.pm
  [bioperl-1.4]/Bio/Ontology/RelationshipType.pm
1c1
< # $Id: RelationshipType.pm,v 1.11 2003/06/20 18:31:44 allenday Exp $
---
> # $Id: RelationshipType.pm,v 1.5.2.5 2003/09/08 12:16:19 heikki Exp $
136,145c136,141
<
< #
< #see the cell ontology.  this code is too strict, even for dag-edit files. -allen
< #
< #    if ( ! (($name eq IS_A) || ($name eq PART_OF) ||
< #         ($name eq CONTAINS) || ( $name eq FOUND_IN ))) {
< #        my $msg = "Found unknown type of relationship: [" . $name . "]\n";
< #        $msg .= "Known types are: [" . IS_A . "], [" . PART_OF . "], [" . CONTAINS . "], [" . FOUND_IN . "]";
< #        $class->throw( $msg );
< #    }
---
>     if ( ! (($name eq IS_A) || ($name eq PART_OF) ||
>           ($name eq CONTAINS) || ( $name eq FOUND_IN ))) {
>         my $msg = "Found unknown type of relationship: [" . $name . "]\n";
>         $msg .= "Known types are: [" . IS_A . "], [" . PART_OF . "], [" . CONTAINS . "], [" . FOUND_IN . "]";
>         $class->throw( $msg );
>     }
364a361,363
>


_______________________________________________
Gmod-schema mailing list
Gmod-schema@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/gmod-schema
From brian_osborne at cognia.com  Fri Feb 27 12:47:42 2004
From: brian_osborne at cognia.com (Brian Osborne)
Date: Fri Feb 27 16:00:04 2004
Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows
In-Reply-To: <6.0.0.22.2.20040225104734.01c13840@express.cites.uiuc.edu>
Message-ID: <GAEDKMGOKFBLJPKCLKCCMEOKDHAA.brian_osborne@cognia.com>

Chris,

Nigam Shah has fixed package.lst and Bioperl-1.4.ppd in DIST. Thank you for
telling us about this problem.

Brian O.

-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org]On Behalf Of Chris Fields
Sent: Wednesday, February 25, 2004 12:27 PM
To: bioperl-l@bioperl.org
Subject: [Bioperl-l] bioperl-1.4 ppm package for Windows

I was unable to get the PPM package for 1.4 working for Windows from
http://bioperl.org/DIST and had to perform a workaround.  I decided to post
it in case others were running into problems.

When I first tried installing Bioperl using PPM, it installs bioperl 1.2
first (!?!), then allows upgrading to 1.2.3.  However, it will not install
1.4 b/c of the additional dependencies (HTML-Entities and IO-Scalar).  The
latter dependencies are notably not req'd for 1.2 or 1.2.3.  IMHO, I'm
guessing that PPM can't find these modules b/c it is looking for specific
ppm packages named HTML-Entities and IO-Scalar, not for the modules named
HTML-Entities and IO-Scalar (which are included in the packages HTML-Parser
and IO-stringy).  This problem could be linked to the version of PPM I'm
using (3.1) on ActivePerl 5.8.3-809, both of which are very new, so I have
no idea if this is a problem with older versions of PPM.

The workaround was to remove the dependencies manually.  I downloaded the
relevant ppm tar file and corresponding ppd files (bioperl-1.4-ppm.tar.gz
and Bioperl-1.4.ppd, respectively) to a local directory (C:\Perl\Bioperl).
Using a text editor, I removed all references to the added dependencies and
saved the file.  More specifically, I deleted the following lines, listed
twice under Implementations (so delete both sets!):
<DEPENDENCY NAME="HTML-Entities"
VERSION="0,0,0,0" />
<DEPENDENCY NAME="IO-Scalar" VERSION="0,0,0,0"
/>

I then entered PPM, set up a local ppd repository:

rep add local_bio "C:/Perl/Bioperl"

I then searched for and installed the modified PPM file and it worked.

Like I said, I don't know if this is a PPM issue or not.  However, I think
it might be a good idea to remove those dependencies just in case, as they
are a bit redundant (both HTML-Parser and IO-stringy are already listed).

My two cents...
__________________________________


Chris Fields - Postdoctoral Researcher
Lab of Dr. Robert Switzer

Address:

University of Illinois at Urbana-Champaign
Dept. of Biochemistry - 323 RAL
600 S. Mathews Ave.
Urbana, IL 61801

Phone : (217) 333-7098
Fax : (217) 244-5858
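
The manual edit described above (deleting the HTML-Entities and
IO-Scalar DEPENDENCY entries from a local copy of Bioperl-1.4.ppd) could
also be scripted. The sketch below is an illustration only: the function
name strip_ppd_deps is hypothetical, and it uses a plain regular
expression instead of an XML parser, which assumes the entries look like
the ones quoted above (possibly wrapped across lines). It should only be
needed for an older local copy, since the files in DIST are reported
fixed.

```perl
use strict;
use warnings;

# Remove the DEPENDENCY elements for HTML-Entities and IO-Scalar from the
# text of a PPD file and return the cleaned text. [^>]* also matches
# newlines, so entries that are wrapped across lines are handled too.
sub strip_ppd_deps {
    my ($xml) = @_;
    (my $out = $xml) =~
        s{<DEPENDENCY\s+NAME="(?:HTML-Entities|IO-Scalar)"[^>]*/>\s*}{}g;
    return $out;
}

# Usage sketch (the file names are examples, not fixed paths):
#   open my $in, '<', 'Bioperl-1.4.ppd' or die "read: $!";
#   my $text = do { local $/; <$in> };
#   close $in;
#   open my $fh, '>', 'Bioperl-1.4.local.ppd' or die "write: $!";
#   print {$fh} strip_ppd_deps($text);
#   close $fh;
```

All other DEPENDENCY entries, including HTML-Parser and IO-stringy, are
left untouched.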
From cjfields at uiuc.edu  Fri Feb 27 12:05:51 2004
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri Feb 27 16:00:12 2004
Subject: [Bioperl-l] Re: Windows PPM for Bioperl 1.4
In-Reply-To: <000a01c3fd43$675de9f0$34167680@Vivek>
References: <000a01c3fd43$675de9f0$34167680@Vivek>
Message-ID: <6.0.0.22.2.20040227094821.01bfb6b8@express.cites.uiuc.edu>

An HTML attachment was scrubbed...
URL: http://portal.open-bio.org/pipermail/bioperl-l/attachments/20040227/b428f970/attachment-0001.htm
From jason at cgt.duhs.duke.edu  Fri Feb 27 15:58:16 2004
From: jason at cgt.duhs.duke.edu (Jason Stajich)
Date: Fri Feb 27 16:04:15 2004
Subject: [Bioperl-l] Re: [Gmod-schema] install problems, continued (fwd)
In-Reply-To: <Pine.LNX.4.50.0402271553150.24497-100000@tenero.duhs.duke.edu>
References: <Pine.LNX.4.50.0402271553150.24497-100000@tenero.duhs.duke.edu>
Message-ID: <Pine.LNX.4.50.0402271557270.24497-100000@tenero.duhs.duke.edu>

Ignore this, sorry -- I wasn't thinking.  As Hilmar replied on the gmod-schema
list, this was an improper comparison with the bioperl 1.2.x branch, not
bioperl 1.4.

--jason
On Fri, 27 Feb 2004, Jason Stajich wrote:

>
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
>
> ---------- Forwarded message ----------
> Date: Fri, 27 Feb 2004 14:30:57 -0500 (EST)
> From: Don Gilbert <gilbertd@bio.indiana.edu>
> To: cain@cshl.org, p.lijnzaad@med.uu.nl
> Cc: gmod-schema@lists.sourceforge.net
> Subject: Re: [Gmod-schema] install problems, continued
>
>
> There is a problem with bioperl-1.4 related to ontology parsing.
> I'm assuming people in bioperl/ontology world know about this, but
> here is what I found -- Bio/Ontology/RelationshipType.pm
> has been reverted to remove some of Allen Day's necessary patches
> and now it won't parse SO and some of the other standard bio ontology data.
>
> - Don
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: Found unknown type of relationship: [derived_from]
> Known types are: [IS_A], [PART_OF], [CONTAINS], [FOUND_IN]
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /bio/biodb/common/perl/lib/Bio/Root/Root.pm:328
> STACK: Bio::Ontology::RelationshipType::get_instance /bio/biodb/common/perl/lib/Bio/Ontology/RelationshipType.pm:143
> STACK: Bio::Ontology::SimpleGOEngine::add_relationship_type /bio/biodb/common/perl/lib/Bio/Ontology/SimpleGOEngine.pm:284
> STACK: Bio::OntologyIO::dagflat::_parse_flat_file /bio/biodb/common/perl/lib/Bio/OntologyIO/dagflat.pm:556
> STACK: Bio::OntologyIO::dagflat::parse /bio/biodb/common/perl/lib/Bio/OntologyIO/dagflat.pm:250
> STACK: Bio::OntologyIO::dagflat::next_ontology /bio/biodb/common/perl/lib/Bio/OntologyIO/dagflat.pm:284
> STACK: /bio/biodb/gmod/bin/gmod_load_ontology.pl:119
> -----------------------------------------------------------
> Problem loading ontology /bio/biodb/gmod/data/ontologies/song/so.ontology: 65280 at bin/
>
> -- The commented-out check below has been reinstated, leading to lots of errors:
> -- UC versus lc terms, and derived_from and other relations not in the Perl module.
>
>
> cricket.% diff
>   [bioperl-earlier]/Bio/Ontology/RelationshipType.pm
>   [bioperl-1.4]/Bio/Ontology/RelationshipType.pm
> 1c1
> < # $Id: RelationshipType.pm,v 1.11 2003/06/20 18:31:44 allenday Exp $
> ---
> > # $Id: RelationshipType.pm,v 1.5.2.5 2003/09/08 12:16:19 heikki Exp $
> 136,145c136,141
> <
> < #
> < #see the cell ontology.  this code is too strict, even for dag-edit files. -allen
> < #
> < #    if ( ! (($name eq IS_A) || ($name eq PART_OF) ||
> < #         ($name eq CONTAINS) || ( $name eq FOUND_IN ))) {
> < #        my $msg = "Found unknown type of relationship: [" . $name . "]\n";
> < #        $msg .= "Known types are: [" . IS_A . "], [" . PART_OF . "], [" . CONTAINS . "], [" . FOUND_IN . "]";
> < #        $class->throw( $msg );
> < #    }
> ---
> >     if ( ! (($name eq IS_A) || ($name eq PART_OF) ||
> >           ($name eq CONTAINS) || ( $name eq FOUND_IN ))) {
> >         my $msg = "Found unknown type of relationship: [" . $name . "]\n";
> >         $msg .= "Known types are: [" . IS_A . "], [" . PART_OF . "], [" . CONTAINS . "], [" . FOUND_IN . "]";
> >         $class->throw( $msg );
> >     }
> 364a361,363
> >
>
>
> _______________________________________________
> Gmod-schema mailing list
> Gmod-schema@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/gmod-schema
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
From natg at shore.net  Fri Feb 27 16:31:05 2004
From: natg at shore.net (Nathan (Nat) Goodman)
Date: Fri Feb 27 16:37:03 2004
Subject: [Bioperl-l] OT: Breadth-First Search Algorithm - BFS
Message-ID: <001101c3fd79$01acae30$de02000a@systemsbiology.net>

Graph::Base is seriously broken.  I urge anyone who's using it to check
the bug list at 
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Graph. 

The developer, Jarkko Hietaniemi, is well aware of the problems and is
working on a rewrite.  In the meantime, I have a very simple graph
package that I'm happy to share.  It handles undirected, unlabelled
graphs only.  It provides depth and breadth first search, all pairs
shortest path, enumeration of all paths in a graph, as well as the
basics.

Best,
Nat
----------

Jose Mª Glez Izarzugaza wrote:
>> I'm working with a graph and I need to calculate the values of C and
>> L, to do so, I need an algorithm to calculate the distance to the other
>> elements....
>> A good one is BFS algorithm.

Jeremy replied:
> One solution is to use the Graph::Base module on CPAN:
> 
> http://search.cpan.org/~jhi/Graph-0.20101/lib/Graph/Base.pm
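
For the distance calculation asked about above, a breadth-first search over
an undirected, unlabelled graph can be sketched in plain Perl without
Graph::Base. This is only an illustration and not the package offered in
this message; the adjacency-list layout and the name bfs_distances are
assumptions made for the example.

```perl
use strict;
use warnings;

# Undirected, unlabelled graph stored as an adjacency list:
# each node maps to an array ref of its neighbours.
sub bfs_distances {
    my ($graph, $start) = @_;
    my %dist  = ($start => 0);
    my @queue = ($start);
    while (@queue) {
        my $node = shift @queue;
        for my $next (@{ $graph->{$node} || [] }) {
            next if exists $dist{$next};   # already reached by a shorter or equal path
            $dist{$next} = $dist{$node} + 1;
            push @queue, $next;
        }
    }
    return \%dist;   # nodes unreachable from $start are simply absent
}

# Small example graph; each edge is listed in both directions.
my %graph = (
    a => [qw(b c)],
    b => [qw(a d)],
    c => [qw(a d)],
    d => [qw(b c e)],
    e => [qw(d)],
    f => [],
);
my $dist = bfs_distances(\%graph, 'a');
print "$_: $dist->{$_}\n" for sort keys %$dist;
```

Running the search once from every node gives all pairwise distances, which
is usually enough for small graphs.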


From Steven.Roels at mpi.com  Fri Feb 27 16:59:01 2004
From: Steven.Roels at mpi.com (Roels, Steven)
Date: Fri Feb 27 17:05:54 2004
Subject: [Bioperl-l] OT: Breadth-First Search Algorithm - BFS
Message-ID: <C50012B03D77BC4CBBE8FE39E0B3B31C56B52B@US-VS1.corp.mpi.com>


Nat,

Thanks for the heads-up.

Anyone know off-hand how, if at all, these bugs impact
Bio::Ontology::SimpleGOEngine?

Thanks,

-Steve

*****************************************************************
Steve Roels, Ph.D.                       
Senior Scientist I - Computational Biology     Phone: (617) 761-6820
Millennium Pharmaceuticals, Inc.         FAX:   (617) 577-3555
640 Memorial Drive                       Email: roels@mpi.com
Cambridge, MA 02139-4853
*****************************************************************


From hlapp at gnf.org  Fri Feb 27 19:45:13 2004
From: hlapp at gnf.org (Hilmar Lapp)
Date: Fri Feb 27 19:51:13 2004
Subject: [Bioperl-l] Re: Bio ::seqIO ::tigr
In-Reply-To: <94AAA922-6883-11D8-ABEB-000A95BBDAD2@cs.ucr.edu>
Message-ID: <5E8C17E6-6987-11D8-B6DE-000A959EB4C4@gnf.org>


On Thursday, February 26, 2004, at 09:45  AM, Josh Lauricha wrote:

> Does the source_term_id refer to the source_tag()?

Yes.

>
> On Feb 26, 2004, at 9:08 AM, matthieu CONTE wrote:
>
>> [conte@bearn biosql]$ perl load_seqdatabase.pl --dbuser biosql 
>> --dbpass biosql --namespace orysa_tigr --format tigr 
>> /home/conte/pipeline_orthologues/data/orysa_tigr/chr07.xml
>> Loading /home/conte/pipeline_orthologues/data/orysa_tigr/chr07.xml ...
>>
>> -------------------- WARNING ---------------------
>> MSG: insert in Bio::DB::BioSQL::SeqFeatureAdaptor (driver) failed, 
>> values were ("","1") FKs (26216,37,<NULL>)
>> Column 'source_term_id' cannot be null
>> ---------------------------------------------------
>>

What this means is that there was no $feat->source_tag set, and looking 
up undef resulted in undef for the foreign key :-)

While bioperl doesn't enforce it, for biosql the source_tag() as well 
as the primary_tag() of a feature are mandatory and cannot be undef. 
Most SeqIO parsers set source_tag() to a static default if there is no 
value, e.g. 'EMBL/GenBank/SwissProt' if you're using FTHelper.pm.
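
One way to guard against this before loading is to fill in a placeholder for
any feature that lacks either tag. The following is a minimal sketch, not
what load_seqdatabase.pl or any parser actually does: the helper name
ensure_tags, the placeholder strings and the Demo::Feature stand-in class are
all made up so that the example runs without BioPerl. It only assumes get/set
accessors named source_tag() and primary_tag(), as Bio::SeqFeature::Generic
provides.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sequence feature, so the sketch runs without
# BioPerl installed.  It is not a BioPerl class.
package Demo::Feature;
sub new { my ($class, %args) = @_; return bless { %args }, $class }
sub source_tag  { my $self = shift; $self->{source_tag}  = shift if @_; return $self->{source_tag} }
sub primary_tag { my $self = shift; $self->{primary_tag} = shift if @_; return $self->{primary_tag} }

package main;

# Set a default on every feature whose source_tag() or primary_tag() is
# undef or empty; features that already carry a value are left untouched.
sub ensure_tags {
    my ($default_source, $default_primary, @feats) = @_;
    for my $feat (@feats) {
        my $source  = $feat->source_tag;
        my $primary = $feat->primary_tag;
        $feat->source_tag($default_source)
            unless defined $source && length $source;
        $feat->primary_tag($default_primary)
            unless defined $primary && length $primary;
    }
    return @feats;
}

my $feat = Demo::Feature->new(primary_tag => 'gene');   # no source_tag set
ensure_tags('unknown_source', 'unknown_feature', $feat);
printf "%s / %s\n", $feat->source_tag, $feat->primary_tag;
```

With real BioPerl objects the same helper would be called on the features of
each sequence before it is stored, with a placeholder that names the actual
data source.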

	-hilmar
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------


From hlapp at gnf.org  Fri Feb 27 19:59:59 2004
From: hlapp at gnf.org (Hilmar Lapp)
Date: Fri Feb 27 20:05:59 2004
Subject: [Bioperl-l] OT: Breadth-First Search Algorithm - BFS
In-Reply-To: <C50012B03D77BC4CBBE8FE39E0B3B31C56B52B@US-VS1.corp.mpi.com>
Message-ID: <6EA1F8A2-6989-11D8-B6DE-000A959EB4C4@gnf.org>

Well, that depends on what you want to do with the ontology (or its 
engine, respectively). If all that you want is the methods defined in 
Bio::Ontology::OntologyI, then there is no effect. If you want to 
obtain $ontology->engine->graph() and then run those algorithms that 
are buggy, well then the bugs will have an effect obviously ...

Does this answer your question?

	-hilmar

On Friday, February 27, 2004, at 01:59  PM, Roels, Steven wrote:

>
> Nat,
>
> Thanks for the heads-up.
>
> Anyone know off-hand how, if at all, these bugs impact
> Bio::Ontology::SimpleGOEngine?
>
> Thanks,
>
> -Steve
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------