From Bank.Beszteri at awi.de  Tue Apr  1 08:31:49 2008
From: Bank.Beszteri at awi.de (Bánk Beszteri)
Date: Tue, 01 Apr 2008 14:31:49 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
Message-ID: <47F22B35.1030502@awi.de>

Dear list,

we have recently started to try to find a solution for indexing large 
sequence databases / flat files for a java project, and because we ran 
into problems using biojava, and because both the OBDA and BioSQL ways 
seem to be compatible across bio~ projects, we also started to 
experiment with bioperl. It looks like this should work fine, but we had 
a couple of problems here, too. Perhaps some of you can give me a hint 
as to what we are doing wrong!

The first thing we tried was to use Bio::DB::Flat for indexing a TrEMBL 
flat file (~ 12 GB); but it seems we haven't got a machine with enough 
memory to be able to handle this. (Perhaps you would be using the "bdb" 
style index in such a case in bioperl, but this apparently doesn't work 
with biojava, so we had to stick with "flat"). So next we started to 
test BioSQL, by trying to load just Swissprot in a MySQL DB first, like:

load_seqdatabase.pl --host mysql.awi.de --dbname biosql2 --dbuser xyz 
--dbpass abc --driver mysql --namespace uniprot_sprot --format swiss 
uniprot_sprot.dat

Here we get an error message

###########################################

Loading /biodb/spinkern/uniprot_sprot.dat ...
Could not store Q6DAH5:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: The supplied lineage does not start near 'Erwinia carotovora subsp. 
atroseptica' (I was supplied 'Erwinia carotovora subsp. | Pectobacterium 
| Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | 
Proteobacteria | Bacteria')
STACK: Error::throw
STACK: Bio::Root::Root::throw 
/biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Root/Root.pm:359
STACK: Bio::Species::classification 
/biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Species.pm:174
STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:552 

STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/SpeciesAdaptor.pm:281
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1305 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:973 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:852 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182 

STACK: Bio::DB::Persistent::PersistentObject::create 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:244 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251 

STACK: Bio::DB::Persistent::PersistentObject::store 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:271 

STACK: load_seqdatabase.pl:622
-----------------------------------------------------------

at load_seqdatabase.pl line 635

############################################

or similar, depending on whether we use a pre-loaded NCBI taxonomy or 
not, and which Swissprot release we are trying to load. It often seems 
to come from something like the subsp. here, or another special addition 
to the species line; but alternative genus names and other curious 
things also appear. It looks like Species.pm tries to validate the 
species name against the lineage info already there in the BioSQL DB, 
and in several cases, it finds inconsistencies. If we start with the 
NCBI taxonomy already loaded in the database, the first error comes much 
earlier.

I found a thread on the same problem from ~ two years ago 
(http://thread.gmane.org/gmane.comp.lang.perl.bio.general/13766/focus=13788), 
where the solution recommended was to update bioperl, so I was quite 
surprised to find the problem with the version you can see above 
(1.5.2_102 bioperl core, 1.5.2_100 bioperl_db). Can someone give me any 
hints as to what is going wrong here?

The only workaround we have found so far was to comment out line 174 in 
Species.pm:

$self->throw("The supplied lineage does not start near '$name' (I was 
supplied '".join(" | ", @vals)."')");

After doing so, load_seqdatabase.pl runs for several hours (until it 
eventually crashes; I haven't found out why yet), but proceeds really 
slowly. I also found some info on this for Pg and Oracle in the mailing 
list, but does anyone have approximate numbers for MySQL? How long 
should a first Swissprot load take?

Would be grateful to hear about your ideas / experiences on these issues!

Bank Beszteri


Bioinformatics / Scientific Computing
Alfred Wegener Institute for Polar and Marine Research
Am Handelshafen 12.
27570 Bremerhaven
Germany


From cjfields at uiuc.edu  Tue Apr  1 20:45:28 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 1 Apr 2008 19:45:28 -0500
Subject: [Bioperl-l] quick update on bioperl nightly builds
Message-ID: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>

I'm simplifying the nightly build archive names (removing svn revision  
# and date) in case anyone needs to update bioperl-live/run/db/network  
on a regular basis (read: GBrowse installations).  When I have time  
I'll start working on automated builds, which will require some extra  
work with Module::Build and Build.PL.

chris

From hiekeen at gmail.com  Tue Apr  1 22:14:07 2008
From: hiekeen at gmail.com (Jinyan Huang)
Date: Wed, 2 Apr 2008 10:14:07 +0800
Subject: [Bioperl-l] How to make a network graphic using my genes in
	pathways?
Message-ID: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>

I have 20 pathways. My interesting genes are in these pathways. There
are some genes overlaps in these pathways. How can I make a graphic
network using these genes? It means connecting these pathways through
these overlap genes. What kind of software can I use?

Thank you very much in advance.

-- 
Best regards,
Jinyan Huang (ekeen)
School of Life Sciences and Technology, 1302 Room
Tongji University
Siping Road 1239, Shanghai 200092
P.R. China
Tel :0086-21-65981041
Msn: hiekeen at hotmail.com
eMail: hiekeen at gmail.com

From hlapp at gmx.net  Tue Apr  1 22:30:06 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 1 Apr 2008 22:30:06 -0400
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47F22B35.1030502@awi.de>
References: <47F22B35.1030502@awi.de>
Message-ID: <CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>


On Apr 1, 2008, at 8:31 AM, Bánk Beszteri wrote:
> [...] So next we started to test BioSQL, by trying to load just  
> Swissprot in a MySQL DB first, like:
>
> load_seqdatabase.pl --host mysql.awi.de --dbname biosql2 --dbuser  
> xyz --dbpass abc --driver mysql --namespace uniprot_sprot --format  
> swiss uniprot_sprot.dat
>
> Here we get an error message
>
> ###########################################
>
> Loading /biodb/spinkern/uniprot_sprot.dat ...
> Could not store Q6DAH5:
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: The supplied lineage does not start near 'Erwinia carotovora  
> subsp. atroseptica' (I was supplied 'Erwinia carotovora subsp. |  
> Pectobacterium | Enterobacteriaceae | Enterobacteriales |  
> Gammaproteobacteria | Proteobacteria | Bacteria')
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /biodb/spinkern/bioperl-1.5/ 
> bioperl-1.5.2_102/Bio/Root/Root.pm:359
> STACK: Bio::Species::classification /biodb/spinkern/bioperl-1.5/ 
> bioperl-1.5.2_102/Bio/Species.pm:174
> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm: 
> 552
> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/SpeciesAdaptor.pm:281
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object / 
> biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:1305
> STACK:  
> Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:973
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key / 
> biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:852
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:182
> STACK: Bio::DB::Persistent::PersistentObject::create /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm: 
> 244
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:169
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:251
> STACK: Bio::DB::Persistent::PersistentObject::store /biodb/spinkern/ 
> bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:271
> STACK: load_seqdatabase.pl:622
> -----------------------------------------------------------
>
> at load_seqdatabase.pl line 635
>
> ############################################
>
> or similar, depending on whether we use a pre-loaded ncbi taxonomy  
> or not

I recommend always using a pre-loaded NCBI taxonomy unless you know  
there are only a few organisms that are straightforward (for the  
parser, that is).

> , and which Swissprot release we are trying to load. It often seems  
> to come from something like the subsp. here, or another special  
> addition to the species line; but alternative genus names and other  
> curious things also appear. It looks like Species.pm tries to  
> validate the species name against the lineage info already there in  
> the BioSQL DB, and in several cases, it finds inconsistencies.

It actually happens upon a successful lookup when the species object  
is populated from the database.

> [...]
> The only workaround we have found so far was to comment out line  
> 174 in Species.pm:
>
> $self->throw("The supplied lineage does not start near '$name' (I  
> was supplied '".join(" | ", @vals)."')");

That should be OK if you work with a pre-loaded taxonomy. It's sort  
of a sanity check that should catch a parser having messed up a  
species. If you use a pre-loaded NCBI taxonomy the results of the  
species parsing don't matter in all details so long as the NCBI  
taxonID is parsed out correctly, and then found in the database.

Note that this is actually a warn() in the main trunk version of  
BioPerl, so you might want to upgrade to that (or change throw() to  
warn() in your version). You still get the records flagged with that,  
but it isn't an exception.
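
A minimal sketch of the throw()-to-warn() change on a local copy of Species.pm follows. It assumes that the call and its message text sit on a single source line; the function name patch_lineage_check and the .orig backup suffix are made up for this example and are not part of BioPerl.

```shell
# Hypothetical helper (not part of BioPerl): turn the lineage sanity check
# in a local Species.pm from a fatal throw() into a warn().
# Assumes the call and its message text sit on the same source line.
patch_lineage_check() {
    file=$1
    # Keep a backup so the change is easy to undo.
    cp "$file" "$file.orig" || return 1
    # Only the line carrying this message is touched; other throw() calls stay.
    sed '/The supplied lineage does not start near/s/->throw(/->warn(/' \
        "$file.orig" > "$file"
}

# Example call (path is a placeholder):
#   patch_lineage_check /path/to/Bio/Species.pm
```

Restoring the .orig copy undoes the change, which is handy when upgrading BioPerl later.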

>
> After doing so, load_seqdatabase.pl runs for several hours (until  
> it eventually crashes; I haven't found out yet why), but proceeds  
> really slowly.

It should certainly *not* crash. Note also that you can supply --safe  
on the command line, in which case the script will continue with the  
next record if one fails to load for whatever reason.

You will want to adjust the width constraint of dbxref.accession, for  
example to 128 chars. This will also be fixed for BioSQL 1.0.1.
See http://bugzilla.open-bio.org/show_bug.cgi?id=2474


> I also found some info on this for Pg and Oracle in the mailing  
> list, but has anyone some approximate numbers for MySQL, how long  
> should a first Swissprot load take?

Possibly around 20 hours according to Erik Rijkers:
See http://lists.open-bio.org/pipermail/bioperl-l/2008-March/027427.html

You can use the --logchunks N option to have it print out performance  
statistics every N records.
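
Put together, the load could be invoked roughly as sketched below. The connection values are the placeholders from the original post, the chunk size is an arbitrary choice, and both functions only print text (the command line and an ALTER statement) rather than running anything. The column attributes in the ALTER statement are an assumption; compare them with the output of SHOW CREATE TABLE dbxref and keep whatever other attributes the existing column has.

```shell
# Print (not run) the loader command with --safe and --logchunks added.
# Host, database, user and password are the placeholders from the post.
print_load_cmd() {
    datafile=$1
    chunk=${2:-1000}    # arbitrary default: statistics every 1000 records
    printf '%s ' load_seqdatabase.pl \
        --host mysql.awi.de --dbname biosql2 \
        --dbuser xyz --dbpass abc --driver mysql \
        --namespace uniprot_sprot --format swiss \
        --safe --logchunks "$chunk"
    printf '%s\n' "$datafile"
}

# Print SQL that widens dbxref.accession to 128 characters (MySQL syntax).
# NOT NULL is assumed here; the real column may carry further attributes.
print_widen_sql() {
    printf '%s\n' 'ALTER TABLE dbxref MODIFY accession VARCHAR(128) NOT NULL;'
}

print_load_cmd uniprot_sprot.dat 5000
print_widen_sql
```

Printing first makes it easy to review the exact command before committing to a load that may run for many hours.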

Hope this helps,

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Tue Apr  1 22:38:12 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 1 Apr 2008 22:38:12 -0400
Subject: [Bioperl-l] Very basic implementation of GenBank XML SeqIO
	module
In-Reply-To: <47F13C2C.4070909@umdnj.edu>
References: <47F13C2C.4070909@umdnj.edu>
Message-ID: <DBDEDED2-656B-4CFD-B603-C0868ED5DAD9@gmx.net>

Ryan - do you not have a committer account?

I do agree with Chris on the test. Modules w/o tests tend to become  
'pseudogenized.'

	-hilmar

On Mar 31, 2008, at 3:31 PM, Ryan Golhar wrote:
> I have a (very) basic SAX implementation of a SeqIO module to parse  
> GenBank XML records.  Right now, it only reads in basic information  
> regarding the sequence and the sequence itself.
>
> It does not yet parse the features table.  Should I submit it to be  
> included in bioperl or wait until I implement more for the features  
> table?  I'm not sure when I'll get around to it though
>
> Ryan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cain.cshl at gmail.com  Tue Apr  1 23:12:04 2008
From: cain.cshl at gmail.com (Scott Cain)
Date: Tue, 01 Apr 2008 23:12:04 -0400
Subject: [Bioperl-l] quick update on bioperl nightly builds
In-Reply-To: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
Message-ID: <1207105924.6184.4.camel@frissell>

Hi Chris,

The tarball is currently (Apr 1) being built in a tmp directory, so that
the extracted tarball is ./tmp/bioperl-live/.  Is that intended?

Thanks,
Scott

On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
> I'm simplifying the nightly build archive names (removing svn revision  
> # and date) in case anyone needs to update bioperl-live/run/db/network  
> on a regular basis (read: GBrowse installations).  When I have time  
> I'll start working on automated builds, which will require some extra  
> work with Module::Build and Build.PL.
> 
> chris
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cjfields at uiuc.edu  Tue Apr  1 23:59:30 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 1 Apr 2008 22:59:30 -0500
Subject: [Bioperl-l] quick update on bioperl nightly builds
In-Reply-To: <1207105924.6184.4.camel@frissell>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
Message-ID: <D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>

Nope, that isn't intended.  I fixed it and reran it manually, so it  
should be fine now (note I didn't update the log file; the next cron  
run will catch that).

I may toy around with your recent passthrough flag addition to try  
getting automated PPM's up and running.

chris

On Apr 1, 2008, at 10:12 PM, Scott Cain wrote:

> Hi Chris,
>
> The tarball is currently (Apr 1) being built in a tmp directory, so  
> that
> the extracted tarball is ./tmp/bioperl-live/.  Is that intended?
>
> Thanks,
> Scott
>
> On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
>> I'm simplifying the nightly build archive names (removing svn  
>> revision
>> # and date) in case anyone needs to update bioperl-live/run/db/ 
>> network
>> on a regular basis (read: GBrowse installations).  When I have time
>> I'll start working on automated builds, which will require some extra
>> work with Module::Build and Build.PL.
>>
>> chris
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                         cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                      
> 216-392-3087
> Cold Spring Harbor Laboratory
>


From sdavis2 at mail.nih.gov  Wed Apr  2 07:33:38 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 2 Apr 2008 07:33:38 -0400
Subject: [Bioperl-l] How to make a network graphic using my genes in
	pathways?
In-Reply-To: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
Message-ID: <264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>

On Tue, Apr 1, 2008 at 10:14 PM, Jinyan Huang <hiekeen at gmail.com> wrote:
> I have 20 pathways. My interesting genes are in these pathways. There
>  are some genes overlaps in these pathways. How can I make a graphic
>  network using these genes? It means connecting these pathways through
>  these overlap genes. What kind of software can I use?

R/Bioconductor has tools for working with graphs and pathways.
Cytoscape is another open-source graphical solution.  Ingenuity is, of
course, not free.  If you are looking at a perl solution, you can look
at the various graph modules and their integration with the Graphviz
libraries.
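
As a rough sketch of the Graphviz route: assuming the pathway memberships are available as a tab-separated file with one pathway/gene pair per line (a format invented here for illustration), the following writes a Graphviz graph that keeps only genes occurring in more than one pathway, so that pathways end up linked through their shared genes. The function name is hypothetical.

```shell
# Convert "pathway<TAB>gene" lines into an undirected Graphviz graph.
# Only genes that belong to at least two pathways are kept.
pathways_to_dot() {
    awk -F '\t' '
        # Skip malformed lines and exact duplicate pairs.
        NF < 2 || seen[$1 FS $2]++ { next }
        { count[$2]++; pw[++m] = $1; gene[m] = $2 }
        END {
            print "graph pathways {"
            for (i = 1; i <= m; i++)
                if (count[gene[i]] > 1)      # gene shared by >= 2 pathways
                    printf "  \"%s\" -- \"%s\";\n", pw[i], gene[i]
            print "}"
        }' "$1"
}
```

With Graphviz installed, something like `pathways_to_dot memberships.tsv > pathways.dot` followed by `dot -Tpng pathways.dot -o pathways.png` should render the network; names containing double quotes would need escaping first.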

Sean

From cain.cshl at gmail.com  Wed Apr  2 08:28:22 2008
From: cain.cshl at gmail.com (Scott Cain)
Date: Wed, 02 Apr 2008 08:28:22 -0400
Subject: [Bioperl-l] [Gmod-gbrowse] quick update on bioperl
	nightly	builds
In-Reply-To: <D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
	<D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
Message-ID: <1207139302.6507.7.camel@frissell>

Hi Chris,

(trimmed out gbrowse mailing list since this is just bioperl business)

Speaking of the pass through stuff, Sendu mentioned that I stomped on
some changes to Build.PL that you and he did when I committed that
change, so it should be rolled back.  Is there a good (svn) way to do
that?  Or should I just copy the contents of the old (good) Build.PL
into a fresh file in my checkout and commit it?

Thanks,
Scott

On Tue, 2008-04-01 at 22:59 -0500, Chris Fields wrote:
> Nope, that isn't intended.  I fixed it and reran it manually, so it  
> should be fine now (note I didn't update the log file; the next cron  
> run will catch that).
> 
> I may toy around with your recent passthrough flag addition to try  
> getting automated PPM's up and running.
> 
> chris
> 
> On Apr 1, 2008, at 10:12 PM, Scott Cain wrote:
> 
> > Hi Chris,
> >
> > The tarball is currently (Apr 1) being built in a tmp directory, so  
> > that
> > the extracted tarball is ./tmp/bioperl-live/.  Is that intended?
> >
> > Thanks,
> > Scott
> >
> > On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
> >> I'm simplifying the nightly build archive names (removing svn  
> >> revision
> >> # and date) in case anyone needs to update bioperl-live/run/db/ 
> >> network
> >> on a regular basis (read: GBrowse installations).  When I have time
> >> I'll start working on automated builds, which will require some extra
> >> work with Module::Build and Build.PL.
> >>
> >> chris
> > -- 
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                         cain at cshl.edu
> > GMOD Coordinator (http://www.gmod.org/)                      
> > 216-392-3087
> > Cold Spring Harbor Laboratory
> >
> 
> 
> _______________________________________________
> Gmod-gbrowse mailing list
> Gmod-gbrowse at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From robert.citek at gmail.com  Wed Apr  2 08:24:06 2008
From: robert.citek at gmail.com (Robert Citek)
Date: Wed, 2 Apr 2008 07:24:06 -0500
Subject: [Bioperl-l] module for pubchem queries
Message-ID: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>

Hello all,

I have a list of chemical compounds that have some kind of interaction
with proteins or genes.  The current list contains names or SMILES and
I would like to get the CID number for those compounds.  Currently,
I'm using perl to query the NCBI's eutils[1], which works great.  But
I was just curious to know if there was a bioperl module to do
something similar.  A quick google didn't turn up anything, so I
thought I'd ask.

[1] http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
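
For reference, the kind of eutils query meant here can be sketched as follows. The example only builds an ESearch URL for a compound name against the Entrez pccompound database and does not contact NCBI. The function names are made up for the example, and lookups by SMILES may need a different PubChem service than ESearch.

```shell
# Percent-encode a string for use in a URL query (ASCII input assumed).
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}            # everything after the first character
        c=${s%"$rest"}         # the first character itself
        case $c in
            [A-Za-z0-9._~-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}

# Build an ESearch URL for a compound name in the pccompound database.
esearch_url() {
    printf '%s?db=pccompound&term=%s\n' \
        'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi' \
        "$(urlencode "$1")"
}

esearch_url 'acetylsalicylic acid'
```

The printed URL can be fetched with any HTTP client; for this database the ids in the returned XML should correspond to CIDs.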

Regards,
- Robert

From David.Messina at sbc.su.se  Wed Apr  2 08:41:45 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 2 Apr 2008 14:41:45 +0200
Subject: [Bioperl-l] How to make a network graphic using my genes in
	pathways?
In-Reply-To: <264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
Message-ID: <628aabb70804020541v6cee4584ibd9935290ae7cc0a@mail.gmail.com>

I have no personal experience with it, but a colleague of mine suggested
VisANT <http://visant.bu.edu/>.


Dave

From cjfields at uiuc.edu  Wed Apr  2 11:03:32 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 2 Apr 2008 10:03:32 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] quick update on bioperl nightly
	builds
In-Reply-To: <1207139302.6507.7.camel@frissell>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
	<D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
	<1207139302.6507.7.camel@frissell>
Message-ID: <3B490712-3413-4662-99D7-7B115CECB6E1@uiuc.edu>

The changes I made were related to problems checking MySQL for  
Bio::DB::SeqFeature::Store tests when connectivity requires username/ 
password.  For some reason it tests DB connectivity up front, while  
Bio::DB::GFF assumes the DB setup is correct (no direct DB check) then  
runs tests assuming the setup is correct.

You can view the diffs for your commits here:

http://code.open-bio.org/svnweb/index.cgi/bioperl/diff/bioperl-live/trunk/ModuleBuildBioperl.pm?revs=14604&revs=14548

http://code.open-bio.org/svnweb/index.cgi/bioperl/diff/bioperl-live/trunk/Build.PL?revs=14604&revs=14565

I'll try working on merging them together today; it shouldn't be too  
hard (the changes were fairly minor in both Build.PL and  
Module::Build).  I'll test to make sure your changes stay in as well.   
Down the road I believe we need to rethink how we want the Build  
process to run using Module::Build as it's a bit convoluted, but it  
works for now.

chris

On Apr 2, 2008, at 7:28 AM, Scott Cain wrote:
> Hi Chris,
>
> (trimmed out gbrowse mailing list since this is just bioperl business)
>
> Speaking of the pass through stuff, Sendu mentioned that I stomped on
> some changes to Build.PL that you and he did when I committed that
> change, so it should be rolled back.  Is there a good (svn) way to do
> that?  Or should I just copy the contents of the old (good) Build.PL
> into a fresh file in my checkout and commit it?
>
> Thanks,
> Scott
>
> On Tue, 2008-04-01 at 22:59 -0500, Chris Fields wrote:
>> Nope, that isn't intended.  I fixed it and reran it manually, so it
>> should be fine now (note I didn't update the log file; the next cron
>> run will catch that).
>>
>> I may toy around with your recent passthrough flag addition to try
>> getting automated PPM's up and running.
>>
>> chris
>>
>> On Apr 1, 2008, at 10:12 PM, Scott Cain wrote:
>>
>>> Hi Chris,
>>>
>>> The tarball is currently (Apr 1) being built in a tmp directory, so
>>> that
>>> the extracted tarball is ./tmp/bioperl-live/.  Is that intended?
>>>
>>> Thanks,
>>> Scott
>>>
>>> On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
>>>> I'm simplifying the nightly build archive names (removing svn
>>>> revision
>>>> # and date) in case anyone needs to update bioperl-live/run/db/
>>>> network
>>>> on a regular basis (read: GBrowse installations).  When I have time
>>>> I'll start working on automated builds, which will require some  
>>>> extra
>>>> work with Module::Build and Build.PL.
>>>>
>>>> chris
>>> -- 
>>> ------------------------------------------------------------------------
>>> Scott Cain, Ph. D.                                         cain at cshl.edu
>>> GMOD Coordinator (http://www.gmod.org/)
>>> 216-392-3087
>>> Cold Spring Harbor Laboratory
>>>
>>
>>
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/)                      
> 216-392-3087
> Cold Spring Harbor Laboratory
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Wed Apr  2 11:54:05 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 2 Apr 2008 10:54:05 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] quick update on bioperl nightly
	builds
In-Reply-To: <3B490712-3413-4662-99D7-7B115CECB6E1@uiuc.edu>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
	<D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
	<1207139302.6507.7.camel@frissell>
	<3B490712-3413-4662-99D7-7B115CECB6E1@uiuc.edu>
Message-ID: <71375DA3-A751-4908-8000-D9ACAE39B19C@uiuc.edu>

Okay, committed them.  The accept passthrough still appears to work;  
let me know if anything pops up.

chris

On Apr 2, 2008, at 10:03 AM, Chris Fields wrote:

> ...
> I'll try working on merging them together today; it shouldn't be too  
> hard (the changes were fairly minor in both Build.PL and  
> Module::Build).  I'll test to make sure your changes stay in as  
> well.  Down the road I believe we need to rethink how we want the  
> Build process to run using Module::Build as it's a bit convoluted,  
> but it works for now.
>
> chris
>
> On Apr 2, 2008, at 7:28 AM, Scott Cain wrote:
>> Hi Chris,
>>
>> (trimmed out gbrowse mailing list since this is just bioperl  
>> business)
>>
>> Speaking of the pass through stuff, Sendu mentioned that I stomped on
>> some changes to Build.PL that you and he did when I committed that
>> change, so it should be rolled back.  Is there a good (svn) way to do
>> that?  Or should I just copy the contents of the old (good) Build.PL
>> into a fresh file in my checkout and commit it?
>>
>> Thanks,
>> Scott
>>
>> On Tue, 2008-04-01 at 22:59 -0500, Chris Fields wrote:
>>> Nope, that isn't intended.  I fixed it and reran it manually, so it
>>> should be fine now (note I didn't update the log file; the next cron
>>> run will catch that).
>>>
>>> I may toy around with your recent passthrough flag addition to try
>>> getting automated PPM's up and running.
>>>
>>> chris
>>>
>>> On Apr 1, 2008, at 10:12 PM, Scott Cain wrote:
>>>
>>>> Hi Chris,
>>>>
>>>> The tarball is currently (Apr 1) being built in a tmp directory, so
>>>> that
>>>> the extracted tarball is ./tmp/bioperl-live/.  Is that intended?
>>>>
>>>> Thanks,
>>>> Scott
>>>>
>>>> On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
>>>>> I'm simplifying the nightly build archive names (removing svn
>>>>> revision
>>>>> # and date) in case anyone needs to update bioperl-live/run/db/
>>>>> network
>>>>> on a regular basis (read: GBrowse installations).  When I have  
>>>>> time
>>>>> I'll start working on automated builds, which will require some  
>>>>> extra
>>>>> work with Module::Build and Build.PL.
>>>>>
>>>>> chris
>>>> -- 
>>>> ------------------------------------------------------------------------
>>>> Scott Cain, Ph. D.                                         cain at cshl.edu
>>>> GMOD Coordinator (http://www.gmod.org/)
>>>> 216-392-3087
>>>> Cold Spring Harbor Laboratory
>>>>
>>>
>>>
>> -- 
>> ------------------------------------------------------------------------
>> Scott Cain, Ph. D.                                   cain.cshl at gmail.com
>> GMOD Coordinator (http://www.gmod.org/)                      
>> 216-392-3087
>> Cold Spring Harbor Laboratory
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From zhpan99 at yahoo.com  Wed Apr  2 13:52:46 2008
From: zhpan99 at yahoo.com (Pan Zheng)
Date: Wed, 2 Apr 2008 10:52:46 -0700 (PDT)
Subject: [Bioperl-l] installing bioperl-1.5.2 errors:DB_File
Message-ID: <726978.82400.qm@web53105.mail.re2.yahoo.com>

Hi,
   
  I am installing bioperl-1.5.2_102 under cygwin on my Windows XP and having some errors during the process.
   
  When I was running "perl Build test", one major error is the error about DB_File. I tried to install DB_File from cpan and rpm without any luck.
   
  ++++++++++++++++++++++++
  CPAN: File::Temp loaded ok (v0.16)
CPAN: YAML loaded ok (v0.62)
    CPAN.pm: Going to build P/PM/PMQS/DB_File-1.817.tar.gz
  Parsing config.in...
Looks Good.
Checking if your kit is complete...
Looks good
Note (probably harmless): No library found for -ldb
Writing Makefile for DB_File
cp DB_File.pm blib/lib/DB_File.pm
AutoSplitting blib/lib/DB_File.pm (blib/lib/auto/DB_File)
gcc -c  -I/usr/local/BerkeleyDB/include -DPERL_USE_SAFE_PUTENV -fno-strict-aliasing -pipe -Wdeclaration-after-statement -DUSEIMPORTLIB -O3   -DVERSION=\"1.817\" -DXS_VERSION=\"1.817\"  "-I/usr/lib/perl5/5.8/cygwin/CORE"  -D_NOT_CORE  -DmDB_Prefix_t=size_t -DmDB_Hash_t=u_int32_t   version.c
version.c:30:16: db.h: No such file or directory
make: *** [version.o] Error 1
  PMQS/DB_File-1.817.tar.gz
  /usr/bin/make -- NOT OK
Running make test
  Can't test without successful make
Running make install
  Make had returned bad status, install seems impossible
Failed during this command:
 PMQS/DB_File-1.817.tar.gz                    : make NO
  +++++++++++++++++++++++++++++++++++++++++++++++
   
   
  I don't remember having this kind of error while installing an earlier version.
   
  Would you please help me with the DB_File installation?
   
  Thanks.
   
  Pan

       

From dr.hogart at gmail.com  Thu Apr  3 09:01:03 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Thu, 03 Apr 2008 17:01:03 +0400
Subject: [Bioperl-l] support of clustalw2 in bio::run::tool::alignment
Message-ID: <op.t81c31ljavnppr@hogart.img.ras.ru>

As far as I understand, clustalw2 is not supported in bioperl v1.5.2.100.  
In which version will it be supported?
Thank you in advance.


From slduncan at iastate.edu  Thu Apr  3 14:13:16 2008
From: slduncan at iastate.edu (slduncan at iastate.edu)
Date: Thu, 3 Apr 2008 13:13:16 -0500 (CDT)
Subject: [Bioperl-l] help installing bioperl with cygwin
Message-ID: <161313331084931@webmail.iastate.edu>

I am trying to use cpan to install bioperl and I had an error message saying:
c:\Documents not recognized as and external or internal....
Any ideas here.  Also, I am new to the computer world so please be kind. :)

Stacy Duncan
Iowa State University
Bioinformatics and Computational Biology
1802 University Blvd.
VMRI Building 6
Ames, IA 50011-1240
office phone: (515) 294-8385
office fax: (515) 294-1401
home phone: (336) 965-5622
e-mail: slduncan at iastate.edu


From cjfields at uiuc.edu  Fri Apr  4 16:13:23 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 4 Apr 2008 15:13:23 -0500
Subject: [Bioperl-l] help installing bioperl with cygwin
In-Reply-To: <161313331084931@webmail.iastate.edu>
References: <161313331084931@webmail.iastate.edu>
Message-ID: <B7F7923E-4226-4B83-BDC1-8548F0FDB6CC@uiuc.edu>

It's best if you use ActiveState's Perl installation (it's the only  
one we really support at this moment, unless someone wants to give  
StrawberryPerl a run).  See:

http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows

chris

On Apr 3, 2008, at 1:13 PM, slduncan at iastate.edu wrote:

> I am trying to use cpan to install bioperl and I had an error  
> message saying:
> c:\Documents not recognized as and external or internal....
> Any ideas here.  Also, I am new to the computer world so please be  
> kind. :)
>
> Stacy Duncan
> Iowa State University
> Bioinformatics and Computational Biology
> 1802 University Blvd.
> VMRI Building 6
> Ames, IA 50011-1240
> office phone: (515) 294-8385
> office fax: (515) 294-1401
> home phone: (336) 965-5622
> e-mail: slduncan at iastate.edu

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Apr  4 16:07:12 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 4 Apr 2008 15:07:12 -0500
Subject: [Bioperl-l] installing bioperl-1.5.2 errors:DB_File
In-Reply-To: <726978.82400.qm@web53105.mail.re2.yahoo.com>
References: <726978.82400.qm@web53105.mail.re2.yahoo.com>
Message-ID: <F786C444-6A18-4AA5-8AE8-6C0ECEEACC5E@uiuc.edu>

I think you have to use the cygwin installer to install DB_File (it  
also installs dependencies, such as BDB).  According to 'perldoc  
perlcygwin':

....
Optional Libraries for Perl on Cygwin

Several Perl functions and modules depend on the existence of some  
optional libraries. Configure will find them if they are installed in  
one of the directories listed as being used for library searches. Pre- 
built packages for most of these are available from the Cygwin  
installer.
....
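
Once DB_File has been installed through the Cygwin installer, a quick way
to confirm that perl can actually load it is a check along these lines.
This is only a sketch: the helper name module_available is made up for
illustration, and it tests nothing beyond whether the module loads.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 if the named module can be loaded by this perl, 0 otherwise.
# (module_available is a hypothetical helper, not part of BioPerl.)
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Report on the module that the BioPerl test run complained about.
for my $module (qw(DB_File)) {
    printf "%-10s %s\n", $module,
        module_available($module) ? 'ok' : 'MISSING';
}
```

If DB_File still shows up as MISSING, the Berkeley DB headers (db.h) are
probably still not where the build looks for them.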

chris
On Apr 2, 2008, at 12:52 PM, Pan Zheng wrote:

> Hi,
>
>  I am installing bioperl-1.5.2_102 under cygwin on my Windows XP and  
> having some errors during the process.
>
>  When I was running "perl Build test", one major error is the error  
> about DB_File. I tried to install DB_File from cpan and rpm without  
> any luck.
>
>  ++++++++++++++++++++++++
>  CPAN: File::Temp loaded ok (v0.16)
> CPAN: YAML loaded ok (v0.62)
>    CPAN.pm: Going to build P/PM/PMQS/DB_File-1.817.tar.gz
>  Parsing config.in...
> Looks Good.
> Checking if your kit is complete...
> Looks good
> Note (probably harmless): No library found for -ldb
> Writing Makefile for DB_File
> cp DB_File.pm blib/lib/DB_File.pm
> AutoSplitting blib/lib/DB_File.pm (blib/lib/auto/DB_File)
> gcc -c  -I/usr/local/BerkeleyDB/include -DPERL_USE_SAFE_PUTENV -fno-strict-aliasing -pipe -Wdeclaration-after-statement -DUSEIMPORTLIB -O3   -DVERSION=\"1.817\" -DXS_VERSION=\"1.817\"  "-I/usr/lib/perl5/5.8/cygwin/CORE"  -D_NOT_CORE  -DmDB_Prefix_t=size_t -DmDB_Hash_t=u_int32_t   version.c
> version.c:30:16: db.h: No such file or directory
> make: *** [version.o] Error 1
>  PMQS/DB_File-1.817.tar.gz
>  /usr/bin/make -- NOT OK
> Running make test
>  Can't test without successful make
> Running make install
>  Make had returned bad status, install seems impossible
> Failed during this command:
> PMQS/DB_File-1.817.tar.gz                    : make NO
>  +++++++++++++++++++++++++++++++++++++++++++++++
>
>
>  I don't remember having this kind of error while installing an
> earlier version.
>
>  Would you please help me on DB_File installation ?
>
>  Thanks.
>
>  Pan

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Apr  4 17:25:41 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 4 Apr 2008 16:25:41 -0500
Subject: [Bioperl-l] module for pubchem queries
In-Reply-To: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
References: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
Message-ID: <15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>

Do you need something to access eutils via BioPerl, or are you looking  
for a specific set of classes?  I wrote an interface to eutils  
(Bio::DB::EUtilities), you could do something like this:

#!/usr/bin/perl -w

use strict;
use warnings;
use Bio::DB::EUtilities;

my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
                                      -term => 'dihydroorotate',
                                      -db => 'pcsubstance',
                                      -retmax => 1000);

print join(',',$eutil->get_ids)."\n";

chris

On Apr 2, 2008, at 7:24 AM, Robert Citek wrote:

> Hello all,
>
> I have a list of chemical compounds that have some kind of interaction
> with proteins or genes.  The current list contains names or SMILES and
> I would like to get the CID number for those compounds.  Currently,
> I'm using perl to query the NCBI's eutils[1], which works great.  But
> I was just curious to know if there was a bioperl module to do
> something similar.  A quick google didn't turn up anything, so I
> thought I'd ask.
>
> [1] http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
>
> Regards,
> - Robert

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From ekeen at mail.tongji.edu.cn  Mon Apr  7 02:57:04 2008
From: ekeen at mail.tongji.edu.cn (Jinyan Huang)
Date: Mon, 7 Apr 2008 14:57:04 +0800
Subject: [Bioperl-l] How to analysis the relationship of my interesting KEGG
	pathways?
Message-ID: <fb5dae380804062357ka7de019kb3451a5e169c0bf4@mail.gmail.com>

In my research, I found 25 interesting pathways. I want to know the
regulatory relationships between these pathways. It would be best if there
were some software that can connect these KEGG pathways.

Thank you very much in advance.

From miguel.pignatelli at uv.es  Mon Apr  7 06:12:58 2008
From: miguel.pignatelli at uv.es (Miguel Pignatelli)
Date: Mon, 07 Apr 2008 12:12:58 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
Message-ID: <47F9F3AA.2090003@uv.es>

Hi all,

Is there any way to obtain the date of creation of individual GenBank 
entries? I don't mean the "last revision" date that can be found in the 
first line of a GenBank file.

I can access this creation date by looking at the "revision history" of 
any GenBank entry (for example, see
http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105), 
but I need a systematic (and local=fast) way to access this information.

Any help would be much appreciated,
Thank you very much in advance,

M;

From Bank.Beszteri at awi.de  Mon Apr  7 07:46:43 2008
From: Bank.Beszteri at awi.de (=?ISO-8859-1?Q?B=E1nk_Beszteri?=)
Date: Mon, 07 Apr 2008 13:46:43 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
References: <47F22B35.1030502@awi.de>
	<CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
Message-ID: <47FA09A3.2070004@awi.de>

Hi Hilmar,

it was important to understand that the inconsistency in taxon names is 
apparently only between the Swissprot entries with "non-standard" names 
and the contents of the taxonomy tables and that it is best to use a 
pre-loaded taxonomy, thanks for that! We have now updated to 
bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to have 
loaded everything OK in ~26 hours (with many of the "The supplied 
lineage does not start near..." warnings, but no other problems). Our 
next test is to try to load trembl (will try to do this in parallel in 
multiple chunks), hope it will work just as nicely!
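
A minimal sketch of the chunking step, assuming each record in the flat
file ends with a line containing only "//" (as in Swiss-Prot/TrEMBL
files). The function name split_flatfile, the file names and the chunk
size are placeholders for illustration, not part of BioPerl or
bioperl-db.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a Swiss-Prot/TrEMBL-style flat file into chunk files holding at
# most $per_chunk records each. A record is assumed to end with a line
# that is exactly "//". Returns the names of the chunk files written.
sub split_flatfile {
    my ($infile, $prefix, $per_chunk) = @_;
    open my $in, '<', $infile or die "Cannot read $infile: $!";
    my ($out, @chunks);
    my $count = 0;
    while (my $line = <$in>) {
        if (!$out) {
            # Start a new chunk file on the first line of a record.
            my $name = sprintf '%s.%04d.dat', $prefix, scalar(@chunks) + 1;
            open $out, '>', $name or die "Cannot write $name: $!";
            push @chunks, $name;
        }
        print {$out} $line;
        # Close the chunk once it holds $per_chunk complete records.
        if ($line =~ m{^//\s*$} && ++$count % $per_chunk == 0) {
            close $out or die "Cannot close chunk: $!";
            undef $out;
        }
    }
    close $out if $out;
    close $in;
    return @chunks;
}

# Example call (hypothetical file names and chunk size):
# my @files = split_flatfile('uniprot_trembl.dat', 'trembl_chunk', 500_000);
```

Each chunk file could then be passed to its own load_seqdatabase.pl run;
whether several loaders writing to the same BioSQL database at once
behave well is something that would still have to be tested.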

Thanks for your tips & insights!

Bank

Hilmar Lapp wrote:

>
> On Apr 1, 2008, at 8:31 AM, Bánk Beszteri wrote:
>
>> [...] So next we started to test BioSQL, by trying to load just  
>> Swissprot in a MySQL DB first, like:
>>
>> load_seqdatabase.pl --host mysql.awi.de --dbname biosql2 --dbuser  
>> xyz --dbpass abc --driver mysql --namespace uniprot_sprot --format  
>> swiss uniprot_sprot.dat
>>
>> Here we get an error message
>>
>> ###########################################
>>
>> Loading /biodb/spinkern/uniprot_sprot.dat ...
>> Could not store Q6DAH5:
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: The supplied lineage does not start near 'Erwinia carotovora  
>> subsp. atroseptica' (I was supplied 'Erwinia carotovora subsp. |  
>> Pectobacterium | Enterobacteriaceae | Enterobacteriales |  
>> Gammaproteobacteria | Proteobacteria | Bacteria')
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Root/Root.pm:359
>> STACK: Bio::Species::classification /biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Species.pm:174
>> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:552
>> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/SpeciesAdaptor.pm:281
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1305
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:973
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:852
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182
>> STACK: Bio::DB::Persistent::PersistentObject::create /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:244
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>> STACK: Bio::DB::Persistent::PersistentObject::store /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:271
>> STACK: load_seqdatabase.pl:622
>> -----------------------------------------------------------
>>
>> at load_seqdatabase.pl line 635
>>
>> ############################################
>>
>> or similar, depending on whether we use a pre-loaded ncbi taxonomy  
>> or not
>
>
> I recommend to always use a pre-loaded NCBI taxonomy unless you know  
> there are only a few organisms that are straightforward (for the  
> parser, that is).
>
>> , and which Swissprot release we are trying to load. It often seems
>> to come from something like, here, "subsp." or another special addition
>> to the species line; but alternative genus names and other curious
>> things also seem to appear. It looks like Species.pm tries to validate
>> the species name against the lineage info already there in the BioSQL
>> DB, and in several cases, it finds inconsistencies.
>
>
> It actually happens upon a successful lookup when the species object  
> is populated from the database.
>
>> [...]
>> The only workaround we have found so far was to comment out line  174 
>> in Species.pm:
>>
>> $self->throw("The supplied lineage does not start near '$name' (I  
>> was supplied '".join(" | ", @vals)."')");
>
>
> That should be OK if you work with a pre-loaded taxonomy. It's sort  
> of a sanity check that should catch a parser having messed up a  
> species. If you use a pre-loaded NCBI taxonomy the results of the  
> species parsing don't matter in all details so long as the NCBI  
> taxonID is parsed out correctly, and then found in the database.
>
> Note that this is actually a warn() in the main trunk version of  
> BioPerl, so you might want to upgrade to that (or change throw() to  
> warn() in your version). You still get the records flagged with that,  
> but it isn't an exception.
>
>>
>> After doing so, load_seqdatabase.pl runs for several hours (until it
>> eventually crashes; I haven't found out yet why), but proceeds really
>> slowly.
>
>
> It should certainly *not* crash. Note also that you can supply --safe  
> on the command line, in which case the script will continue with the  
> next record if one fails to load for whatever reason.
>
> You will want to adjust the width constraint of dbxref.accession, for  
> example to 128 chars. This will also be fixed for BioSQL 1.0.1.
> See http://bugzilla.open-bio.org/show_bug.cgi?id=2474
>
>
>> I also found some info on this for Pg and Oracle in the mailing  
>> list, but has anyone some approximate numbers for MySQL, how long  
>> should a first Swissprot load take?
>
>
> Possibly around 20 hours according to Erik Rijkers:
> See http://lists.open-bio.org/pipermail/bioperl-l/2008-March/027427.html
>
> You can use the --logchunks N option to have it print out performance  
> statistics every N records.
>
> Hope this helps,
>
>     -hilmar


From cjfields at uiuc.edu  Mon Apr  7 08:32:45 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 7 Apr 2008 07:32:45 -0500
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47FA09A3.2070004@awi.de>
References: <47F22B35.1030502@awi.de>
	<CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
	<47FA09A3.2070004@awi.de>
Message-ID: <E8A1ED59-830D-473F-8818-1BAC4E0A2FA2@uiuc.edu>

The warnings are something that we still need to resolve, but the only  
fix I can think of likely breaks backward compatibility with older  
bioperl-db installations (i.e. storing the given scientific name  
instead of the binomial name, which is used as a fallback when no  
taxid is found).  There is a full explanation here:

http://bugzilla.open-bio.org/show_bug.cgi?id=2092

Anyway, I think it needs further testing when someone, likely Hilmar  
or I, have time.

chris

On Apr 7, 2008, at 6:46 AM, Bánk Beszteri wrote:

> Hi Hilmar,
>
> it was important to understand that the inconsistency in taxon names  
> is apparently only between the Swissprot entries with "non-standard"  
> names and the contents of the taxonomy tables and that it is best to  
> use a pre-loaded taxonomy, thanks for that! We have now updated to  
> bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to  
> have loaded everything OK in ~26 hours (with many of the "The  
> supplied lineage does not start near..." warnings, but no other  
> problems). Our next test is to try to load trembl (will try to do  
> this in parallel in multiple chunks), hope it will work just as  
> nicely!
>
> Thanks for your tips & insights!
>
> Bank

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Mon Apr  7 08:34:00 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 07 Apr 2008 13:34:00 +0100
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47FA09A3.2070004@awi.de>
References: <47F22B35.1030502@awi.de>	<CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
	<47FA09A3.2070004@awi.de>
Message-ID: <47FA14B8.7000500@sendu.me.uk>

Bánk Beszteri wrote:
> Hi Hilmar,
> 
> it was important to understand that the inconsistency in taxon names is 
> apparently only between the Swissprot entries with "non-standard" names 
> and the contents of the taxonomy tables and that it is best to use a 
> pre-loaded taxonomy, thanks for that! We have now updated to 
> bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to have 
> loaded everything OK in ~26 hours (with many of the "The supplied 
> lineage does not start near..." warnings, but no other problems).

Can you provide some examples of these warnings (of the taxons that 
cause them)? If there's anything consistent about them perhaps 
Bio::Species can be improved to accommodate them properly (instead of 
just issuing the warning and getting the classification wrong).


From heikki at sanbi.ac.za  Mon Apr  7 08:48:34 2008
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Mon, 7 Apr 2008 14:48:34 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47F9F3AA.2090003@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
Message-ID: <200804071448.34769.heikki@sanbi.ac.za>

Miguel,

You probably know this but:

- Your entry example below is a GenPept entry, not a GenBank entry
- The NCBI sequence format "genbank" has only the last modified date.
   I do not know about other formats (ASN.1, ...)
- NCBI Entrez is a great tool but it obscures the source database.
- If you really are working on real GenBank entries, you can use the accession 
number to find the corresponding EMBL (and Swiss-Prot) flat file formats that 
have both creation and last modified dates.

Post to the list if you have trouble getting the dates from EMBL/Swiss-Prot 
formats using bioperl.
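
As a rough illustration of pulling both dates out of an EMBL flat file,
here is a plain-Perl sketch that reads the DT lines directly instead of
going through BioPerl. It assumes the EMBL layout
"DT   DD-MON-YYYY (Rel. N, Created)" and
"DT   DD-MON-YYYY (Rel. N, Last updated, Version N)"; Swiss-Prot DT lines
may be worded differently and would need their own pattern. The function
name embl_dates is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect the created/updated dates for each entry read from an
# EMBL-style flat file handle. Returns a hash ref of the form
#   { ENTRY_ID => { created => 'DD-MON-YYYY', updated => 'DD-MON-YYYY' } }
sub embl_dates {
    my ($fh) = @_;
    my (%dates, $id);
    while (my $line = <$fh>) {
        if ($line =~ /^ID\s+([^\s;]+)/) {
            $id = $1;                      # start of a new entry
        }
        elsif (defined $id
            && $line =~ /^DT\s+(\d{2}-[A-Z]{3}-\d{4})\s+\(Rel\. \d+, (Created|Last updated)/)
        {
            my $key = $2 eq 'Created' ? 'created' : 'updated';
            $dates{$id}{$key} = $1;
        }
        elsif ($line =~ m{^//}) {
            undef $id;                     # end of the current entry
        }
    }
    return \%dates;
}

# Example call (hypothetical file name):
# open my $fh, '<', 'entries.embl' or die "Cannot open: $!";
# my $dates = embl_dates($fh);
```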

Yours,

	-Heikki

On Monday 07 April 2008 12:12:58 Miguel Pignatelli wrote:
> Hi all,
>
> Is there any way to obtain the date of creation of individual GenBank
> entries? I don't mean the "last revision" date that can be found in the
> first line of a GenBank file.
>
> I can access this creation date by looking at the "revision history" of
> any GenBank entry (for example, see
> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
> but I need a systematic (and local=fast) way to access this information.
>
> Any help would be much appreciated,
> Thank you very much in advance,
>
> M;


-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From granjeau at tagc.univ-mrs.fr  Mon Apr  7 09:30:10 2008
From: granjeau at tagc.univ-mrs.fr (Samuel GRANJEAUD - IR/ICIM)
Date: Mon, 07 Apr 2008 15:30:10 +0200
Subject: [Bioperl-l] help installing bioperl with cygwin
In-Reply-To: <B7F7923E-4226-4B83-BDC1-8548F0FDB6CC@uiuc.edu>
References: <161313331084931@webmail.iastate.edu>
	<B7F7923E-4226-4B83-BDC1-8548F0FDB6CC@uiuc.edu>
Message-ID: <47FA21E2.3010602@tagc.univ-mrs.fr>

Hi,

I'm using BioPerl under Cygwin, because Cygwin lets one work in a 
Unix-like environment from a command-line point of view.

So I use the CVS version, which runs out of the box
http://www.bioperl.org/wiki/Using_CVS
(CVS was replaced by SVN at the beginning of the year)
http://www.bioperl.org/wiki/Using_Subversion

So if you really want to work under Cygwin, you can try this quick and 
dirty way, but you will still need to gain some experience, because 
BioPerl is not supported under Cygwin.

You may try Strawberry, but in my experience in installing wxPerl, 
wxPerl fails on both flavours of Perl. ActiveState's Perl is still the 
easiest way to install many packages.

Regards,
Samuel


Chris Fields wrote:
> It's best if you use ActiveState's Perl installation (it's the only 
> one we really support at this moment, unless someone wants to give 
> StrawberryPerl a run).  See:
>
> http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows
>
> chris
>
> On Apr 3, 2008, at 1:13 PM, slduncan at iastate.edu wrote:
>
>> I am trying to use cpan to install bioperl and I had an error message 
>> saying:
>> c:\Documents not recognized as and external or internal....
>> Any ideas here.  Also, I am new to the computer world so please be 
>> kind. :)
>>
>> Stacy Duncan
>> Iowa State University
>> Bioinformatics and Computational Biology
>> 1802 University Blvd.
>> VMRI Building 6
>> Ames, IA 50011-1240
>> office phone: (515) 294-8385
>> office fax: (515) 294-1401
>> home phone: (336) 965-5622
>> e-mail: slduncan at iastate.edu
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign

-- 

Samuel GRANJEAUD                   granjeau at tagc.univ-mrs.fr
INSERM - ICIM - TAGC               Tel: +33  (0)491 82 87 24
http://tagc.univ-mrs.fr            Fax: +33  (0)491 82 87 01
http://icim.marseille.inserm.fr/proteomique


From er at xs4all.nl  Mon Apr  7 10:36:57 2008
From: er at xs4all.nl (Erik)
Date: Mon, 7 Apr 2008 16:36:57 +0200 (CEST)
Subject: [Bioperl-l] Indexing large databases / BioSQL
Message-ID: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>

On Mon, April 7, 2008 14:34, Sendu Bala wrote:
> Bánk Beszteri wrote:
>> Hi Hilmar,
>>
>> it was important to understand that the inconsistency in taxon names is
>> apparently only between the Swissprot entries with "non-standard" names
>> and the contents of the taxonomy tables and that it is best to use a
>> pre-loaded taxonomy, thanks for that! We have now updated to
>> bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to have
>> loaded everything OK in ~26 hours (with many of the "The supplied
>> lineage does not start near..." warnings, but no other problems).
>
> Can you provide some examples of these warnings (of the taxons that
> cause them)? If there's anything consistent about them perhaps
> Bio::Species can be improved to accommodate them properly (instead of
> just issuing the warning and getting the classification wrong).
>

I did this a little while ago and saved the output
(UniProtKB/Swiss-Prot Release 55.1 of 18-Mar-2008, I think).

All warnings (and a few errors) for swissprot are here:

   http://bugzilla.open-bio.org/show_bug.cgi?id=2474

as an attached file

I suppose the OP will have encountered similar output - I don't think there is
much RDBMS-type-dependency involved.

   regards,

   Erik Rijkers


From cjfields at uiuc.edu  Mon Apr  7 11:46:01 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 7 Apr 2008 10:46:01 -0500
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <200804071448.34769.heikki@sanbi.ac.za>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es> <200804071448.34769.heikki@sanbi.ac.za>
Message-ID: <2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>

Strangely enough, if you use NCBI's esummary you can get both dates.
Via Bio::DB::EUtilities in bioperl-live, if you dump out DocSum data
(using a debugging method I added a while back):

---------------------------------------

use Bio::DB::EUtilities;

# for multiple IDs use an array ref; also only use GI's (not accessions)
my $factory = Bio::DB::EUtilities->new(
                         -eutil => 'esummary',
                         -db => 'protein',
                         -id => 1621261);

$factory->print_DocSums;

---------------------------------------

One gets the following tag/value pairs:

UID: 1621261
Caption             :CAB02640
Title               :PROBABLE PYRIMIDINE OPERON REGULATORY PROTEIN PYRR [Mycobacterium tuberculosis H37Rv]
Extra               :gi|1621261|emb|CAB02640.1|[1621261]
Gi                  :1621261
CreateDate          :2003/11/21
UpdateDate          :2006/11/14
Flags               :
TaxId               :83332
Length              :193
Status              :live
ReplacedBy          :
Comment             :

I'll add in a method to grab the data element by tag (in this case,  
grab the creation date by asking for the 'CreateDate' key).  Might  
come in handy for scripts.
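
Until such a method exists, the printed tag/value pairs can be picked apart by hand. The sketch below is in Python rather than Perl and only parses text pasted from the output above; the `parse_docsum` helper is hypothetical and is not part of Bio::DB::EUtilities. The wrapped Title field is left out to keep the sample short.

```python
# Hypothetical helper: parse the "Tag :value" lines printed above into a dict.
# It only handles the text layout shown in this message; it does not call NCBI.

DOCSUM_TEXT = """\
UID: 1621261
Caption             :CAB02640
Extra               :gi|1621261|emb|CAB02640.1|[1621261]
Gi                  :1621261
CreateDate          :2003/11/21
UpdateDate          :2006/11/14
Flags               :
TaxId               :83332
Length              :193
Status              :live
ReplacedBy          :
Comment             :
"""


def parse_docsum(text):
    """Return {tag: value} for every line that contains a colon."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # wrapped continuation lines without a colon are skipped
        tag, _, value = line.partition(":")  # split on the first colon only
        fields[tag.strip()] = value.strip()
    return fields


docsum = parse_docsum(DOCSUM_TEXT)
print(docsum["CreateDate"])  # 2003/11/21
print(docsum["UpdateDate"])  # 2006/11/14
```

Splitting on the first colon only keeps values such as the Extra field intact, and empty fields (Flags, Comment) come back as empty strings.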

chris

On Apr 7, 2008, at 7:48 AM, Heikki Lehvaslaiho wrote:

> Miguel,
>
> You probably know this but:
>
> - Your entry example below is a GenPept entry, not a GenBank entry
> - The NCBI sequence format "genbank" has only the last modified date.
>   I do not know about other formats (ASN.1, ...)
> - NCBI Entrez is a great tool but it obscures the source database.
> - If you really are working on real GenBank entries, you can use the
> accession number to find the corresponding EMBL (and Swiss-Prot) flat file
> formats, which have both creation and last modified dates.
>
> Post to the list if you have trouble getting the dates from EMBL/Swiss-Prot
> formats using bioperl.
>
> Yours,
>
> 	-Heikki
>
> On Monday 07 April 2008 12:12:58 Miguel Pignatelli wrote:
>> Hi all,
>>
>> Is there any way to obtain the date of creation of individual GenBank
>> entries? I don't mean the "last revision" date that can be found in the
>> first line of a GenBank file.
>>
>> I can access this creation date by looking at the "revision history" of
>> any GenBank entry (for example, see
>> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
>> but I need a systematic (and local=fast) way to access this information.
>>
>> Any help would be very appreciated,
>> Thank you very much in advance,
>>
>> M;
>
>
>
> -- 
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________


From miguel.pignatelli at uv.es  Mon Apr  7 12:24:50 2008
From: miguel.pignatelli at uv.es (Miguel Pignatelli)
Date: Mon, 07 Apr 2008 18:24:50 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es> <200804071448.34769.heikki@sanbi.ac.za>
	<2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>
Message-ID: <47FA4AD2.5030206@uv.es>


I've noticed that the ASN.1 version of those records has a 
"creation-date" tag.
But this is somewhat strange: the creation date you obtained, and the one 
given in the ASN.1 format, is 2003/11/21, whereas the revision history of 
the record:

http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=CAB02640

reports a creation date of "Oct 19 1996 12:28 AM".

I don't know how to get this date, because the EMBL version of this gene:

http://www.ebi.ac.uk/cgi-bin/dbfetch?db=emblcds&id=CAB02640&style=raw

doesn't have DT fields at all.

M;



From cjfields at uiuc.edu  Mon Apr  7 13:48:45 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 7 Apr 2008 12:48:45 -0500
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47FA4AD2.5030206@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es> <200804071448.34769.heikki@sanbi.ac.za>
	<2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>
	<47FA4AD2.5030206@uv.es>
Message-ID: <CA410982-12F9-4289-8B54-87BE33A38085@uiuc.edu>

Note in the example I gave that, during the revision history, the  
DBSOURCE changed at the point of the creation date (the original nuc.  
record was a M. tuberculosis contig sequence, which later changed to  
an updated full M. tuberculosis genome record at the time of the  
'create date').

I couldn't find anything specific in the GenBank docs on this, but it
appears that (at least for a protein record) the creation date reflects the
date on which the sequence was either originally deposited or
originally derived from the nucleotide source record currently listed in the
record.  In other words, it may not reflect the original date of
deposition (which could have come from a different record, as in this
case).

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Bank.Beszteri at awi.de  Tue Apr  8 03:35:43 2008
From: Bank.Beszteri at awi.de (=?ISO-8859-1?Q?B=E1nk_Beszteri?=)
Date: Tue, 08 Apr 2008 09:35:43 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
Message-ID: <47FB204F.90405@awi.de>


>>Can you provide some examples of these warnings (of the taxons that
>>cause them)? If there's anything consistent about them perhaps
>>Bio::Species can be improved to accommodate them properly (instead of
>>just issuing the warning and getting the classification wrong).
>>    
>>
>
>All warnings (and a few errors) for swissprot are here:
>
>   http://bugzilla.open-bio.org/show_bug.cgi?id=2474
>
>as an attached file
>
>I suppose the OP will have encountered similar output - I don't think there is
>much RDBMS-type-dependency involved.
>  
>
Hi Erik & Sendu,

yes, the same kind of thing, probably no DBMS-type dependency; in case 
it could be useful, I uploaded my output as a second attachment to the 
bugzilla report cited above.

Bank

From heikki at sanbi.ac.za  Tue Apr  8 04:32:12 2008
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Tue, 8 Apr 2008 10:32:12 +0200
Subject: [Bioperl-l] Blast database sequence retrieval perl script
In-Reply-To: <6BEABCD5CA640A44A848448A42A03B73079E48C9@ilrikeadx1.ILRI.CGIARAD.ORG>
References: <6BEABCD5CA640A44A848448A42A03B73079E48C9@ilrikeadx1.ILRI.CGIARAD.ORG>
Message-ID: <200804081032.12312.heikki@sanbi.ac.za>


Dear Nelson,

I am cc:ing the bioperl mailing list, where all queries of this kind should 
go. More people can help you that way.


Since you have your own local data set, you need to create an index that 
catalogues your sequences for easy retrieval.

You need to install bioperl-live first. See for example: 	
	http://www.bioperl.org/wiki/Using_Subversion

Then you can follow this HOWTO:
	http://www.bioperl.org/wiki/HOWTO:Flat_databases

The other HOWTOs will help you deal with the BioPerl sequence objects that are 
retrieved: http://www.bioperl.org/wiki/HOWTOs
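
To illustrate what such an index does, here is a minimal sketch in Python (not BioPerl). It is not how Bio::DB::Flat stores its index; it only shows the idea of recording where each record starts so a single entry can be fetched without scanning the whole file. The function names, contig names and sequences are made up for the example.

```python
# Sketch of an ID -> byte-offset index for a FASTA file. This illustrates the
# idea of indexing only; it is NOT how Bio::DB::Flat stores its index.
import os
import tempfile


def build_index(path):
    """Map each sequence ID (first word after '>') to its header's byte offset."""
    index = {}
    with open(path, "rb") as handle:
        while True:
            offset = handle.tell()
            line = handle.readline()
            if not line:
                break  # end of file
            if line.startswith(b">"):
                words = line[1:].split()
                if words:  # ignore a bare '>' with no ID
                    index[words[0].decode("ascii")] = offset
    return index


def fetch(path, index, seq_id):
    """Seek straight to one record and return its sequence as a string."""
    parts = []
    with open(path, "rb") as handle:
        handle.seek(index[seq_id])
        handle.readline()  # skip the header line
        for line in handle:
            if line.startswith(b">"):
                break  # the next record starts here
            parts.append(line.strip().decode("ascii"))
    return "".join(parts)


# Tiny made-up example file; the contig names are placeholders.
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">contig1 example\nACGT\nTTGA\n>contig2 example\nGGCC\n")
    fasta_path = tmp.name

idx = build_index(fasta_path)
seq1 = fetch(fasta_path, idx, "contig1")
seq2 = fetch(fasta_path, idx, "contig2")
os.remove(fasta_path)

print(seq1)  # ACGTTTGA
print(seq2)  # GGCC
```

With about 50,000 contigs the index itself is small, and each lookup is a single seek instead of a scan of the whole file.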


Yours,

	-Heikki


On Monday 07 April 2008 14:50:23 Ndegwa, Nelson (IITA-Nairobi) wrote:
> Dear Prof. Heikki,
>
> Hi. We met at the Pathogen Bioinformatics Conference held in Nairobi
> Kenya in May 2007 at ICIPE. I recall you are a developer of Bioperl and
> Perl. I have managed to install a local Blast, having just cowpea Contig
> sequences, about 50,000 in total. This runs fine, as I can perform
> various queries and get results. However, any good match/hit on the
> local Blast database is hard to retrieve and the only option seems to go
> back to that database and search manually for the top hit sequence - an
> exceedingly manual task. Might you perhaps have a Perl script I
> could adapt to my database to help with this task, such that the hits
> have a hyperlink which can be used to retrieve that specific entry? I
> have limited knowledge of Perl. Thank you.
>
> With Kind Regards,
>
> Nelson.


-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From David.Messina at sbc.su.se  Tue Apr  8 07:29:12 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 8 Apr 2008 13:29:12 +0200
Subject: [Bioperl-l] How to analysis the relationship of my interesting KEGG
	pathways?
In-Reply-To: <628aabb70804080053g1fd9120ex9d5fd12f65f216f9@mail.gmail.com>
References: <fb5dae380804062357ka7de019kb3451a5e169c0bf4@mail.gmail.com>
	<628aabb70804080053g1fd9120ex9d5fd12f65f216f9@mail.gmail.com>
Message-ID: <628aabb70804080429k2aa17a6eu12197709d4cc1af0@mail.gmail.com>

Hi Jinyan,

You asked a similar question last week and received a couple of suggestions
-- did you take a look at those?

I'm not an expert on this topic, but I believe that since regulatory
information is much harder to obtain experimentally and therefore much less
well known, there isn't a lot of it in pathway databases like KEGG. You may
have to look through the literature and start trying to put together
possible regulatory links on your own.

Dave

From hrh at sanger.ac.uk  Tue Apr  8 08:48:32 2008
From: hrh at sanger.ac.uk (Hans Rudolf Hotz)
Date: Tue, 8 Apr 2008 13:48:32 +0100 (BST)
Subject: [Bioperl-l] Blast database sequence retrieval perl script
In-Reply-To: <200804081032.12312.heikki@sanbi.ac.za>
References: <6BEABCD5CA640A44A848448A42A03B73079E48C9@ilrikeadx1.ILRI.CGIARAD.ORG>
	<200804081032.12312.heikki@sanbi.ac.za>
Message-ID: <Pine.LNX.4.64.0804081340180.7147@deskpro50.dynamic.sanger.ac.uk>

Nelson

or simply use the BLAST indices for the sequence retrieval as well.

All you need to do is add the "-o" option to the 'formatdb' command for 
the BLAST index creation (this will create some extra files). Then you can 
use 'fastacmd' (which is also part of the NCBI BLAST package) to retrieve 
the sequences.
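
As a rough sketch of those two steps, the Python snippet below only assembles the command lines and prints them. Only "-o" is named above; the other flags (-i, -p, -d, -s) are my assumption of the usual legacy NCBI BLAST options and should be checked against the help output of your installed formatdb and fastacmd. The file name and sequence ID are placeholders.

```python
# Sketch: assemble the two legacy NCBI BLAST command lines described above.
# File and sequence names are placeholders; nothing is executed here.
import shlex


def formatdb_cmd(fasta, protein=False):
    """formatdb with '-o T' so sequence IDs are parsed and indexed."""
    return ["formatdb", "-i", fasta, "-p", "T" if protein else "F", "-o", "T"]


def fastacmd_cmd(database, seq_id):
    """fastacmd lookup of one entry by its ID in an indexed database."""
    return ["fastacmd", "-d", database, "-s", seq_id]


for cmd in (formatdb_cmd("cowpea_contigs.fasta"),
            fastacmd_cmd("cowpea_contigs.fasta", "contig00042")):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

If the tools are installed, each list can be passed to `subprocess.run(cmd, check=True)` as is.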


Hans



-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 

From robert.citek at gmail.com  Tue Apr  8 10:09:27 2008
From: robert.citek at gmail.com (Robert Citek)
Date: Tue, 8 Apr 2008 09:09:27 -0500
Subject: [Bioperl-l] module for pubchem queries
In-Reply-To: <15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>
References: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
	<15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>
Message-ID: <4145b6790804080709l20f1e56erf4b7af04b0a52870@mail.gmail.com>

Wrapping bioperl around eutils will work just fine.  Thanks for the pointer.

http://search.cpan.org/~sendu/bioperl-1.5.2_102/Bio/DB/EUtilities.pm

Regards,
- Robert

On Fri, Apr 4, 2008 at 4:25 PM, Chris Fields <cjfields at uiuc.edu> wrote:
> Do you need something to access eutils via BioPerl, or are you looking for a
> specific set of classes?  I wrote an interface to eutils
> (Bio::DB::EUtilities), you could do something like this:
>
>  #!/usr/bin/perl -w
>
>  use strict;
>  use warnings;
>  use Bio::DB::EUtilities;
>
>  my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
>                                      -term => 'dihydroorotate',
>                                      -db => 'pcsubstance',
>                                      -retmax => 1000);
>
>  print join(',',$eutil->get_ids)."\n";
>
>  chris

From cjfields at uiuc.edu  Tue Apr  8 11:10:26 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 8 Apr 2008 10:10:26 -0500
Subject: [Bioperl-l] module for pubchem queries
In-Reply-To: <4145b6790804080709l20f1e56erf4b7af04b0a52870@mail.gmail.com>
References: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
	<15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>
	<4145b6790804080709l20f1e56erf4b7af04b0a52870@mail.gmail.com>
Message-ID: <32D210FC-575E-4D95-95DA-FC6F5BE1FC24@uiuc.edu>

Just to note, the API has changed significantly from the interface  
in the 1.5.2 release.  The up-to-date (supported) interface is in  
subversion; there are some example recipes here:

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

I'm working on a full HOWTO, just haven't had time to get it up on the  
wiki yet.

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cuiw at ncbi.nlm.nih.gov  Tue Apr  8 16:41:58 2008
From: cuiw at ncbi.nlm.nih.gov (Cui, Wenwu (NIH/NLM/NCBI) [C])
Date: Tue, 8 Apr 2008 16:41:58 -0400
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47F9F3AA.2090003@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com><264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
Message-ID: <6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>

Hi, Miguel:

id1_fetch can do it. Detailed instructions can be found at:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id1_fetch.html

Here is an example:

>id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
GI        Loaded      DB    Retrieval No.
--        ------      --    -------------
74311105  12/07/2007  NCBI  19766263
74311105  01/23/2007  NCBI  16325656
74311105  03/30/2006  NCBI  13131204
74311105  03/03/2006  NCBI  12915541
74311105  03/02/2006  NCBI  12885275
74311105  12/03/2005  NCBI  12259793
74311105  09/09/2005  NCBI  11257262
74311105  09/09/2005  NCBI  11242667
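
A table like this is easy to reduce to one date per GI. The sketch below is in Python and works on the pasted text only. It assumes the "Loaded" column is MM/DD/YYYY and that the oldest row is a reasonable stand-in for the creation date, which may not hold for every record. The `earliest_loaded` helper is a hypothetical name.

```python
# Sketch: find the oldest "Loaded" date per GI in the revision table above.
from datetime import datetime

REVISIONS = """\
GI        Loaded      DB    Retrieval No.
--        ------      --    -------------
74311105  12/07/2007  NCBI  19766263
74311105  01/23/2007  NCBI  16325656
74311105  03/30/2006  NCBI  13131204
74311105  03/03/2006  NCBI  12915541
74311105  03/02/2006  NCBI  12885275
74311105  12/03/2005  NCBI  12259793
74311105  09/09/2005  NCBI  11257262
74311105  09/09/2005  NCBI  11242667
"""


def earliest_loaded(text):
    """Return {gi: earliest date} from rows shaped like 'GI date DB number'."""
    earliest = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # header, separator, or blank line
        gi = fields[0]
        loaded = datetime.strptime(fields[1], "%m/%d/%Y").date()
        if gi not in earliest or loaded < earliest[gi]:
            earliest[gi] = loaded
    return earliest


first_seen = earliest_loaded(REVISIONS)
print(first_seen["74311105"].isoformat())  # 2005-09-09
```

Because the result is keyed by GI, the same function can be fed the concatenated output for many GIs.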

Wenwu Cui PhD
NCBI/NLM/NIH



From miguel.pignatelli at uv.es  Wed Apr  9 07:32:39 2008
From: miguel.pignatelli at uv.es (Miguel Pignatelli)
Date: Wed, 09 Apr 2008 13:32:39 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com><264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
	<6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>
Message-ID: <47FCA957.5040409@uv.es>

Wow, impressive, thanks Wenwu for the information, I have never used 
this tool before. The problem is that I need to know all the revision 
history (or at least the creation date) for *all* the GIs present in nr 
(well, or at least a significant portion of it) and this tool queries 
via web.

The existence of this tool confirms to me that this information is 
available somewhere. Is it possible to download the data that contains 
this information?

Thanks again,

M;



From cuiw at ncbi.nlm.nih.gov  Wed Apr  9 09:25:16 2008
From: cuiw at ncbi.nlm.nih.gov (Cui, Wenwu (NIH/NLM/NCBI) [C])
Date: Wed, 9 Apr 2008 09:25:16 -0400
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47FCA957.5040409@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com><264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
	<6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>
	<47FCA957.5040409@uv.es>
Message-ID: <6F230E9769AA8D4EB4BC401DF133EDB7180BE1@NIHCESMLBX15.nih.gov>

Hi, Miguel,

I do not know whether the data file is publicly available. However,
you can perform a 'real time' query via id1_fetch:

####step 1: generate GI file #####
id1_fetch -query 'YOUR-GENBANK-QUERY-STRING' -lt none -db Nucleotide -out qfile

####step 2: retrieve revisions for GIs stored in qfile #####

id1_fetch -lt revisions -qf qfile  -fmt fasta -db Nucleotide
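
For scripting, the two steps can be written as argument lists. This Python sketch copies the flags from the commands above and only prints the resulting command lines; the function names are hypothetical, and the query string and file name are placeholders.

```python
# Sketch: the two id1_fetch steps above as argument lists for subprocess.
# Flags are copied from the message; nothing is executed here.
import shlex


def gi_list_cmd(query, out_file="qfile", db="Nucleotide"):
    """Step 1: write the GIs matching a query to out_file."""
    return ["id1_fetch", "-query", query, "-lt", "none",
            "-db", db, "-out", out_file]


def revisions_cmd(gi_file="qfile", db="Nucleotide"):
    """Step 2: list the revision history for every GI in gi_file."""
    return ["id1_fetch", "-lt", "revisions", "-qf", gi_file,
            "-fmt", "fasta", "-db", db]


steps = [gi_list_cmd("YOUR-GENBANK-QUERY-STRING"), revisions_cmd()]
for step in steps:
    print(" ".join(shlex.quote(arg) for arg in step))
```

Assuming id1_fetch is installed and on the PATH, each list could be run with `subprocess.run(step, check=True, capture_output=True, text=True)` and the captured output parsed for dates.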

Good luck!

Wenwu Cui

> -----Original Message-----
> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
> Sent: Wednesday, April 09, 2008 7:33 AM
> To: Cui, Wenwu (NIH/NLM/NCBI) [C]
> Cc: bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] GenBank entries creation dates
> 
> Wow, impressive, thanks Wenwu for the information, I have never used
> this tool before. The problem is that I need to know all the revision
> history (or at least the creation date) for *all* the GIs present in
nr
> (well, or at least a significant portion of it) and this tool queries
> via web.
> 
> The existence of this tool confirms me that this information is
> available somewhere, is it possible to download the data that contains
> this information?
> 
> Thanks again,
> 
> M;
> 
> 
> Cui, Wenwu (NIH/NLM/NCBI) [C] wrote:
> > Hi, Miguel:
> >
> > id1_fetch can do it. Detailed instructions can be found at:
> >
> > http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id1_fetch.html
> >
> > Here is an example:
> >
> >> id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
> > GI        Loaded      DB    Retrieval No.
> > --        ------      --    -------------
> > 74311105  12/07/2007  NCBI  19766263
> > 74311105  01/23/2007  NCBI  16325656
> > 74311105  03/30/2006  NCBI  13131204
> > 74311105  03/03/2006  NCBI  12915541
> > 74311105  03/02/2006  NCBI  12885275
> > 74311105  12/03/2005  NCBI  12259793
> > 74311105  09/09/2005  NCBI  11257262
> > 74311105  09/09/2005  NCBI  11242667
> >
> > Wenwu Cui PhD
> > NCBI/NLM/NIH
> >
> >> -----Original Message-----
> >> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
> >> Sent: Monday, April 07, 2008 6:13 AM
> >> Cc: bioperl-l at bioperl.org
> >> Subject: [Bioperl-l] GenBank entries creation dates
> >>
> >> Hi all,
> >>
> >> Is there any way to obtain the date of creation of individual GenBank
> >> entries? I don't mean the "last revision" date that can be found in the
> >> first line of a GenBank file.
> >>
> >> I can access this creation date by looking at the "revision history" of
> >> any GenBank entry (for example, see
> >> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
> >> but I need a systematic (and local=fast) way to access this
> >> information.
> >>
> >> Any help would be very appreciated,
> >> Thank you very much in advance,
> >>
> >> M;
> >
> >


From CALLEY_JOHN_N at LILLY.COM  Wed Apr  9 09:45:23 2008
From: CALLEY_JOHN_N at LILLY.COM (John N Calley)
Date: Wed, 9 Apr 2008 09:45:23 -0400
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47FCA957.5040409@uv.es>
Message-ID: <OF73E5AA49.8E1EF918-ON85257426.004AF961-85257426.004B915C@EliLilly.lilly.com>

You might want to keep in mind that the creation date is not always 
reliable. I am aware of one example where the recorded creation date 
precedes the sequencing date by several months (as determined by the trace 
file date). NCBI was not able to explain exactly what happened but (as I 
recall) hypothesized that some dates had been scrambled in a database 
rebuild. If there is interest, I could probably pull up more details.

John Calley




From frederic.romagne at gmail.com  Wed Apr  9 16:45:50 2008
From: frederic.romagne at gmail.com (Frédéric Romagné)
Date: Wed, 09 Apr 2008 15:45:50 -0500
Subject: [Bioperl-l] question about clustalw module.
Message-ID: <1207773950.483.13.camel@kiss-laptop>

Hello,

I have a problem when using Bio::Tools::Run::Alignment::Clustalw:

I give it an array reference (the array contains some FASTA sequences)
and the appropriate parameters, and I write the result via Bio::SeqIO.

The problem is that my result file contains only the accession number in
the header. An example:

The initial stream is:

>NM_052854 Homo sapiens cAMP responsive element binding protein 3-like 1 (CREB3L1), mRNA.
AGAAGACGTGCGGAGGGAGACGCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGG
GGGAGCACTTAGCTCCCCCGCCCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTC
AGCCCCAACCCCGGGCTCCCCATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGT
GGAGTCGGCTGAATGCCCACGGTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCG
CTGCCCTAAGGCCCCCGCGCGCCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCG
CCCCTCCCCCGGGGCTTCGCCCCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAG
GAGCTCTGGACTGGGCGCGCCGCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCC
CGGGAGCCGGCTGCGATGGACGCCGTCTTGGAACCCTTCCCGGCCGACAGGCTGTTCCCC
GGATCCAGCTTCCTGGACTTGGGGGATCTGAACGAGTCGGACTTCCTCAACAATGCGCAC

...

The result file is:

>NM_052854
---------------------------------------AGAAGACGTGCGGAGGGAGAC
GCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGGGGGAGCACTTAGCTCCCCCGC
CCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTCAGCCCCAACCCCGGGCTCCCC
ATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGTGGAGTCGGCTGAATGCCCACG
GTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCGCTGCCCTAAGGCCCCCGCGCG
CCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCGCCCCTCCCCCGGGGCTTCGCC
CCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAGGAGCTCTGGACTGGGCGCGCC
GCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCCCGGGAGCCGGCTGCGATGGAC

...

So I lost the other information provided by the header.

Is there any option to keep this information?

Here is part of my code, with my options:


 my $seq_ref=\@seq;
 my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM', 'quiet' => 1,
		'output' => 'FASTA');
 my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
 my $aln = $factory->align($seq_ref);


Thank you.


From jason at bioperl.org  Wed Apr  9 16:55:13 2008
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 9 Apr 2008 13:55:13 -0700
Subject: [Bioperl-l] question about clustalw module.
In-Reply-To: <1207773950.483.13.camel@kiss-laptop>
References: <1207773950.483.13.camel@kiss-laptop>
Message-ID: <C126E560-1A36-461E-ADAD-774446B9DB9E@bioperl.org>

The Clustal alignment format does not allow for the description. If
you want to preserve it, you'll have to add it back: make a hash
indexed by sequence ID that stores the description, then, when you get
your alignment back, update the description field of each sequence
before writing it out with AlignIO.
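
A minimal sketch of that bookkeeping, assuming BioPerl is installed. To stay self-contained it does not run clustalw: a hand-built Bio::SimpleAlign with toy sequences stands in for the object returned by $factory->align, and the IDs, descriptions, and residues are invented for illustration.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::LocatableSeq;
use Bio::SimpleAlign;

# Toy input sequences (stand-ins for the records read from the FASTA file).
my @seq = (
    Bio::Seq->new( -display_id => 'seqA',
                   -desc       => 'first toy sequence',
                   -seq        => 'ACGTACGT' ),
    Bio::Seq->new( -display_id => 'seqB',
                   -desc       => 'second toy sequence',
                   -seq        => 'ACGTCGT' ),
);

# Step 1: remember each description, keyed by sequence ID, before aligning.
my %desc_for = map { ( $_->display_id => $_->desc ) } @seq;

# Step 2: stand-in for the alignment that would come back from
# $factory->align(\@seq); the aligned sequences carry IDs but no descriptions.
my $aln = Bio::SimpleAlign->new();
$aln->add_seq( Bio::LocatableSeq->new( -display_id => 'seqA',
                                       -seq        => 'ACGTACGT',
                                       -start      => 1,
                                       -end        => 8 ) );
$aln->add_seq( Bio::LocatableSeq->new( -display_id => 'seqB',
                                       -seq        => 'ACGT-CGT',
                                       -start      => 1,
                                       -end        => 7 ) );

# Step 3: put the descriptions back onto the aligned sequences.
for my $aligned ( $aln->each_seq ) {
    my $id = $aligned->display_id;
    $aligned->desc( $desc_for{$id} ) if defined $desc_for{$id};
}

print $_->display_id, "\t", $_->desc, "\n" for $aln->each_seq;
```

The lookup only works if the IDs survive the alignment step unchanged, which is worth checking on real data before relying on it.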

-jason


From lamq at usal.es  Thu Apr 10 11:52:24 2008
From: lamq at usal.es (Luis A. M. Quintales)
Date: Thu, 10 Apr 2008 17:52:24 +0200
Subject: [Bioperl-l] xyplot glyph problem with previous aggregation
Message-ID: <47FE37B8.9090404@usal.es>

I am not able to add xyplot glyphs to a panel because I am having
problems with the aggregation.

Using this GFF file:

##sequence-region chr1 1 5578650
chr1  atfreq  atpc    1  50   58.8000   .  .  atpc 1
chr1  atfreq  atpc   51 100   58.4000   .  .  atpc 1
chr1  atfreq  atpc  101 150   57.6000   .  .  atpc 1
chr1  atfreq  atpc  151 200   57.8000   .  .  atpc 1
. . .


And this source code for preparing the aggregated features necessary for
the xyplot glyph:

my $filin  = $ARGV[0];
my $db = Bio::DB::GFF->new( -dsn => $filin,
                            -adaptor => 'memory',
                            -aggregator => 'at{atpc:atfreq}'
                           );
my $segment  = $db->segment('chr1');
my @features1 = $db->features('atpc');
print "$#features1 \n";
my @features2 = $segment->features('atpc');
print "$#features2 \n";
my @features3 = $db->features('at');
print "$#features3 \n";
my @features4 = $segment->features('at');
print "$#features4 \n";

I obtain:

111572
111572
0
0

What am I doing wrong with the aggregator?

Many thanks.


From lincoln.stein at gmail.com  Thu Apr 10 13:55:06 2008
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 10 Apr 2008 13:55:06 -0400
Subject: [Bioperl-l] xyplot glyph problem with previous aggregation
In-Reply-To: <47FE37B8.9090404@usal.es>
References: <47FE37B8.9090404@usal.es>
Message-ID: <6dce9a0b0804101055w65e22abfgaa4f155751fef40f@mail.gmail.com>

Hi Luis,

When you aggregate the atpc 1 features together, you end up with one
feature. Thus @features3 is an array of size 1. The $# operator returns the
index of the last element, which is 0. If @features3 were empty, $#features3
would return -1.
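
The difference between the last index and the element count can be seen in a few lines of plain Perl. This is a small illustration with invented arrays, not the GFF data from the question.

```perl
use strict;
use warnings;

# Three toy arrays: empty, one element (like the single aggregate), several.
my %case = (
    empty => [],
    one   => ['aggregated feature'],
    many  => [ 1 .. 5 ],
);

for my $name (qw(empty one many)) {
    my @features   = @{ $case{$name} };
    my $last_index = $#features;          # index of the last element
    my $count      = scalar @features;    # number of elements
    print "$name: last index $last_index, count $count\n";
}
# prints:
# empty: last index -1, count 0
# one: last index 0, count 1
# many: last index 4, count 5
```

Printing scalar(@features3) instead of $#features3 would therefore show 1 for the aggregated result.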

Lincoln



-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From adsj at novozymes.com  Fri Apr 11 04:53:23 2008
From: adsj at novozymes.com (Adam Sjøgren)
Date: Fri, 11 Apr 2008 10:53:23 +0200
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
Message-ID: <87d4owixh8.fsf@topper.koldfront.dk>

  Hi.

I am trying to make Bio::SeqIO return objects of my own type (a small
extension of Bio::Seq::RichSeq), by setting -seqfactory. I am having a
little trouble creating the correct object to pass with -seqfactory:

Following the example given in SYNOPSIS of Bio::Factory::SequenceFactoryI,
I get this error:

 $ perl -e '
 >            use Bio::Seq::SeqFactory;
 >            my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq');
 > 
 >            my $seq = $seqbuilder->create(-seq => 'ACTGAT',
 >                                          -display_id => 'exampleseq');
 > 
 >            print "seq is a ", ref($seq), "\n";
 > '

 ------------- EXCEPTION: Bio::Root::Exception -------------
 MSG: Can't locate type.pm in @INC (@INC contains: /z/bio/biotools/bioinfperlmodules/ /z/bio/adm/modules /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl .) at (eval 13) line 3.
 : Unrecognized Sequence type for SeqFactory 'type'
 STACK: Error::throw
 STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:357
 STACK: Bio::Seq::SeqFactory::type /usr/share/perl/5.8/Bio/Seq/SeqFactory.pm:134
 STACK: Bio::Seq::SeqFactory::new /usr/share/perl/5.8/Bio/Seq/SeqFactory.pm:93
 STACK: -e:3
 -----------------------------------------------------------
 $ 

If I go "Bio::Seq::SeqFactory('Bio::PrimarySeq'=>1)" instead, for
instance, it seems to work:

 $ perl -e '
 >            use Bio::Seq::SeqFactory;
 >            my $seqbuilder = Bio::Seq::SeqFactory->new('Bio::PrimarySeq'=>1);
 > 
 >            my $seq = $seqbuilder->create(-seq => 'ACTGAT',
 >                                          -display_id => 'exampleseq');
 > 
 >            print "seq is a ", ref($seq), "\n";
 > '
 seq is a Bio::PrimarySeq
 $ 

I was about to write a patch for the pod, when I realized that I'd
better start by asking: Is this a buglet in the pod or the code?

  Best regards,

    Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com

From hlapp at gmx.net  Fri Apr 11 11:35:54 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 11 Apr 2008 11:35:54 -0400
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
In-Reply-To: <87d4owixh8.fsf@topper.koldfront.dk>
References: <87d4owixh8.fsf@topper.koldfront.dk>
Message-ID: <0037240B-F469-4388-972A-324101B11621@gmx.net>


On Apr 11, 2008, at 4:53 AM, Adam Sjøgren wrote:
>  $ perl -e '
>>            use Bio::Seq::SeqFactory;
>>            my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq');


You need to prefix the argument with a dash: '-type', not 'type'.  
Otherwise, it assumes that the class you want instantiated is 'type.pm'.
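
With that change, the SYNOPSIS example from the question would read roughly as follows. This is the same code as in the original post with only the argument name changed; it assumes BioPerl is installed.

```perl
use strict;
use warnings;
use Bio::Seq::SeqFactory;

# Note the leading dash: -type, not type.
my $seqbuilder = Bio::Seq::SeqFactory->new( -type => 'Bio::PrimarySeq' );

my $seq = $seqbuilder->create( -seq        => 'ACTGAT',
                               -display_id => 'exampleseq' );

print "seq is a ", ref($seq), "\n";
```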

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From 1zoujing at 163.com  Thu Apr 10 01:08:52 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 9 Apr 2008 22:08:52 -0700 (PDT)
Subject: [Bioperl-l]  Bio::ASN1::EntrezGene parse so slowly?
Message-ID: <16602210.post@talk.nabble.com>


  I want to parse the file "gene_info" from NCBI. The format of Gene at NCBI
is ASN.1, right? So I used Bio::ASN1::EntrezGene, but it either did not work
properly or was too slow. The file is about 500 MB.
  The code is as follows:
  use Bio::ASN1::EntrezGene;
  my $parser = Bio::ASN1::EntrezGene->new('file' => $ARGV[0]);
  my $i = 0;
  while (my $result = $parser->next_seq) {
      last;    # something to do here; "last" is used only for this test
  }

  When it reaches the "while" loop, it keeps processing on and on and never
exits, even though I used "last" inside the loop.
   So I wonder whether it is too slow, whether the module is not suited to
this job, or whether I did something wrong?

  Thank you!
-- 
View this message in context: http://www.nabble.com/Bio%3A%3AASN1%3A%3AEntrezGene-parse-so-slowly--tp16602210p16602210.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From 1zoujing at 163.com  Thu Apr 10 02:17:41 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 9 Apr 2008 23:17:41 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl Sus_scrofa.ags"
Message-ID: <16602770.post@talk.nabble.com>


   I am a newcomer to BioPerl. When I ran
"parse_entrez_gene_example.pl Sus_scrofa.ags", I got this error
message:
     Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
  
   But Sus_scrofa.ags was downloaded from NCBI in ASN.1 format, so it
should be the same as the Homo_sapiens file in the example. There should be
no error, since the code is Mingyi's example.
   I wonder why this happens. Should I change something about the file?
    


From 1zoujing at 163.com  Thu Apr 10 02:56:52 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 9 Apr 2008 23:56:52 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <16602770.post@talk.nabble.com>
References: <16602770.post@talk.nabble.com>
Message-ID: <16603225.post@talk.nabble.com>


I searched the web and found the answer; I quote it here:
   The error was thrown by my Bio::ASN1::EntrezGene module because it 
expects a text file, while you fed it a binary file.  To use a
gzipped ASN binary file from NCBI, download NCBI's gene2xml 
(ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml), 
then use this syntax to run my parser on the binary files: 

my $parser = Bio::ASN1::EntrezGene->new('file' => "gene2xml -i Homo_sapiens.ags.gz -c -x -b | ");
# Homo_sapiens.ags.gz is the gzipped binary file downloaded directly from NCBI

The same syntax should be used when you're using SeqIO (thus SeqIO::entrezgene). 
Mingyi
 
      


From 1zoujing at 163.com  Thu Apr 10 03:10:26 2008
From: 1zoujing at 163.com (zoujing)
Date: Thu, 10 Apr 2008 00:10:26 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
Message-ID: <16603225.post@talk.nabble.com>


I searched the web and found the answer; I quote it here:
   The error was thrown by my Bio::ASN1::EntrezGene module because it 
expects a text file, while you fed it a binary file.  To use a
gzipped ASN binary file from NCBI, download NCBI's gene2xml 
(ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml), 
then use this syntax to run my parser on the binary files: 

my $parser = Bio::ASN1::EntrezGene->new('file' => "gene2xml -i Homo_sapiens.ags.gz -c -x -b | ");
# Homo_sapiens.ags.gz is the gzipped binary file downloaded directly from NCBI

The same syntax should be used when you're using SeqIO (thus SeqIO::entrezgene). 
Mingyi

   But there is still one thing: I want to parse "gene_info.gz" from NCBI
Gene (ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz).
   It doesn't work. Does that mean "gene_info.gz" (tab-delimited, one line per
GeneID, with the column header as the first line of the file) is not the right
format for Bio::ASN1::EntrezGene?
 
      


From stefan.kirov at bms.com  Fri Apr 11 15:59:29 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Fri, 11 Apr 2008 15:59:29 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <16602770.post@talk.nabble.com>
References: <16602770.post@talk.nabble.com>
Message-ID: <Pine.WNT.4.64.0804111557210.2384@A161887.one.ads.bms.com>

AGS is a binary ASN.1 format and WILL NOT be parsed! You have to use 
gene2xml (weird, but this is NCBI) with these flags: -c -x -b -i. This 
will spit out text ASN.1, which can be parsed.
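
One way to catch this mistake early is to check whether a file looks like text before handing it to the parser. The sketch below uses Perl's built-in -T file test, which is only a heuristic; looks_like_text is a hypothetical helper name and the two temporary files are toy stand-ins for a text file and a gzipped one.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical guard: returns 1 if the file exists, is non-empty, and
# Perl's -T heuristic classifies it as text; 0 otherwise.
sub looks_like_text {
    my ($path) = @_;
    return 0 unless ( -e $path ) && ( -s $path );
    my $is_text = -T $path;
    return $is_text ? 1 : 0;
}

# Toy text file.
my ( $tfh, $text_file ) = tempfile( UNLINK => 1 );
print {$tfh} "toy text record\nsecond line\n";
close $tfh;

# Toy binary file: starts with the gzip magic bytes and contains NUL bytes.
my ( $bfh, $bin_file ) = tempfile( UNLINK => 1 );
binmode $bfh;
print {$bfh} "\x1f\x8b\x08\x00\x00\x00\x00\x00";
close $bfh;

for my $file ( $text_file, $bin_file ) {
    if ( looks_like_text($file) ) {
        print "text: can be given to the parser directly\n";
    }
    else {
        print "binary or empty: convert with gene2xml first\n";
    }
}
```

A check like this says nothing about whether the text is actually Entrez Gene ASN.1; it only rules out the binary case described above.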
Stefan


From stefan.kirov at bms.com  Fri Apr 11 16:01:30 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Fri, 11 Apr 2008 16:01:30 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <16603225.post@talk.nabble.com>
References: <16603225.post@talk.nabble.com>
Message-ID: <Pine.WNT.4.64.0804111600310.2384@A161887.one.ads.bms.com>

It is not. If you use this file, why would you need a parser for it 
anyway? Just split on \t or read with OpenOffice or equiv.
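
A minimal sketch of the split-on-tab approach, assuming only what was said about the file: it is tab-delimited, has one record per line, and names its columns on the first line. The column names and rows below are invented toy data, and read_tab_table is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: read tab-delimited text whose first line names the
# columns, and return one hash reference per data line.
sub read_tab_table {
    my ($fh) = @_;
    my $header = <$fh>;
    return () unless defined $header;
    chomp $header;
    $header =~ s/^#\s*//;    # drop a leading comment marker, if any
    my @cols = split /\t/, $header;

    my @rows;
    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line eq '';
        my @fields = split /\t/, $line, -1;    # -1 keeps trailing empty fields
        my %row;
        @row{@cols} = @fields;
        push @rows, \%row;
    }
    return @rows;
}

# Toy data in the assumed layout (not real gene_info content).
my $toy = join "\n",
    join( "\t", '#tax_id', 'GeneID', 'Symbol' ),
    join( "\t", 9823,      1001,     'geneA' ),
    join( "\t", 9823,      1002,     'geneB' ),
    '';

open my $fh, '<', \$toy or die "cannot open in-memory file: $!";
my @rows = read_tab_table($fh);
close $fh;

print "$_->{GeneID}\t$_->{Symbol}\n" for @rows;
```

For the real file, the handle could come from a pipe such as `gzip -dc gene_info.gz` instead of an in-memory string, and for a file that size it would be better to process each line inside the loop than to collect every row in memory.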
Stefan

On Thu, 10 Apr 2008, zoujing wrote:

>
> Searched the web and found the answer; quoting it here:
>   The error was thrown by my Bio::ASN1::EntrezGene module because it
> expects a text file, while you fed it with a binary file.  To use
> gzipped ASN binary file from NCBI, download the NCBI gene2xml
> (ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml),
> then use this syntax to run my parser on the binary files:
>
> # Homo_sapiens.ags.gz is the gzipped binary file directly downloaded from NCBI
> my $parser = Bio::ASN1::EntrezGene->new('file' => "gene2xml -i Homo_sapiens.ags.gz -c -x -b | ");
>
> Same syntax should be used when you're using SeqIO (thus SeqIO::entrezgene).
> Mingyi
>
>   But there is still one thing: I want to parse "gene_info.gz" from NCBI
> Gene, and it doesn't work. Does that mean "gene_info.gz" (tab-delimited, one
> line per GeneID, with the column header as the first line of the file) is
> not the right format for Bio::ASN1::EntrezGene?
>
>
>
> zoujing wrote:
>>
>>    I am a green hand in BioPerl. When I run
>> "parse_entrez_gene_example.pl Sus_scrofa.ags", it fails with this error
>> message:
>>      Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>>
>>    But Sus_scrofa.ags was downloaded from NCBI in ASN.1 format, so it
>> should be the same as Homo_sapiens in the example. There should be no
>> error, since the code is the example from Mingyi.
>>    Why does this happen, and should I change something in the file?
>>
>>
>
> -- 
> View this message in context: http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16603225.html
> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From asjo at koldfront.dk  Fri Apr 11 15:39:59 2008
From: asjo at koldfront.dk (Adam Sjøgren)
Date: Fri, 11 Apr 2008 21:39:59 +0200
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
In-Reply-To: <0037240B-F469-4388-972A-324101B11621@gmx.net> (Hilmar Lapp's
	message of "Fri, 11 Apr 2008 11:35:54 -0400")
References: <87d4owixh8.fsf@topper.koldfront.dk>
	<0037240B-F469-4388-972A-324101B11621@gmx.net>
Message-ID: <877if4i3jk.fsf@topper.koldfront.dk>

On Fri, 11 Apr 2008 11:35:54 -0400, Hilmar wrote:

> On Apr 11, 2008, at 4:53 AM, Adam Sjøgren wrote:

>>> my $seqbuilder = Bio::Seq::SeqFactory->new('type' =>
>>> 'Bio::PrimarySeq');

> You need to prefix the argument with a dash: '-type', not 'type'. 
> Otherwise, it assumes that the class you want instantiated is
> 'type.pm'.
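The behaviour described above can be illustrated with a toy argument
handler. This is a simplified sketch, not BioPerl's actual code: the
function name rearrange_sketch is made up, and the assumption is only
that a leading dash on the first argument switches from positional to
named handling.

```perl
use strict;
use warnings;

# Toy stand-in for named-argument handling; NOT BioPerl's actual code.
# If the first argument starts with a dash, the list is read as
# -name => value pairs; otherwise it is returned as positional values.
sub rearrange_sketch {
    my ($order, @args) = @_;
    return @args unless @args && defined $args[0] && $args[0] =~ /^-/;
    my %named;
    while (@args) {
        my $key = uc shift @args;   # '-type' becomes '-TYPE'
        $key =~ s/^-//;             # then 'TYPE'
        $named{$key} = shift @args;
    }
    return map { $named{$_} } @$order;
}

my ($with_dash)    = rearrange_sketch(['TYPE'], '-type' => 'Bio::PrimarySeq');
my ($without_dash) = rearrange_sketch(['TYPE'], 'type'  => 'Bio::PrimarySeq');

print "with dash:    $with_dash\n";      # Bio::PrimarySeq
print "without dash: $without_dash\n";   # type
```

Without the dash, the first positional value is the literal string
'type', which is then taken to be the class name.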

I guess that means I should submit a patch for the SYNOPSIS. Attached.


   Thanks,

    Adam


Index: Bio/Factory/SequenceFactoryI.pm
===================================================================
--- Bio/Factory/SequenceFactoryI.pm	(revision 14654)
+++ Bio/Factory/SequenceFactoryI.pm	(working copy)
@@ -20,7 +20,7 @@
 # get a Bio::Factory::SequenceFactoryI object like
 
     use Bio::Seq::SeqFactory;
-    my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq');
+    my $seqbuilder = Bio::Seq::SeqFactory->new('-type' => 'Bio::PrimarySeq');
 
     my $seq = $seqbuilder->create(-seq => 'ACTGAT',
 				  -display_id => 'exampleseq');

-- 
 "Well, I'm a moon around you"                                Adam Sjøgren
                                                         asjo at koldfront.dk


From bamboowarrior at gmail.com  Fri Apr 11 19:10:35 2008
From: bamboowarrior at gmail.com (Arkady)
Date: Fri, 11 Apr 2008 18:10:35 -0500
Subject: [Bioperl-l] Nucleotide Links in Gene DB (GenBank)
Message-ID: <91656c3f0804111610r24c8fa5es5bcb56b7a59e0208@mail.gmail.com>

Hi everyone, I'm a bioperl n00b. Actually, kind of a genbank n00b,
too, as I'm from a CS background and just started bio things last
June.

I'm trying to set up an analysis pipeline of primate protein CDSs (the
nucleotide seqs). I've written a script which does a pretty decent job
of downloading these from GenBank--but it's inconsistent, because a
lot of sequences in nucleotide are 'predicted' and named LOCthisorthat
instead of by gene name.

So what I was thinking was this (assume ANKRD43 is the gene for this example):

1. Search 'gene' database for ANKRD43 AND (PRI*[ORGN])
On NCBI, there's an option to show all nucleotide links. How do I get
a list of those in bioperl? Can bioperl even search 'gene', or just
'nucleotide'?

2. Search 'nucleotide' for the referenced items from #1, and also for
ANKRD43[TITL] AND (PRI*[ORGN]), save CDSes.

3. BLAST mRNA for one of those CDSes, see if we pick up any other matches.

4. BLAT other primates for CDSes, see if we find anything not in GenBank.


On the other hand, I always get the feeling I'm doing things the hard
way--especially here, with #1 and #2. Is there a much more obvious,
simple way to do this?
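For reference, steps 1 and 2 map onto two NCBI E-utilities services,
ESearch and ELink. The sketch below only builds the request URLs and
makes no network call; the helper names are made up, the id 12345 is a
placeholder for a Gene id returned by the search, and whether the
'nucleotide' database name and the PRI*[ORGN] term behave as wanted
should be checked against the E-utilities documentation.

```perl
use strict;
use warnings;

my $BASE = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils';

# Percent-encode everything except unreserved URL characters.
sub uri_escape_sketch {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Build an E-utilities URL from a tool name and query parameters.
sub build_url {
    my ($tool, %param) = @_;
    my $query = join '&',
        map { "$_=" . uri_escape_sketch($param{$_}) } sort keys %param;
    return "$BASE/$tool.fcgi?$query";
}

my $gene = 'ANKRD43';

# Step 1: search the 'gene' database.
my $search_url = build_url('esearch',
    db => 'gene', term => "$gene AND (PRI*[ORGN])");

# Step 1b: ask for nucleotide records linked to a Gene id
# (12345 is a placeholder, not a real result).
my $link_url = build_url('elink',
    dbfrom => 'gene', db => 'nucleotide', id => '12345');

print "$search_url\n$link_url\n";
```

The ids returned by the link request could then be fetched from the
nucleotide database and filtered for CDS features.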

Thanks, folks.


Cheers,
John Woods

Institute for Cellular and Molecular Biology
The University of Texas at Austin

From hlapp at gmx.net  Fri Apr 11 19:19:44 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 11 Apr 2008 19:19:44 -0400
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
In-Reply-To: <877if4i3jk.fsf@topper.koldfront.dk>
References: <87d4owixh8.fsf@topper.koldfront.dk>
	<0037240B-F469-4388-972A-324101B11621@gmx.net>
	<877if4i3jk.fsf@topper.koldfront.dk>
Message-ID: <B4B3CAD0-C346-470C-98D7-D6CBFE116109@gmx.net>

Thanks, applied. -hilmar

On Apr 11, 2008, at 3:39 PM, Adam Sjøgren wrote:
> On Fri, 11 Apr 2008 11:35:54 -0400, Hilmar wrote:
>
>> On Apr 11, 2008, at 4:53 AM, Adam Sjøgren wrote:
>
>>>> my $seqbuilder = Bio::Seq::SeqFactory->new('type' =>
>>>> 'Bio::PrimarySeq');
>
>> You need to prefix the argument with a dash: '-type', not 'type'.
>> Otherwise, it assumes that the class you want instantiated is
>> 'type.pm'.
>
> I guess that means I should submit a patch for the SYNOPSIS. Attached.
>
>
>    Thanks,
>
>     Adam
>
>
> Index: Bio/Factory/SequenceFactoryI.pm
> ===================================================================
> --- Bio/Factory/SequenceFactoryI.pm	(revision 14654)
> +++ Bio/Factory/SequenceFactoryI.pm	(working copy)
> @@ -20,7 +20,7 @@
>  # get a Bio::Factory::SequenceFactoryI object like
>
>      use Bio::Seq::SeqFactory;
> -    my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq');
> +    my $seqbuilder = Bio::Seq::SeqFactory->new('-type' => 'Bio::PrimarySeq');
>
>      my $seq = $seqbuilder->create(-seq => 'ACTGAT',
>  				  -display_id => 'exampleseq');
>
> -- 
>  "Well, I'm a moon around you"                                Adam Sjøgren
>                                                          asjo at koldfront.dk
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From mmokrejs at ribosome.natur.cuni.cz  Fri Apr 11 21:32:14 2008
From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ)
Date: Sat, 12 Apr 2008 03:32:14 +0200
Subject: [Bioperl-l] [BioSQL-l] Loading sequences with novel NCBI
	taxon_id
In-Reply-To: <CE3675B2-2AFD-46AA-A348-16C9FEA51E0E@uiuc.edu>
References: <320fb6e00803130806w46148bacm54c3ead9a50b038f@mail.gmail.com>	<32EB5B0C-4CC8-4C33-9F41-5D4465B6AC48@gmx.net>	<320fb6e00803131613o20eae2b7y325814ef26d2738f@mail.gmail.com>	<CEA4F4E7-A66B-4C62-AE32-511E177BC485@gmx.net>	<93b45ca50803140648s5098a7d0sec621f448ef03040@mail.gmail.com>
	<CE3675B2-2AFD-46AA-A348-16C9FEA51E0E@uiuc.edu>
Message-ID: <4800111E.3030802@ribosome.natur.cuni.cz>

Chris Fields wrote:
> The counter to that perspective (using new sequences with old tax info) 
> would be to regularly update NCBI taxonomy, particularly in 
> circumstances prior to adding new sequences.  Hilmar mentioned that once 
> tax is loaded it doesn't take as long to update, so you could set up a 
> cron job to update regularly.
> 
> I remember someone mentioning weekly or monthly updates on the list 
> quite a while ago, but I'm unsure how often NCBI updates tax information 
> (i.e. with every release, monthly, weekly, etc).  I can see instances 
> popping up where you used the an up-to-date taxonomy but a new sequence 
> contains a tax ID not present.  I think bioperl-db handles these but I'm 
> not sure what other Bio* do.
> 

I spent some time benchmarking this and inspecting the MySQL log files.
Below is what the current load_ncbi_taxonomy.pl script, with a minor
modification to show timestamps, does first on an initial import into
MySQL and then on an update of the database from exactly the same dataset
(it still has to walk through all the data):

$ ./load_ncbi_taxonomy.pl --dbname=biosqldb --driver=mysql --host=127.0.01 \
  --port=3306 --directory=/home/mmokrejs/bioinformatics/databases/ncbitax/dump \
  --chunksize=0 --verbose=2 --mycnf=~/.my.cnf
Sat Apr 12 01:58:43 MEST 2008
Loading NCBI taxon database in /home/mmokrejs/bioinformatics/databases/ncbitax/dump:
       ... retrieving all taxon nodes in the database
Sat Apr 12 01:58:43 MEST 2008
       ... reading in taxon nodes from nodes.dmp
Sat Apr 12 01:58:58 MEST 2008
       ... insert / update / delete taxon nodes
                10000/421098 done (in 5 secs, 2000.0 rows/s)
                20000/421098 done (in 4 secs, 2500.0 rows/s)
...
                420000/421098 done (in 4 secs, 2500.0 rows/s)
Sat Apr 12 02:02:21 MEST 2008
       ... (committing nodes)
Sat Apr 12 02:02:21 MEST 2008
       ... rebuilding nested set left/right values
                10000 done (in 24 secs, 416.7 rows/s)
                20000 done (in 26 secs, 384.6 rows/s)
                30000 done (in 24 secs, 416.7 rows/s)
...
                420004 done (in 23 secs, 434.8 rows/s)
Sat Apr 12 02:19:25 MEST 2008
       ... reading in taxon names from names.dmp
Sat Apr 12 02:19:25 MEST 2008
       ... deleting old taxon names
Sat Apr 12 02:19:25 MEST 2008
       ... inserting new taxon names
                10000 done (in 8 secs, 1250.0 rows/s)
                20000 done (in 8 secs, 1250.0 rows/s)
...
                580000 done (in 5 secs, 2000.0 rows/s)
Sat Apr 12 02:24:48 MEST 2008
       ... cleaning up
Sat Apr 12 02:24:49 MEST 2008
Done.
$


I decided to re-import the same data to mimic future updates at least
roughly, although no record should be UPDATEd apart from the left and
right values being zapped with NULL. :((

$ ./load_ncbi_taxonomy.pl --dbname=biosqldb --driver=mysql --host=127.0.01 \
  --port=3306 --directory=/home/mmokrejs/bioinformatics/databases/ncbitax/dump \
  --chunksize=0 --verbose=2 --mycnf=~/.my.cnf
Sat Apr 12 02:35:20 MEST 2008
Loading NCBI taxon database in /home/mmokrejs/bioinformatics/databases/ncbitax/dump:
        ... retrieving all taxon nodes in the database
Sat Apr 12 02:35:26 MEST 2008
       ... reading in taxon nodes from nodes.dmp
Sat Apr 12 02:35:46 MEST 2008
       ... insert / update / delete taxon nodes
                10000/421098 done (in 0 secs, 10000.0 rows/s)
                20000/421098 done (in 0 secs, 10000.0 rows/s)
...
                410000/421098 done (in 0 secs, 10000.0 rows/s)
                420000/421098 done (in 0 secs, 10000.0 rows/s)
Sat Apr 12 02:35:55 MEST 2008
       ... (committing nodes)
Sat Apr 12 02:35:55 MEST 2008
       ... rebuilding nested set left/right values
                10000 done (in 9 secs, 1111.1 rows/s)
                20000 done (in 9 secs, 1111.1 rows/s)
...
                410004 done (in 8 secs, 1250.0 rows/s)
                420004 done (in 9 secs, 1111.1 rows/s)
Sat Apr 12 02:41:54 MEST 2008
       ... reading in taxon names from names.dmp
Sat Apr 12 02:41:54 MEST 2008
       ... deleting old taxon names
Sat Apr 12 02:41:55 MEST 2008
       ... inserting new taxon names
                10000 done (in 5 secs, 2000.0 rows/s)
                20000 done (in 5 secs, 2000.0 rows/s)
...
                570000 done (in 6 secs, 1666.7 rows/s)
                580000 done (in 5 secs, 2000.0 rows/s)
Sat Apr 12 02:47:27 MEST 2008
       ... cleaning up
Sat Apr 12 02:47:27 MEST 2008
Done.
$ ls -la /var/log/mysql/mysql.log 
-rw-rw---- 1 mysql mysql 483443314 Apr 12 03:15 /var/log/mysql/mysql.log
$

Pentium 4 M laptop, 1.8 GHz, 1 GB RAM, mysql-5.0.56 with SQL text
logging enabled (the slow variant that logs every SQL command as text,
as opposed to binary logging). The log was cleared before the tests.
I could provide some bits from the log or upload it somewhere
if anybody else would like to dig into the details.


I believe the recalculation step could be made faster. See what
happens:

                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '1' ORDER BY ncbi_taxon_id
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '10239' ORDER BY ncbi_taxon_id
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '12333' ORDER BY ncbi_taxon_id
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '12335' ORDER BY ncbi_taxon_id
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE left_value = '4'
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE right_value = '5'
                     31 Query       UPDATE taxon SET left_value = '4', right_value = '5' WHERE taxon_id = '12335'
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '12340' ORDER BY ncbi_taxon_id
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE left_value = '6'
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE right_value = '7'
                     31 Query       UPDATE taxon SET left_value = '6', right_value = '7' WHERE taxon_id = '12340'
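The log shows one SELECT per parent and three UPDATEs per node. One way
to cut that down would be to compute the nested-set numbering in memory
first and then write each row once. The sketch below is only an
illustration under that assumption: the function name is made up, the
toy tree reuses ids seen in the log with parent links guessed from the
query order, and it is not the script's actual algorithm.

```perl
use strict;
use warnings;

# Compute nested-set (left, right) values for a tree given as
# child_id => parent_id, using an iterative depth-first walk.
# Children are visited in ascending id order.
sub nested_set_values {
    my ($parent_of, $root) = @_;
    my %children;
    for my $id (keys %$parent_of) {
        next if $id == $root;    # the root may list itself as its parent
        push @{ $children{ $parent_of->{$id} } }, $id;
    }
    my (%left, %right, %sorted);
    my $counter = 0;
    $left{$root} = ++$counter;
    my @stack = ([$root, 0]);    # [node, index of next child to visit]
    while (@stack) {
        my $frame = $stack[-1];
        my ($node, $i) = @$frame;
        $sorted{$node} ||= [ sort { $a <=> $b } @{ $children{$node} || [] } ];
        if ($i < @{ $sorted{$node} }) {
            $frame->[1]++;
            my $child = $sorted{$node}[$i];
            $left{$child} = ++$counter;
            push @stack, [$child, 0];
        }
        else {
            $right{$node} = ++$counter;
            pop @stack;
        }
    }
    return (\%left, \%right);
}

# Toy tree (parent links are an assumption based on the log order).
my %parent_of = (
    1     => 1,
    10239 => 1,
    12333 => 10239,
    12335 => 12333,
    12340 => 12333,
);
my ($left, $right) = nested_set_values(\%parent_of, 1);
printf "%6d  left=%d right=%d\n", $_, $left->{$_}, $right->{$_}
    for sort { $left->{$a} <=> $left->{$b} } keys %$left;
```

With the values in hand, a single UPDATE per taxon_id would be enough,
provided the whole tree fits in memory.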


The columns left_value and right_value are NULL when the table is
created, so there is no need to write NULL into them again. Avoiding
that would mean writing a wrapper function which mimics update() but
first does a 'SELECT * FROM', compares the stored values with those to
be written, and includes in the final UPDATE statement only the columns
whose values have changed. We use such a smart wrapper in our Python
code.
;-)
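A minimal Perl sketch of such a wrapper, under stated assumptions: the
function name changed_columns_update is made up, the current row is
assumed to have been fetched already as a hash reference (for example
with DBI's selectrow_hashref), and the sketch only builds the statement
and bind values rather than talking to a database.

```perl
use strict;
use warnings;

# Build an UPDATE that touches only the columns whose values differ.
# Returns an empty list when nothing changed. undef stands for SQL NULL.
sub changed_columns_update {
    my ($table, $key_col, $key_val, $current, $wanted) = @_;
    my (@set, @bind);
    for my $col (sort keys %$wanted) {
        my ($old, $new) = ($current->{$col}, $wanted->{$col});
        next if !defined $old && !defined $new;                 # NULL stays NULL
        next if defined $old && defined $new && $old eq $new;   # unchanged
        push @set,  "$col = ?";
        push @bind, $new;
    }
    return unless @set;
    my $sql = "UPDATE $table SET " . join(', ', @set) . " WHERE $key_col = ?";
    return ($sql, @bind, $key_val);
}

# Row already NULL/NULL and asked to become NULL/NULL: no statement at all.
my @noop = changed_columns_update('taxon', 'taxon_id', 12335,
    { left_value => undef, right_value => undef },
    { left_value => undef, right_value => undef });

# Only right_value differs, so only that column is written.
my ($sql, @bind) = changed_columns_update('taxon', 'taxon_id', 12335,
    { left_value => 4, right_value => 5 },
    { left_value => 4, right_value => 7 });

print "no-op statements: ", scalar(@noop), "\n";
print "$sql  [@bind]\n";
# With DBI this could be run as: $dbh->do($sql, undef, @bind);
```

The extra SELECT costs a round trip of its own, so whether this is a
net win would have to be measured.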

When the columns for left and right are to be made NULL during
update of an existing database, I think it would be much faster
to drop the columns and re-create them again with NULL values.


I think it would be worth investigating whether the taxon and
taxon_name tables could be created empty as MyISAM tables and converted
to InnoDB only after all the imports and updates are done. One would
probably have to think a bit more about the foreign keys, but it may be
that they are not even lost during the conversion back and forth.

Actually, this is easy to check. Dump your current taxon and taxon_name
tables (maybe even without the row data, using mysqldump --no-data), run
'ALTER TABLE taxon ... type=MyISAM'
followed by
'ALTER TABLE taxon ... type=InnoDB'
dump the database structure again and compare it by diff with
the original.

But, time for sleep here.
Martin


From sdavis2 at mail.nih.gov  Fri Apr 11 23:50:44 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 11 Apr 2008 23:50:44 -0400
Subject: [Bioperl-l] Bio::ASN1::EntrezGene parse so slowly?
In-Reply-To: <16602210.post@talk.nabble.com>
References: <16602210.post@talk.nabble.com>
Message-ID: <264855a00804112050gf785c2ei66d9c7463597eccd@mail.gmail.com>

gene_info is a tab-delimited text file, if I recall correctly.  Have
you looked at it?  If it is, you should be able to parse it in a few
seconds with just a couple lines of code.
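Something along these lines, as a minimal sketch: the function name
read_gene_info is made up, the sample rows are invented, and the column
positions (tax_id, GeneID, Symbol as the first three fields) are an
assumption to be checked against the header line of the real file.

```perl
use strict;
use warnings;

# Read tab-delimited lines and return [tax_id, GeneID, Symbol] triples.
# Column positions are an assumption; check them against the header line.
sub read_gene_info {
    my ($fh, $want_tax_id) = @_;
    my @rows;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^#/ || $line eq '';    # skip header and blank lines
        my @col = split /\t/, $line, -1;
        next if defined $want_tax_id && $col[0] ne $want_tax_id;
        push @rows, [ @col[0 .. 2] ];
    }
    return @rows;
}

# Made-up sample data. For the real file, open the decompressed
# gene_info instead, e.g.
#   open my $fh, '<', 'gene_info' or die "gene_info: $!";
my $sample = join '',
    "#Format: tax_id GeneID Symbol description\n",
    "9823\t1001\tGENEA\texample gene A\n",
    "9606\t1002\tGENEB\texample gene B\n",
    "9823\t1003\tGENEC\texample gene C\n";

open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my @pig = read_gene_info($fh, '9823');    # 9823 = Sus scrofa
close $fh;

print join("\t", @$_), "\n" for @pig;
```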

Sean


On Thu, Apr 10, 2008 at 1:08 AM, zoujing <1zoujing at 163.com> wrote:
>
>   I want to parse the file "gene_info" from NCBI. The format of Gene in NCBI is
>  ASN.1, right? So I used Bio::ASN1::EntrezGene. But it didn't work
>  properly, or was too slow. The file is about 500M.
>   The code is as follows:
>   use Bio::ASN1::EntrezGene;
>   my $parser = Bio::ASN1::EntrezGene->new('file' => $ARGV[0]);
>   my $i = 0;
>   while(my $result = $parser->next_seq)
>   {
>     last; # something to do here; 'last' is used for the test
>   }
>
>   When it reaches the "while" part, it keeps processing and never
>  returns, even though I used "last" inside the loop.
>    So I wonder whether it is too slow, whether the module is not fit for
>  this job, or whether I did something wrong?
>
>   Thank you!
>  --
>  View this message in context: http://www.nabble.com/Bio%3A%3AASN1%3A%3AEntrezGene-parse-so-slowly--tp16602210p16602210.html
>  Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.

From david at burt7259.freeserve.co.uk  Sat Apr 12 13:01:57 2008
From: david at burt7259.freeserve.co.uk (David Burt)
Date: Sat, 12 Apr 2008 18:01:57 +0100
Subject: [Bioperl-l] bioperl-db
Message-ID: <BFCB174E-B59E-4249-BDF8-4B0F2E2273C9@burt7259.freeserve.co.uk>

Hi Hilmar,

Hope you can help - I am using bioperl-db to create a BioSQL database

I have used scripts load_seqdatabase.pl and load_ontology.pl to  
install human swissprot entries, gene ontology, sequence ontology and  
now want to load interpro

Here's the command line I have tried

perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser root \
--dbpass chicken --driver mysql \
--namespace "InterPro" --format InterPro interpro.xml

But I get this message

Can't call method "identifier" on an undefined value at
/cygdrive/c/Bioinformatics/Ensembl/src/bioperl-live/Bio/Ontology/SimpleOntologyEngine.pm line 395

Any ideas?

Dave

PS: here's the top of the interpro.xml file

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE interprodb SYSTEM "interpro.dtd">


<interprodb>
     <release>
       <dbinfo dbname="PANTHER" version="6.1" entry_count="30128" file_date="04-OCT-2006 00:00:00" />
       <dbinfo dbname="PFAM" version="21.0" entry_count="8957" file_date="22-NOV-2006 00:00:00" />
       <dbinfo dbname="PIRSF" version="2.70" entry_count="2877" file_date="12-JUN-2007 00:00:00" />
       <dbinfo dbname="PRINTS" version="38.0" entry_count="1900" file_date="22-SEP-2005 00:00:00" />
       <dbinfo dbname="PRODOM" version="2005.1" entry_count="1522" file_date="23-APR-2004 00:00:00" />
       <dbinfo dbname="PROSITE" version="20.0" entry_count="2006" file_date="14-NOV-2006 00:00:00" />
       <dbinfo dbname="SMART" version="5.1" entry_count="724" file_date="27-JUL-2007 00:00:00" />
       <dbinfo dbname="TIGRFAMs" version="7.0" entry_count="3423" file_date="28-SEP-2007 00:00:00" />
       <dbinfo dbname="GENE3D" version="3.0.0" entry_count="2147" file_date="11-SEP-2006 00:00:00" />
       <dbinfo dbname="SSF" version="1.69" entry_count="1538" file_date="30-NOV-2006 00:00:00" />
       <dbinfo dbname="SWISSPROT" version="55.1" entry_count="359942" file_date="18-MAR-2008 00:00:00" />
       <dbinfo dbname="TREMBL" version="38.1" entry_count="5443281" file_date="18-MAR-2008 00:00:00" />
       <dbinfo dbname="INTERPRO" version="17.0" entry_count="16175" file_date="19-MAR-2008 00:00:00" />
       <dbinfo dbname="GO" version="N/A" entry_count="23937" file_date="27-MAR-2007 00:00:00" />
       <dbinfo dbname="MEROPS" version="7.8" entry_count="2831" file_date="12-JUL-2007 16:56:17" />
     </release>
   <interpro id="IPR000001" type="Domain" short_name="Kringle" protein_count="352">
     <name>Kringle</name>
     <abstract>

  
From hlapp at gmx.net  Sat Apr 12 14:10:44 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 14:10:44 -0400
Subject: [Bioperl-l] personal vs list email
Message-ID: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>

I'm not sure why but I have received several Bioperl or BioSQL- 
related email inquiries directed to me *personally* over the past few  
weeks.

I have been responding as I get to them, but I feel that I am doing  
both the senders and this community a poor service, because sometimes  
someone else on the list could have responded much faster, and when I  
respond, others on the list who happen to be interested in the same  
question don't get to see the answer.

So from now on as a policy I will redirect *every* email sent to me  
personally and that asks a question related to one of the projects to  
the respective mailing list. If you don't want this, please  
conspicuously say so at the top of your email, and in that case if  
you do ask a project-related question be prepared to wait and to  
possibly needing to follow up.

As an aside, it's a pretty safe assumption to make that all other  
core developers, and quite possibly *all* developers are following a  
similar policy, whether expressly or not.

Isn't this somewhere in the FAQ too?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Sat Apr 12 14:16:13 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 14:16:13 -0400
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
Message-ID: <C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>

Hi Burt,

can you try format interprosax instead of interpro? That variant is  
also much more graceful regarding required space.

	-hilmar

On Apr 12, 2008, at 1:01 PM, David Burt wrote:

> Hi Hilmar,
>
> Hope you can help - I am using bioperl-db to create a BioSQL database
>
> I have used scripts load_seqdatabase.pl and load_ontology.pl to  
> install human swissprot entries, gene ontology, sequence ontology  
> and now want to load interpro
>
> Here's the command line I have tried
>
> perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser root \
> --dbpass chicken --driver mysql \
> --namespace "InterPro" --format InterPro interpro.xml
>
> But I get this message
>
> Can't call method "identifier" on an undefined value at
> /cygdrive/c/Bioinformatics/Ensembl/src/bioperl-live/Bio/Ontology/SimpleOntologyEngine.pm line 395
>
> Any ideas?
>
> Dave

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Apr 12 16:17:43 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 12 Apr 2008 15:17:43 -0500
Subject: [Bioperl-l] [BioSQL-l] personal vs list email
In-Reply-To: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>
References: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>
Message-ID: <E7962E90-8309-4ADA-B002-950793B61D74@uiuc.edu>


On Apr 12, 2008, at 1:10 PM, Hilmar Lapp wrote:

> I'm not sure why but I have received several Bioperl or BioSQL- 
> related email inquiries directed to me *personally* over the past  
> few weeks.
>
> I have been responding as I get to them, but I feel that I am doing  
> both the senders and this community a poor service, because  
> sometimes someone else on the list could have responded much faster,  
> and when I respond, others on the list who happen to be interested  
> in the same question don't get to see the answer.
>
> So from now on as a policy I will redirect *every* email sent to me  
> personally and that asks a question related to one of the projects  
> to the respective mailing list. If you don't want this, please  
> conspicuously say so at the top of your email, and in that case if  
> you do ask a project-related question be prepared to wait and to  
> possibly needing to follow up.
>
> As an aside, it's a pretty safe assumption to make that all other  
> core developers, and quite possibly *all* developers are following a  
> similar policy, whether expressly or not.

I agree; I'm sure several other core devs feel the same way.  I always  
try to forward these to the list if I feel it is more relevant there.

> Isn't this somewhere in the FAQ too?
>
> 	-hilmar

No, but I've added it to the bioperl FAQ; might be worth checking over  
and editing.

chris


From hlapp at gmx.net  Sat Apr 12 18:40:53 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 18:40:53 -0400
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <000001c89ce2$5400a710$0202a8c0@STUDYPC>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce2$5400a710$0202a8c0@STUDYPC>
Message-ID: <3F77F49A-9C9E-4450-AE28-46F00CADBC8B@gmx.net>

Burt - please keep your replies on the list. Others may have input  
too, or benefit from the answer too.

As there is no name() method call on line 914 in the current version,  
let's first check that you are running a current version of BioPerl.  
It needs to be at least 1.5.2.
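One way to check which version Perl actually picks up is to print the
version variable from Bio::Root::Version. A minimal sketch; the helper
name bioperl_version is made up, and the eval is there so the script
still runs when BioPerl is not on the current @INC.

```perl
use strict;
use warnings;

# Return the installed BioPerl version string, or undef if BioPerl
# cannot be loaded from the current @INC.
sub bioperl_version {
    my $loaded = eval { require Bio::Root::Version; 1 };
    return unless $loaded;
    no warnings 'once';
    return $Bio::Root::Version::VERSION;
}

my $version = bioperl_version();
print defined $version
    ? "BioPerl version: $version\n"
    : "BioPerl not found in \@INC\n";
```

Run it with the same perl and the same PERL5LIB or -I settings as
load_ontology.pl, otherwise a different installation may be reported.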

However, I do suspect a problem in either the InterPro file itself  
(wouldn't be the first time), or the InterPro parser.

	-hilmar

On Apr 12, 2008, at 5:15 PM, David Burt wrote:

> Hilmar
>
> Many thanks, it seems to be working
>
> But I got this output - any comments/ideas on what it means?
>
> Dave
>
>
> perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser  
> root --dbpass chicken --driver mysql \
> > --namespace "InterPro" --format interprosax interpro.xml
>         ...deleting all relationships for InterPro
>         ...parsing and loading InterPro
> Can't call method "name" on an undefined value at load_ontology.pl  
> line 914.
>
> HERE'S the name and definition in the ontology table
>
> Name = InterPro
>
> Definition =
>
> PANTHER version 6.1, 30128 entries, 04-OCT-2006
> PFAM version 21.0, 8957 entries, 22-NOV-2006
> PIRSF version 2.70, 2877 entries, 12-JUN-2007
> PRINTS version 38.0, 1900 entries, 22-SEP-2005
> PRODOM version 2005.1, 1522 entries, 23-APR-2004
> PROSITE version 20.0, 2006 entries, 14-NOV-2006
> SMART version 5.1, 724 entries, 27-JUL-2007
> TIGRFAMs version 7.0, 3423 entries, 28-SEP-2007
> GENE3D version 3.0.0, 2147 entries, 11-SEP-2006
> SSF version 1.69, 1538 entries, 30-NOV-2006
> SWISSPROT version 55.1, 359942 entries, 18-MAR-2008
> TREMBL version 38.1, 5443281 entries, 18-MAR-2008
> INTERPRO version 17.0, 16175 entries, 19-MAR-2008
> GO version N/A, 23937 entries, 27-MAR-2007
> MEROPS version 7.8, 2831 entries, 12-JUL-2007 |
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Sat Apr 12 18:43:25 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 18:43:25 -0400
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
Message-ID: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>

I'm not sure what you mean by 'Check interpro.xml', but you can use  
the --safe command-line option to keep going if an individual term  
fails to load for whatever reason.

Can you post the data for the seemingly offending record? (and please  
cc the list)

	-hilmar
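
One way to pull a single record out of a large interpro.xml for posting is sketched below. It is only an illustration in core Perl: the helper name extract_entry is hypothetical, the sample data is made up, and it assumes the opening <interpro id="..."> tag and the closing </interpro> tag each sit on one line, as the file excerpt quoted below suggests.

```perl
use strict;
use warnings;

# Return the text of one <interpro id="..."> ... </interpro> element.
# The input is read line by line, so the whole file never has to fit in memory.
sub extract_entry {
    my ($fh, $id) = @_;
    my ($inside, @lines);
    while (my $line = <$fh>) {
        $inside = 1 if $line =~ /<interpro\s+id="\Q$id\E"/;
        next unless $inside;
        push @lines, $line;
        last if $line =~ m{</interpro>};
    }
    return join '', @lines;
}

# Tiny made-up stand-in for interpro.xml, for illustration only.
my $sample = <<'END';
<interprodb>
  <interpro id="IPR000035" type="Family" short_name="Example_A">
    <name>Example A</name>
  </interpro>
  <interpro id="IPR000036" type="Domain" short_name="Example_B">
    <name>Example B</name>
  </interpro>
</interprodb>
END

# Read from the in-memory sample; for real data open 'interpro.xml' instead.
open my $fh, '<', \$sample or die "cannot open sample: $!";
my $entry = extract_entry($fh, 'IPR000036');
close $fh;
print $entry;
```

Pointing the open call at the real file and passing the ID that follows the last loaded entry should give a snippet small enough to send to the list.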

On Apr 12, 2008, at 5:39 PM, David Burt wrote:

> Hi Hilmar
>
> Just checked mysql database and only have 39 entries under interpro  
> and loaded up to IPR000035
>
> Check interpro.xml looks OK from IPR000036 and onwards
>
> So seems to have crashed at IPR000035 ?
>
> dave
>
> From: Hilmar Lapp [mailto:hlapp at gmx.net]
> Sent: 12 April 2008 19:16
> To: David Burt
> Cc: Bioperl BioPerl
> Subject: Re: bioperl-db
>
> Hi Burt,
>
> can you try format interprosax instead of interpro? That variant is  
> also much more graceful regarding required space.
>
>             -hilmar
>
> On Apr 12, 2008, at 1:01 PM, David Burt wrote:
>
>
> Hi Hilmar,
>
> Hope you can help ? I am using bioperl-db to create a biosql database
>
> I have used scripts load_seqdatabase.pl and load_ontology.pl to  
> install human swissprot entries, gene ontology, sequence ontology  
> and now want to load interpro
>
> Here's the command line I have tried
>
> perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser  
> root --dbpass chicken --driver mysql \
> --namespace "InterPro" --format InterPro interpro.xml
>
> But I get this message
>
> Can't call method "identifier" on an undefined value at  /cygdrive/ 
> c/Bioinformatics/Ensembl/src/bioperl-live/Bio/Ontology/ 
> SimpleOntologyEngine.pm line 395
>
> Any ideas?
>
> Dave
>
> PS: here's the top of the interpro.xml file
>
> <?xml version="1.0" encoding="ISO-8859-1"?>
> <!DOCTYPE interprodb SYSTEM "interpro.dtd">
>
>
>
> <interprodb>
>     <release>
>       <dbinfo dbname="PANTHER" version="6.1" entry_count="30128"  
> file_date="04-OCT-2006 00:00:00" />
>       <dbinfo dbname="PFAM" version="21.0" entry_count="8957"  
> file_date="22-NOV-2006 00:00:00" />
>       <dbinfo dbname="PIRSF" version="2.70" entry_count="2877"  
> file_date="12-JUN-2007 00:00:00" />
>       <dbinfo dbname="PRINTS" version="38.0" entry_count="1900"  
> file_date="22-SEP-2005 00:00:00" />
>       <dbinfo dbname="PRODOM" version="2005.1" entry_count="1522"  
> file_date="23-APR-2004 00:00:00" />
>       <dbinfo dbname="PROSITE" version="20.0" entry_count="2006"  
> file_date="14-NOV-2006 00:00:00" />
>       <dbinfo dbname="SMART" version="5.1" entry_count="724"  
> file_date="27-JUL-2007 00:00:00" />
>       <dbinfo dbname="TIGRFAMs" version="7.0" entry_count="3423"  
> file_date="28-SEP-2007 00:00:00" />
>       <dbinfo dbname="GENE3D" version="3.0.0" entry_count="2147"  
> file_date="11-SEP-2006 00:00:00" />
>       <dbinfo dbname="SSF" version="1.69" entry_count="1538"  
> file_date="30-NOV-2006 00:00:00" />
>       <dbinfo dbname="SWISSPROT" version="55.1"  
> entry_count="359942" file_date="18-MAR-2008 00:00:00" />
>       <dbinfo dbname="TREMBL" version="38.1" entry_count="5443281"  
> file_date="18-MAR-2008 00:00:00" />
>       <dbinfo dbname="INTERPRO" version="17.0" entry_count="16175"  
> file_date="19-MAR-2008 00:00:00" />
>       <dbinfo dbname="GO" version="N/A" entry_count="23937"  
> file_date="27-MAR-2007 00:00:00" />
>       <dbinfo dbname="MEROPS" version="7.8" entry_count="2831"  
> file_date="12-JUL-2007 16:56:17" />
>     </release>
>   <interpro id="IPR000001" type="Domain" short_name="Kringle"  
> protein_count="352">
>     <name>Kringle</name>
>     <abstract>
>
>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From Russell.Smithies at agresearch.co.nz  Sun Apr 13 22:51:41 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 14 Apr 2008 14:51:41 +1200
Subject: [Bioperl-l] Tandem Repeats Finder?
In-Reply-To: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC><C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net><000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
	<FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06BEA87E@imail.agresearch.co.nz>

Has anyone tried TRF? 
I notice UCSC is using it for all their simple repeat annotations and thought it might be better than what we're currently using (Sputnik)

And is there a BioPerl parser for its output, or am I going to have to write my own?

Thanx,


Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E russell.smithies at agresearch.co.nz

Invermay Research Centre
Puddle Alley,
Mosgiel,
New Zealand
T +64 3 489 3809
F +64 3 489 9174
www.agresearch.co.nz 


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From Russell.Smithies at agresearch.co.nz  Sun Apr 13 22:53:46 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 14 Apr 2008 14:53:46 +1200
Subject: [Bioperl-l] Tandem Repeats Finder?
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C03B09DE9@imail.agresearch.co.nz>
References: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
	<D5DBA313349A4B458528BE63B387F36C03B09DE9@imail.agresearch.co.nz>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06BEA881@imail.agresearch.co.nz>

Scratch the need for a parser.
I turned off html output and it's all nice white-space separated text  :-)

Russell
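
For anyone who wants the whitespace-separated TRF text in Perl data structures, a minimal sketch follows. The 15-column layout and the column names are taken from my reading of the TRF documentation and should be treated as an assumption to check against your TRF version. The helper name parse_trf is hypothetical and the sample record is made up.

```perl
use strict;
use warnings;

# Assumed column order of a TRF data row (verify against your TRF version).
my @FIELDS = qw(start end period copies consensus_size pct_match pct_indel
                score A C G T entropy consensus repeat);

# Parse TRF text output from a filehandle; returns a list of hash refs.
sub parse_trf {
    my ($fh) = @_;
    my $seq_id = '';
    my @repeats;
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^Sequence:\s*(\S+)/) {    # start of a new input sequence
            $seq_id = $1;
            next;
        }
        my @cols = split ' ', $line;
        # Data rows have 15 columns and begin with two integer coordinates.
        next unless @cols == @FIELDS
            && $cols[0] =~ /^\d+$/ && $cols[1] =~ /^\d+$/;
        my %rec = (seq_id => $seq_id);
        @rec{@FIELDS} = @cols;
        push @repeats, \%rec;
    }
    return @repeats;
}

# Made-up sample, for illustration only.
my $sample = <<'END';
Sequence: chr1_test

Parameters: 2 7 7 80 10 50 500

101 130 3 10.0 3 100 0 60 33 33 33 0 1.58 ACG ACGACGACGACGACGACGACGACGACGACG
END

open my $fh, '<', \$sample or die "cannot open sample: $!";
my @repeats = parse_trf($fh);
close $fh;

printf "%s\t%d\t%d\t%s x %s\n", @{$_}{qw(seq_id start end consensus copies)}
    for @repeats;
```

Header and parameter lines are skipped simply because they do not have 15 columns starting with two integers, so no explicit header handling is needed.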

> -----Original Message-----
> From: Smithies, Russell
> Sent: Monday, 14 April 2008 2:52 p.m.
> To: 'Bioperl BioPerl'
> Subject: Tandem Repeats Finder?
> 
> Has anyone tried TRF?
> I notice UCSC is using it for all their simple repeat annotations and thought it might
> be better than what we're currently using (Sputnik)
> 
> And is there a BioPerl parser for its output or am I going to have to write my own ?
> 
> Thanx,
> 
> 
> Russell Smithies
> 
> Bioinformatics Applications Developer
> T +64 3 489 9085
> E russell.smithies at agresearch.co.nz
> 
> Invermay Research Centre
> Puddle Alley,
> Mosgiel,
> New Zealand
> T +64 3 489 3809
> F +64 3 489 9174
> www.agresearch.co.nz
> 


From csaba.ortutay at gmail.com  Mon Apr 14 00:15:22 2008
From: csaba.ortutay at gmail.com (Ortutay Csaba =?iso-8859-1?q?P=E9ter?=)
Date: Mon, 14 Apr 2008 07:15:22 +0300
Subject: [Bioperl-l] Tandem Repeats Finder?
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06BEA87E@imail.agresearch.co.nz>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
	<D5DBA313349A4B458528BE63B387F36C06BEA87E@imail.agresearch.co.nz>
Message-ID: <200804140715.22702.csaba.ortutay@gmail.com>

Hello, I have used TRF in my earlier projects. It is a nice and quick tool.

There were no ready-made parsers back then (5-6 years ago), so we wrote 
our own.

Csaba

> Has anyone tried TRF?
> I notice UCSC is using it for all their simple repeat annotations and
> thought it might be better than what we're currently using (Sputnik)
>
> And is there a BioPerl parser for its output or am I going to have to
> write my own ?
>
> Thanx,


-- 
Csaba Ortutay PhD
IMT Bioinformatics
University of Tampere
Finland

From avilella at gmail.com  Mon Apr 14 07:13:26 2008
From: avilella at gmail.com (Albert Vilella)
Date: Mon, 14 Apr 2008 12:13:26 +0100
Subject: [Bioperl-l] how can I print a Bio::Tree newick sortby given list?
Message-ID: <358f4d650804140413x4271f18bx40af1b9054306df8@mail.gmail.com>

Hi,

I have a newick file that I want to sort by a given order and print again as
newick.
For example, if I have

(((ENSPTRG00000013811:0.0011,ENSG00000142192:0.0021):0.0033,ENSPPYG00000003902:0.0326):0.0000,ENSMMUG00000014384:0.0366):0.3638;

I want to sort it by "ENSG:ENSPTRG:ENSPPYG:ENSMMUG".

Any suggestions on how to do this in bioperl?

Cheers,

    Albert.
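
A possible approach, sketched in core Perl without BioPerl: parse the Newick string into nested nodes, give each leaf the rank of its ID prefix in the wanted order, give each internal node the best rank found below it, sort the children by rank and write the tree out again. The function names are hypothetical. Matching a prefix followed by a digit is an assumption about how the IDs are built, and quoted labels or comments in the Newick string are not handled.

```perl
use strict;
use warnings;

# Parse one Newick subtree starting at pos($$s); returns a node hash ref.
sub parse_node {
    my ($s) = @_;
    my %node = (children => []);
    if ($$s =~ /\G\(/gc) {
        do { push @{ $node{children} }, parse_node($s) } while $$s =~ /\G,/gc;
        $$s =~ /\G\)/gc or die "expected ')' in Newick string";
    }
    $$s =~ /\G([^():,;]*)/gc;                  # label, possibly empty
    $node{label}  = $1;
    $node{length} = $1 if $$s =~ /\G:([^():,;]+)/gc;
    return \%node;
}

# Rank leaves via $rank_of, propagate the best rank upwards, sort children.
sub rank_and_sort {
    my ($node, $rank_of) = @_;
    my @kids = @{ $node->{children} };
    return $node->{rank} = $rank_of->($node->{label}) unless @kids;
    rank_and_sort($_, $rank_of) for @kids;
    @kids = sort { $a->{rank} <=> $b->{rank} } @kids;
    $node->{children} = \@kids;
    return $node->{rank} = $kids[0]{rank};
}

# Serialise a node back to Newick, keeping branch lengths as written.
sub to_newick {
    my ($node) = @_;
    my @kids = @{ $node->{children} };
    my $out  = @kids ? '(' . join(',', map { to_newick($_) } @kids) . ')' : '';
    $out .= $node->{label};
    $out .= ':' . $node->{length} if defined $node->{length};
    return $out;
}

my @order   = split /:/, 'ENSG:ENSPTRG:ENSPPYG:ENSMMUG';
my $rank_of = sub {
    my ($label) = @_;
    for my $i (0 .. $#order) {
        return $i if $label =~ /^\Q$order[$i]\E\d/;
    }
    return scalar @order;                      # unknown prefixes sort last
};

my $newick = '(((ENSPTRG00000013811:0.0011,ENSG00000142192:0.0021):0.0033,'
           . 'ENSPPYG00000003902:0.0326):0.0000,ENSMMUG00000014384:0.0366):0.3638;';
my $tree = parse_node(\$newick);
rank_and_sort($tree, $rank_of);
my $sorted = to_newick($tree) . ';';
print "$sorted\n";
```

Sorting only reorders siblings, so the topology and branch lengths stay exactly as in the input.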

From lamq at usal.es  Mon Apr 14 11:01:51 2008
From: lamq at usal.es (Luis A. M. Quintales)
Date: Mon, 14 Apr 2008 17:01:51 +0200
Subject: [Bioperl-l] xyplot glyph: scale problems
Message-ID: <480371DF.7040900@usal.es>

I have a problem with the xyplot scale numbers calculated by the glyph.

The shape of the graph looks fine, but the scale number 10 and its 
position in the output are not correct.

I am sending the source code, a simplified input file and the PNG output.

Thank you


Source code

ex1.pl  (also in http://avellano.usal.es/~luis/bioperl-l/ex1.pl)
============================
#!/usr/bin/perl
use Bio::DB::GFF;
use Bio::Graphics::Panel;
use strict;

# GFF file to load, e.g. in1.gff
my $filin  = $ARGV[0];
my $db = Bio::DB::GFF->new( -dsn => $filin, -adaptor => 'memory',
                            -aggregator => 'at{atpc:atfreq}' );
my $segment  = $db->segment('chr1');
my @features = $segment->features('at');
my $panel = Bio::Graphics::Panel->new(
       -offset    => 0,   -grid      => 100,
       -length    => 500, -width     => 800,
       -pad_left  => 50,  -pad_right => 50 );
# Top track: the segment itself
$panel->add_track($segment, -glyph   => 'generic',
                            -bgcolor => 'blue', -label => 1);
# Second track: the aggregated atpc scores drawn as an xyplot
$panel->add_track(\@features,
                    -glyph      => 'xyplot',
                    -graph_type => 'boxes',
                    -scale      => 'left',
                    -height     => 200,
 );
# Render the panel and write it out as PNG
open (FI, "> sal.png") or die "Cannot write sal.png: $!";
binmode FI;
print FI $panel->png;
close FI;
============================

in1.gff file (also in http://avellano.usal.es/~luis/bioperl-l/in1.gff)
============================
##sequence-region chr1 1 5578650
chr1    atfreq    atpc    1    10       64.0000    .    .    atpc 1
chr1    atfreq    atpc    11    20       63.0000    .    .    atpc 1
chr1    atfreq    atpc    21    30       62.0000    .    .    atpc 1
chr1    atfreq    atpc    31    40       59.0000    .    .    atpc 1
chr1    atfreq    atpc    41    50       59.0000    .    .    atpc 1
chr1    atfreq    atpc    51    60       59.0000    .    .    atpc 1
chr1    atfreq    atpc    61    70       59.0000    .    .    atpc 1
chr1    atfreq    atpc    71    80       59.0000    .    .    atpc 1
chr1    atfreq    atpc    81    90       61.0000    .    .    atpc 1
chr1    atfreq    atpc    91    100       60.0000    .    .    atpc 1
chr1    atfreq    atpc    101    110       60.0000    .    .    atpc 1
chr1    atfreq    atpc    111    120       64.0000    .    .    atpc 1
chr1    atfreq    atpc    121    130       64.0000    .    .    atpc 1
chr1    atfreq    atpc    131    140       60.0000    .    .    atpc 1
chr1    atfreq    atpc    141    150       60.0000    .    .    atpc 1
chr1    atfreq    atpc    151    160       63.0000    .    .    atpc 1
chr1    atfreq    atpc    161    170       62.0000    .    .    atpc 1
chr1    atfreq    atpc    171    180       59.0000    .    .    atpc 1
chr1    atfreq    atpc    181    190       54.0000    .    .    atpc 1
chr1    atfreq    atpc    191    200       53.0000    .    .    atpc 1
chr1    atfreq    atpc    201    210       54.0000    .    .    atpc 1
chr1    atfreq    atpc    211    220       50.0000    .    .    atpc 1
chr1    atfreq    atpc    221    230       51.0000    .    .    atpc 1
chr1    atfreq    atpc    231    240       56.0000    .    .    atpc 1
chr1    atfreq    atpc    241    250       58.0000    .    .    atpc 1
chr1    atfreq    atpc    251    260       55.0000    .    .    atpc 1
chr1    atfreq    atpc    261    270       54.0000    .    .    atpc 1
chr1    atfreq    atpc    271    280       56.0000    .    .    atpc 1
chr1    atfreq    atpc    281    290       59.0000    .    .    atpc 1
chr1    atfreq    atpc    291    300       58.0000    .    .    atpc 1
chr1    atfreq    atpc    301    310       60.0000    .    .    atpc 1
chr1    atfreq    atpc    311    320       59.0000    .    .    atpc 1
chr1    atfreq    atpc    321    330       59.0000    .    .    atpc 1
chr1    atfreq    atpc    331    340       57.0000    .    .    atpc 1
chr1    atfreq    atpc    341    350       56.0000    .    .    atpc 1
chr1    atfreq    atpc    351    360       57.0000    .    .    atpc 1
chr1    atfreq    atpc    361    370       57.0000    .    .    atpc 1
chr1    atfreq    atpc    371    380       58.0000    .    .    atpc 1
chr1    atfreq    atpc    381    390       56.0000    .    .    atpc 1
chr1    atfreq    atpc    391    400       58.0000    .    .    atpc 1
chr1    atfreq    atpc    401    410       56.0000    .    .    atpc 1
chr1    atfreq    atpc    411    420       59.0000    .    .    atpc 1
chr1    atfreq    atpc    421    430       58.0000    .    .    atpc 1
chr1    atfreq    atpc    431    440       59.0000    .    .    atpc 1
chr1    atfreq    atpc    441    450       58.0000    .    .    atpc 1
chr1    atfreq    atpc    451    460       58.0000    .    .    atpc 1
chr1    atfreq    atpc    461    470       56.0000    .    .    atpc 1
chr1    atfreq    atpc    471    480       57.0000    .    .    atpc 1
chr1    atfreq    atpc    481    490       59.0000    .    .    atpc 1
============================


The sal.png :
http://avellano.usal.es/~luis/bioperl-l/sal.png

Thank you.


-- 
==================================================
 Luis Antonio Miguel Quintales
 Departamento de Informática y Automática
 Facultad de Ciencias
 Universidad de Salamanca
 Plaza de la Merced s/n
 37008-SALAMANCA
 SPAIN
==================================================
 Tel.: +34-923-294400(ext.1513)
 Fax.: +34-923-294584
 E-mail: lamq at usal.es
==================================================


From aaron.j.mackey at gsk.com  Mon Apr 14 09:00:52 2008
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Mon, 14 Apr 2008 09:00:52 -0400
Subject: [Bioperl-l] personal vs list email
In-Reply-To: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>
Message-ID: <OF3ED0BD19.1CBA005A-ON8525742B.00473A95-8525742B.00477DEC@gsk.com>

I try to take it even one step further: I require the person to re-ask 
their question on the mailing list (and then try to answer it there). This 
has the added benefit of causing the person to pause a moment to reflect 
on their question, and (sometimes) to spend a bit more time preparing the 
question for broader public consumption.

-Aaron


From sutripa at vbi.vt.edu  Mon Apr 14 12:54:47 2008
From: sutripa at vbi.vt.edu (Sucheta Tripathy)
Date: Mon, 14 Apr 2008 12:54:47 -0400 (EDT)
Subject: [Bioperl-l] Error installing XML::Parser
Message-ID: <1285.99.152.150.87.1208192087.squirrel@webmail.vbi.vt.edu>


Hello List,

I have recently installed bioperl using the following command. The
installation was successful. Now I am trying to install XML::Parser but it
returns with  error messages. Any clue what I may be doing wrong?

Thanks

Sucheta

Following is the last part of the error message:

### Error Message #######

Expat.c: In function ‘XS_XML__Parser__Expat_SkipUntil’:
Expat.c:2664: error: ‘XML_Parser’ undeclared (first use in this function)
Expat.c:2664: error: expected ‘;’ before ‘parser’
Expat.c:2665: warning: ISO C90 forbids mixed declarations and code
Expat.xs:2179: error: ‘parser’ undeclared (first use in this function)
Expat.xs:2179: warning: cast to pointer from integer of different size
Expat.xs:2180: error: ‘CallbackVector’ has no member named ‘st_serial’
Expat.xs:2182: error: ‘CallbackVector’ has no member named ‘skip_until’
Expat.c: In function ‘XS_XML__Parser__Expat_Do_External_Parse’:
Expat.c:2687: error: ‘XML_Parser’ undeclared (first use in this function)
Expat.c:2687: error: expected ‘;’ before ‘parser’
Expat.c:2688: warning: ISO C90 forbids mixed declarations and code
Expat.xs:2194: error: ‘parser’ undeclared (first use in this function)
Expat.xs:2194: warning: cast to pointer from integer of different size
Expat.xs:2205: warning: unused variable ‘pret’
Expat.xs:2194: warning: unused variable ‘cbv’
Expat.xs:2192: warning: unused variable ‘type’
make[1]: *** [Expat.o] Error 1
make[1]: Leaving directory `/root/.cpan/build/XML-Parser-2.36/Expat'
make: *** [subdirs] Error 2
  /usr/bin/make  -- NOT OK
Running make test
  Can't test without successful make
Running make install
  make had returned bad status, install seems impossible

#####

-- 
Sucheta Tripathy, Ph.D.
Virginia Bioinformatics Institute Phase-I
Washington street.
Virginia Tech.
Blacksburg,VA 24061-0447
phone:(540)231-8138
Fax:  (540) 231-2606


From mmokrejs at ribosome.natur.cuni.cz  Tue Apr 15 06:45:48 2008
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Tue, 15 Apr 2008 12:45:48 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <CA410982-12F9-4289-8B54-87BE33A38085@uiuc.edu>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>	<47F9F3AA.2090003@uv.es>
	<200804071448.34769.heikki@sanbi.ac.za>	<2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>	<47FA4AD2.5030206@uv.es>
	<CA410982-12F9-4289-8B54-87BE33A38085@uiuc.edu>
Message-ID: <4804875C.80506@ribosome.natur.cuni.cz>

Chris Fields wrote:
> Note in the example I gave that, during the revision history, the 
> DBSOURCE changed at the point of the creation date (the original nuc.
>  record was a M. tuberculosis contig sequence, which later changed to
> an updated full M. tuberculosis genome record at the time of the
> 'create date').
> 
> Couldn't find anything specific in the GenBank docs on this, but it 
> appears (at least for a protein record) the creation date reflects
> the date in which the sequence was either originally deposited or
> originally derived from the nucleotide source record present in the
> record.  In other words, it may not reflect the original date of
> deposition (which could have come from a different record, as in this
> case).
> 
> chris

Hi,
I have a few answers from NCBI staff to similar questions I asked in the
past regarding DATE issues and VERSION numbers not being increased upon
"changes" in a record.
Below I have tried to put my former correspondence into a more readable form.
I hope this helps everybody understand what happens in the black box. ;)
Martin


Date: Thu, 17 Jan 2002 15:40:07 -0500 (EST)
From: David Wheeler
Subject: Brucella_melitensis on ftp site

> Hi, I'd like to point you to the fact, that the descriptions of 
> Brucella_melitensis differ in 
> ftp.ncbi.nih.nlm.gov/genomes/Bacteria/Brucella_melitensis and 
> ftp.ncbi.nih.nlm.gov/genbank/genomes/Bacteria/Brucella_melitensis
> 
> Namely, the description of the strain is retained in *.gbk files
> under /genomes/Bacteria/Brucella_melitensis only under the strain
> description field, but not in the DEFINITION line, where it is
> present in *.gbk files under
> /genbank/genomes/Bacteria/Brucella_melitensis.
> 
> LOCUS       NC_003318 1177787 bp    DNA   circular  BCT       13-NOV-2001
> DEFINITION  Brucella melitensis chromosome II, complete sequence.
> ACCESSION   NC_003318
> VERSION     NC_003318.1  GI:17988344
> 
> compared to
> 
> LOCUS       AE008918  1177787 bp    DNA   circular  BCT       27-DEC-2001
> DEFINITION  Brucella melitensis strain 16M chromosome II, complete sequence.
> ACCESSION   AE008918
> VERSION     AE008918
> 
> This makes me worried about the data. Why is the release date of 
> NON-curated files (AE008918) newer than the release date of CURATED
> data (NC_003318)? Is this expected? Could someone explain to me the
> difference between them (i.e. CURATED vs. NONCURATED)?

The curated record is initially a copy of the non-curated record with certain 
changes in documentation made in order to comply with the NCBI standard for 
reference genomes. One change which you have noticed is the difference in 
Definition line format.  Curated genomic records are created in order to 
standardize annotation for genomes in the Entrez Genomes database while leaving 
editorial control for the parent GenBank records in the hands of the original 
submitters.

Regardless of the date you see on the record, the curated version is derived from 
the non-curated one.  In this case, it appears that the processing of the 
non-curated version lagged a little bit relative to that of the curated version. 
Normally, however, the non-curated version will have the earlier date.


Date: Sun, 27 Jan 2002 00:16:55 -0500 (EST)
From: David Wheeler
Subject: Re: CONSULT: Brucella_melitensis on ftp site

> Are the raw sequence data always same in non-curated and curated 
> flatfiles?
> 
> Is the annotation of orf's/proteins different between them?
> 
> Are there any new or withdrawn orf's or proteins in the curated
> flatfiles compared to non-curated ones?
> 
> My feeling is that no-one except original submitters can modify
> submitted data, so you cannot modify non-curated files, i.e. cannot
> modify them and increase the version number.
> 
> Because of that, you've introduced curated versions, which are just
> copies of original but public data so you are free to modify it. So
> once again, are the differences between non-curated and curated
> flatfiles only in structure of the file? I don't think so. Examples
> would be Listeria genomes or the 2 Agrobacterium's, if I remember
> right.

Initially, there should be no or very few differences, however, as time
goes by, differences in the annotation will materialize.  There may also
be differences in the sequence, if errors in the original sequence come to
light, but these differences should be very rare.

So, practically speaking, you will probably find few differences but,
since the purpose of the Refseq is to curate, there may well be some
differences.


Date: Mon, 17 Dec 2001 11:57:06 -0500 (EST)
From: Dawn Lipshultz
Subject: Re: Buggy date in Staphylococcus aureus N315

>>>> Hi, I've found that Staphylococcus aureus N315 was released
>>>> on 01-JAN-1900, which is nonsense. I guess you had a Y2K bug.
>>>> 
>>>> 
>>>> Please see
>>>> 
>> ftp://ncbi.nlm.nih.gov/genbank/genomes/Bacteria/Staphylococcus_aureus_N315/BA000018.gbk
>> 
>>>> 
>>>> Can you please tell me the real release date?
>>>> 
>>>> Also, which is newer: the NC_xxxx for Staphylococcus aureus N315 under
>>>>  
>>>> ftp://ncbi.nlm.nih.gov/genomes/Bacteria/Staphylococcus_aureus_N315/
>>>>  or this BA000018 non-curated version?
>>>> 
>>>> 
>>>> LOCUS       BA000018  2814816 bp    DNA   circular  BCT       01-JAN-1900
>>>> DEFINITION  Staphylococcus aureus strain N315, complete genome.

>> I cannot find the record to which you refer in your message. When I
>>  did a search for accession number BA000018, I received results for
>>  accession numbers AP003129-AP003138. They are all dated June 2001.
>> 
>> 
>> The date for the record in the ftp file is April 2001. The record
>> in GenBank (NC_002745) is dated October 2001. This version is
>> apparently more updated than the one on the ftp site. Therefore,
>> you may want to download the sequence from GenBank rather than the
>> ftp site. Regards, Dawn S. Lipshultz

> 
> Hmm, but I do get: 
> http://www.ncbi.nlm.nih.gov:80/cgi-bin/Entrez/framik?db=genome&gi=179
> 
> 
> look at the "GenBank: NC_002745" text in left upper part of the
> window, it points to that OLD ftp file. The "RefSeq: NC_002745"
> points to the April 2001 version. So what is the right way to get the
> October 2001 release?
> 
> Where can I find the difference between NC_002745 from April compared
>  to NC_002745 from October?
> 
> What do you mean with "you may want to download the sequence from 
> GenBank rather than the ftp site."?
> 
> BOTH ftp directories at ftp://ncbi.nlm.nih.gov are outdated. I mean 
> the genomes/Bacteria/Staphylococcus_aureus_N315/NC_002745.* version 
> and also the 
> genbank/genomes/Bacteria/Staphylococcus_aureus_N315/BA000018.* 
> version.
> 
> The web links from www.ncbi.nlm.nih.gov:80/cgi-bin/Entrez/ point 
> anyway to the ftp site. Do you mean to say that the ftp versions
> aren't updated anymore?

The genome was originally released into the database on 4/20/2001
as 10 pieces with secondary accession number BA000018.  You can 
find these pieces in Entrez nucleotides by querying with BA000018.

The Genomes group here will fix the date on the record that is available
from Entrez genomes.

Regards,
Dawn


Date: Fri, 16 Nov 2001 16:09:59 -0500 (EST)
From: Susan Dombrowski
Subject: Re: Agrobacterium tumefaciens C58

> Dear colleague, I've noticed that the genomic flatfiles of
> Agrobacterium tumefaciens C58 at 
> ftp://ncbi.nlm.nih.gov/genbank/genomes/Bacteria/Agrobacterium_tumefaciens/
> were somehow updated on Oct 17. However, AE007869.gbs, for example,
> does NOT explain what has been changed, and the VERSION number has not
> been increased either. Would you please explain what the change is,
> and where I can find such information on the web next time?
> 
> I've used the published sequence from your ftp site on 2001-08-29
> with the same ID and would like to know what differs.
> 
> LOCUS       AE007869  2841581 bp    DNA   circular  CON       17-OCT-2001
> DEFINITION  Agrobacterium tumefaciens strain C58 circular chromosome,
>             complete sequence.
> ACCESSION   AE007869
> VERSION     AE007869

Dear Colleague,
The version number of a sequence will *only* change if the content of the actual 
sequence has changed in any way since it was first made available. Although the 
date has changed, this date refers to the last time the actual record was 
manipulated by an NCBI staff member. Even if there is something simple, like 
adding a reference, changing a spelling mistake, etc., this will cause a change 
in the date field of the record. 

Thus, since the version has not changed, there are no differences to report.
Best Regards,
Susan


Date: Wed, 26 Jun 2002 11:04:48 -0400 (EDT)
From: Eric Sayers
Subject: Re: Mesorhizobium_loti flatfiles

>>>>> Hi,
>>>>>   I've found that some time ago you again silently changed flatfiles on
>>>>> your ftp site without changing the revision number. Please forgive me,
>>>>> but this really causes trouble for other people working in this so-called
>>>>> bioinformatics. :(
>>>>> 
>>>>> A week ago there was:
>>>>> 
>>>>> LOCUS       NC_002678            7036074 bp    DNA     circular BCT 10-SEP-2001
>>>>> DEFINITION  Mesorhizobium loti, complete genome.
>>>>> ACCESSION   NC_002678
>>>>> VERSION     NC_002678.1  GI:13470324
>>>>> 
>>>>> 
>>>>> and two other plasmid sequences. This yields 7275 proteins.
>>>>> 
>>>>> But, last autumn there was:
>>>>> 
>>>>> LOCUS       NC_002678 7036074 bp    DNA   circular  BCT       28-MAR-2001
>>>>> DEFINITION  Mesorhizobium loti, complete genome.
>>>>> ACCESSION   NC_002678
>>>>> VERSION     NC_002678.1  GI:13470324
>>>>> 
>>>>> 
>>>>> That version had 7281 proteins in total.
>>>>> I have a simple question: "Why was the VERSION number NOT changed?"
>>>>>
>>>>> Am I wrong in understanding that it should get updated whenever a single
>>>>> character in the file contents is changed?
>>> 
>>>> The version number of a sequence only changes if the sequence itself is
>>>> modified. If anything else in the flat file is changed (ie spelling, authors,
>>>> annotations, etc) the version will not change. However, the modification date in
>>> 
>>> Sorry, by annotation do you also mean the number of predicted genes, their
>>> coordinates (positions) etc.?
>>> 
>>>> the top line of the flat file will change for any of these modifications. (Note
>>>> that the dates are different in the file you display: Mar 28, 2001 vs Sept 10,
>>>> 2001.) I would track the modification date rather than or as well as the version
>>>> number to catch all changes in the files.
>>>> Regards,
>>>> Eric W. Sayers, Ph.D.
>>> 
>>> OK, but unless some of our programs were buggy before or are buggy now (in
>>> either case failing to extract genes from flatfiles), I do
>>> not have an explanation for the differences in the number of
>>> predicted/annotated genes.
>>> 
>>> I no longer have the old flatfiles from Mar 28 available, but it
>>> seems to me that these were newly introduced in the Sept. 10 version:
>>> gi_15600768, gi_15600770, gi_15600769, gi_15600766, gi_15600767
>> 
>> Dear Colleague,
>> Again, the only reason the version number will change is if the sequence itself 
>> changes. The number of annotated/predicted genes is merely an annotation on the 
>> sequence, and does not change the sequence itself. Therefore, the version will 
>> not change when the number of annotations changes. The modification date on the 
>> flat file will (and did) change, of course.
>> 
>> Regards,
>> Eric W. Sayers, Ph.D.
> 
> Finally I've heard that from someone, thanks!
> Now just tell me, how can I figure out what changed between those
> different "date" releases? Is there a changelog available?
> I consider annotation changes very important.

We do not provide the details of flat file changes on our public websites, 
except for changes in the version number (ie actual sequence changes). In that 
particular case, all of the previous versions are linked to the current one. My 
advice to you if you want to chronicle non-sequence changes would be to check 
the flat files of interest periodically (by a script, for example) and look for 
changes in the modification dates. You could then simply compare the before and 
after flat files.

Regards,
Eric W. Sayers, Ph.D.
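
The periodic check suggested above could start from something like the following sketch in core Perl. It reads the modification date from the end of the LOCUS line and the VERSION value, then applies the rule stated in this correspondence: a VERSION change means the sequence changed, while a date change alone means something else in the record changed. The function names header_info and classify_change are hypothetical. The two headers are the NC_002678 ones quoted earlier in this thread.

```perl
use strict;
use warnings;

# Extract locus name, modification date and VERSION from a flat-file header.
sub header_info {
    my ($text) = @_;
    my %info = (locus => '', date => '', version => '');
    for my $line (split /\n/, $text) {
        if ($line =~ /^LOCUS\s+(\S+).*?(\d{2}-[A-Z]{3}-\d{4})\s*$/) {
            @info{qw(locus date)} = ($1, $2);
        }
        elsif ($line =~ /^VERSION\s+(\S+)/) {
            $info{version} = $1;
        }
        last if $line =~ /^(?:FEATURES|ORIGIN)\b/;   # header ends here
    }
    return \%info;
}

# Classify what changed between two snapshots of the same record.
sub classify_change {
    my ($old, $new) = @_;
    return 'sequence changed'            if $old->{version} ne $new->{version};
    return 'annotation or metadata only' if $old->{date}    ne $new->{date};
    return 'unchanged';
}

my $march = <<'END';
LOCUS       NC_002678 7036074 bp    DNA   circular  BCT       28-MAR-2001
DEFINITION  Mesorhizobium loti, complete genome.
ACCESSION   NC_002678
VERSION     NC_002678.1  GI:13470324
END

my $september = <<'END';
LOCUS       NC_002678            7036074 bp    DNA     circular BCT 10-SEP-2001
DEFINITION  Mesorhizobium loti, complete genome.
ACCESSION   NC_002678
VERSION     NC_002678.1  GI:13470324
END

my $verdict = classify_change(header_info($march), header_info($september));
print "$verdict\n";
```

When the verdict is not 'unchanged', keeping both snapshots and running a plain text diff on them shows which features were added or removed.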


> Hi, Miguel:
> 
> id1_fetch can do it. Detailed instruction can be found at:  
> 
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id1_fetch.html
> 
> Here is an example:
> 
>> >id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
> GI        Loaded      DB    Retrieval No.
> --        ------      --    -------------
> 74311105  12/07/2007  NCBI  19766263
> 74311105  01/23/2007  NCBI  16325656
> 74311105  03/30/2006  NCBI  13131204
> 74311105  03/03/2006  NCBI  12915541
> 74311105  03/02/2006  NCBI  12885275
> 74311105  12/03/2005  NCBI  12259793
> 74311105  09/09/2005  NCBI  11257262
> 74311105  09/09/2005  NCBI  11242667
> 
> Wenwu Cui PhD


From david at burt7259.freeserve.co.uk  Sun Apr 13 10:32:31 2008
From: david at burt7259.freeserve.co.uk (David Burt)
Date: Sun, 13 Apr 2008 15:32:31 +0100
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <3F77F49A-9C9E-4450-AE28-46F00CADBC8B@gmx.net>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce2$5400a710$0202a8c0@STUDYPC>
	<3F77F49A-9C9E-4450-AE28-46F00CADBC8B@gmx.net>
Message-ID: <000001c89d73$3b49eec0$0202a8c0@STUDYPC>

Hi Hilmar,

Many thanks for the info. I tried a few things:

1. First I tried the --safe flag:

perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser root \
  --dbpass chicken --driver mysql --safe \
  --namespace "InterPro" --format interprosax interpro.xml

I still got the same output as before:

        ...deleting all relationships for InterPro
        ...parsing and loading InterPro

Can't call method "name" on an undefined value at load_ontology.pl line 914

Only 35 InterPro entries were entered into the database.

2. I am using bioperl 1.5.2.

3. I downloaded Release 17.0 (20 March 2008) of the interpro.xml file from
ftp://ftp.ebi.ac.uk/pub/databases/interpro/

I did not send this file, since it was ~10 MB gzipped.

Dave

 
From david at burt7259.freeserve.co.uk  Sun Apr 13 10:53:43 2008
From: david at burt7259.freeserve.co.uk (David Burt)
Date: Sun, 13 Apr 2008 15:53:43 +0100
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
	<FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
Message-ID: <000001c89d76$319be060$0202a8c0@STUDYPC>

Hilmar,

I also updated my copy of bioperl; see the output below.

root@STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src
$ perl -MBio::Perl -le 'print Bio::Perl->VERSION;'
1.005002101

root@STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src
$ cvs -d :pserver:cvs@cvs.bioperl.org:/home/repository/bioperl login
Logging in to :pserver:cvs@cvs.bioperl.org:2401/home/repository/bioperl
CVS password:

root@STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src
$ cd bioperl-live

root@STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src/bioperl-live
$ cvs -q update -d -P -r bioperl-release-1-5-2
P Build.PL
P ModuleBuildBioperl.pm
P Bio/Root/Version.pm
cvs update: warning: t/data/taxdump/names.dmp was lost
U t/data/taxdump/names.dmp
cvs update: warning: t/data/taxdump/nodes.dmp was lost
U t/data/taxdump/nodes.dmp

root@STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src/bioperl-live
$ perl -MBio::Perl -le 'print Bio::Perl->VERSION;'
1.0050021

 
Why is the VERSION 1.0050021 rather than 1.5.2 ?
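
One possible reading, offered as a sketch rather than a confirmed answer: Perl's decimal version numbers pack three digits per component after the decimal point, so 1.0050021 may simply be the numeric spelling of a 1.5.2-series release. The snippet below uses only the core version module to convert the two numbers printed above; how BioPerl itself chooses its version numbers is an assumption here.

```perl
use strict;
use warnings;
use version;

# Decimal Perl versions are read in groups of three digits after the point,
# so "1.0050021" is taken as 1 . 005 . 002 . 1(00).
for my $decimal ('1.0050021', '1.005002101') {
    my $dotted = version->parse($decimal)->normal;
    print "$decimal -> $dotted\n";
}
```

The first three components of the dotted form are the familiar 1.5.2; the trailing component differs between the two installations.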

 
Dave


From heikki at sanbi.ac.za  Wed Apr 16 07:36:16 2008
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed, 16 Apr 2008 13:36:16 +0200
Subject: [Bioperl-l] bioperl-microarray: status?
In-Reply-To: <AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
Message-ID: <200804161336.16879.heikki@sanbi.ac.za>

FYI,

Christopher Jones has just published 
[http://bioinformatics.oxfordjournals.org/cgi/content/short/24/8/1102 an 
article in Bioinformatics] about his 
[http://search.cpan.org/perldoc?Microarray Microarray perl module] in CPAN.

(This text has been added to the BioPerl wiki.)

	-Heikki


On Friday 26 January 2007 16:05:01 Chris Fields wrote:
> Don't know if it's worth it, but could the microarray package be
> modified so that it deals with data generated from or interacts
> directly with Bioconductor (i.e. maybe including some specialized
> bioperl-run set of classes to run Bioconductor tasks, return
> lightweight bioperl microarray classes)?  Allen pointed out in a
> previous post that Bioconductor is the best pick for certain tasks,
> while Perl excels at others:
>
> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>
> Might be nice if we could merge both strengths together in some way.
>
> chris
>
> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
> > On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
> >> Eh, there is some discussion activity on the list, but not much.  You
> >> are really better off moving to Bioconductor.
> >
> > Ok, thanks. I added that to the wiki page:
> >
> >     http://www.bioperl.org/wiki/Microarray_package
> >
> > j
> > seqlab.net
> > http://www.bioperl.org/wiki/User:Jhannah
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________

From pan.mueller at yahoo.de  Wed Apr 16 08:34:51 2008
From: pan.mueller at yahoo.de (=?iso-8859-1?Q?Peter_M=FCller?=)
Date: Wed, 16 Apr 2008 12:34:51 +0000 (GMT)
Subject: [Bioperl-l] load_seqdatabase.pl --pipeline
Message-ID: <297809.47580.qm@web28203.mail.ukl.yahoo.com>

Dear list,

I want to add gene symbols to UniGene clusters that are already in a BioSQL database but lack this information.

One way is to write a post-update script:
my $adp = $db->get_object_adaptor('Bio::ClusterI');
my $pseq = $adp->find_by_primary_key(n);
$adp->remove($pseq);
$pseq->gene('symbol');
$adp->store($pseq);
$adp->commit();

OK, this works (though I wonder why the cluster has to be removed first: bug or feature?).

A second way, perhaps:
Use the --pipeline option, but it looks like it is usable only for sequence objects (Bio::Factory::SequenceProcessorI), right?

regards
pan


From cjfields at uiuc.edu  Wed Apr 16 11:00:51 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 16 Apr 2008 10:00:51 -0500
Subject: [Bioperl-l] bioperl-microarray: status?
In-Reply-To: <200804161336.16879.heikki@sanbi.ac.za>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
Message-ID: <479BD5A4-9C9A-4733-889D-65942F24A7F3@uiuc.edu>

That would be worth looking into at some point, if anyone's interested  
(though it may be best to build a 'bridging' module).  Wonder if it  
uses BioConductor and, if not, how performance is vs BioConductor?

chris

On Apr 16, 2008, at 6:36 AM, Heikki Lehvaslaiho wrote:

> FYI,
>
> Christoper Jones has just published
> [http://bioinformatics.oxfordjournals.org/cgi/content/short/ 
> 24/8/1102 an
> article in Bioinformatics] about his
> [http://search.cpan.org/perldoc?Microarray Microarray perl module]  
> in CPAN.
>
> (The text added into BioPerl wiki.)
>
> 	-Heikki
>
>
> On Friday 26 January 2007 16:05:01 Chris Fields wrote:
>> Don't know if it's worth it, but could the microarray package be
>> modified so that it deals with data generated from or interacts
>> directly with Bioconductor (i.e. maybe including some specialized
>> bioperl-run set of classes to run Bioconductor tasks, return
>> lightweight bioperl microarray classes)?  Allen pointed out in a
>> previous post that Bioconductor is the best pick for certain tasks,
>> while Perl excels at others:
>>
>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>>
>> Might be nice if we could merge both strengths together in some way.
>>
>> chris
>>
>> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
>>> On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
>>>> Eh, there is some discussion activity on the list, but not much.   
>>>> You
>>>> are really better off moving to Bioconductor.
>>>
>>> Ok, thanks. I added that to the wiki page:
>>>
>>>    http://www.bioperl.org/wiki/Microarray_package
>>>
>>> j
>>> seqlab.net
>>> http://www.bioperl.org/wiki/User:Jhannah
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> -- 
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From j-keller2 at md.northwestern.edu  Wed Apr 16 12:12:27 2008
From: j-keller2 at md.northwestern.edu (Jacob Keller)
Date: Wed, 16 Apr 2008 11:12:27 -0500
Subject: [Bioperl-l] Finding seqs of given domain architecture
In-Reply-To: <200804161336.16879.heikki@sanbi.ac.za>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net><D6030075-C999-464B-A998-3C69346C7FB0@jays.net><AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
Message-ID: <B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>

Hello All,

I am new to this list and not totally sure this is the right forum, so 
please forgive me if this is not the right place to ask the following question: 
I am seeking to get all sequences that have a given domain architecture, or 
at least that contain two given domains. I have thought of a few ways to do 
this.

1. Blast/Psi-blast for each domain, then compare the results for common 
sequences between the two lists, and fetch those. I would need to write a 
(simple) script to do this, but would prefer not to re-invent the wheel.

2. Search with a paradigm sequence of desired architecture/domain 
composition, somehow tweaking the psiblast parameters to find only matches 
over the whole search sequence, thereby finding both desired domains. I am 
not sure how to tweak blast to do this, though.

3. Pfam has this capability, i.e. to show all domains with a given 
architecture, but it is difficult to get at the actual sequences or even a 
list of accession numbers.

Does anybody have any suggestions as to how optimally to get these seq's?

Thanks for your consideration,

Jacob

*******************************************
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
Dallos Laboratory
F. Searle 1-240
2240 Campus Drive
Evanston IL 60208
lab: 847.491.2438
cel: 773.608.9185
email: j-keller2 at northwestern.edu
*******************************************

----- Original Message ----- 
From: "Heikki Lehvaslaiho" <heikki at sanbi.ac.za>
To: <bioperl-l at lists.open-bio.org>
Cc: <allenday at ucla.edu>; "Chris Fields" <cjfields at uiuc.edu>; "Jay Hannah" 
<jay at jays.net>; <bioperl-l at bioperl.org>
Sent: Wednesday, April 16, 2008 6:36 AM
Subject: Re: [Bioperl-l] bioperl-microarray: status?


> FYI,
>
> Christoper Jones has just published
> [http://bioinformatics.oxfordjournals.org/cgi/content/short/24/8/1102 an
> article in Bioinformatics] about his
> [http://search.cpan.org/perldoc?Microarray Microarray perl module] in 
> CPAN.
>
> (The text added into BioPerl wiki.)
>
> -Heikki
>
>
> On Friday 26 January 2007 16:05:01 Chris Fields wrote:
>> Don't know if it's worth it, but could the microarray package be
>> modified so that it deals with data generated from or interacts
>> directly with Bioconductor (i.e. maybe including some specialized
>> bioperl-run set of classes to run Bioconductor tasks, return
>> lightweight bioperl microarray classes)?  Allen pointed out in a
>> previous post that Bioconductor is the best pick for certain tasks,
>> while Perl excels at others:
>>
>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>>
>> Might be nice if we could merge both strengths together in some way.
>>
>> chris
>>
>> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
>> > On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
>> >> Eh, there is some discussion activity on the list, but not much.  You
>> >> are really better off moving to Bioconductor.
>> >
>> > Ok, thanks. I added that to the wiki page:
>> >
>> >     http://www.bioperl.org/wiki/Microarray_package
>> >
>> > j
>> > seqlab.net
>> > http://www.bioperl.org/wiki/User:Jhannah
>> >
>> > _______________________________________________
>> > Bioperl-l mailing list
>> > Bioperl-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> -- 
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From frederic.romagne at gmail.com  Wed Apr 16 13:25:18 2008
From: frederic.romagne at gmail.com (=?ISO-8859-1?Q?Fr=E9d=E9ric_Romagn=E9?=)
Date: Wed, 16 Apr 2008 12:25:18 -0500
Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
Message-ID: <1208366718.19084.15.camel@kiss-laptop>

Hello,
I wrote a program that uses Bio::Index::GenBank and tested it under
Unix, where it worked well.

But I have to run it under Windows, and there it does not seem to work.

Here is the problem:

my $dbobj = Bio::Index::Abstract->new("Data/$db");
my $seq = $dbobj->get_Seq_by_acc($id);
print $seq->display_id."\n";

does not print the same identifier as $id! So I am not working on the
expected sequence...

I use the SVN sources on Unix and the Perl Package Manager for
Windows...

Thanks.


From cjfields at uiuc.edu  Wed Apr 16 13:52:59 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 16 Apr 2008 12:52:59 -0500
Subject: [Bioperl-l] Finding seqs of given domain architecture
In-Reply-To: <B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net><D6030075-C999-464B-A998-3C69346C7FB0@jays.net><AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
	<B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
Message-ID: <BAA878A0-94B4-481F-B01C-A12086FD41E3@uiuc.edu>

You can try CDART:

http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi?cmd=rps

There are probably other tools out there as well.

If you want to roll your own, you can use bioperl wrappers for all of  
these (Bio::Tools::Run::StandAloneBlast is in bioperl-live,  
Bio::Tools::Run::Hmmer in bioperl-run), tweaking the parameters as you  
see fit, and either parse while running them or store the file for  
parsing later using Bio::SearchIO.  Personally, I wouldn't go with (2)  
unless you are absolutely sure the domains are found only once per  
sequence, are spatially conserved, and don't overlap.  For instance,  
with many proteins you could have a domain structure like dom1-dom2,  
dom2-dom1, dom1-dom1-dom2, etc.
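
For approach (1), once each search has produced a list of hit accessions, finding the overlap is just a hash lookup. A minimal sketch in plain Perl, assuming the hit IDs have already been pulled out of the two reports (the helper name common_hits and the accessions are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: return the IDs found in both hit lists, in the order
# they appear in the first list, without duplicates.
sub common_hits {
    my ($hits_a, $hits_b) = @_;
    my %in_b = map { $_ => 1 } @$hits_b;
    my %seen;
    return grep { $in_b{$_} && !$seen{$_}++ } @$hits_a;
}

# Made-up accessions standing in for the hits of two separate searches.
my @dom1_hits = qw(P00001 P00002 P00003 P00002);
my @dom2_hits = qw(P00003 P00004 P00001);
print join(',', common_hits(\@dom1_hits, \@dom2_hits)), "\n";
```

This only says that both domains hit the same sequence; it says nothing about their order or copy number.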

If you just want accessions from Pfam's Stockholm format (which are  
UniProt, I believe) you can get at accessions using  
Bio::AlignIO::stockholm (using perl 5.10):

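use strict;
use warnings;
use feature 'say';
use Bio::AlignIO;

# Path to a Stockholm-format alignment file (e.g. downloaded from Pfam)
my $file = shift || die "Must pass file as argument\n";

my $in = Bio::AlignIO->new(-format => 'stockholm',
                            -file => $file);

# Print the accessions of each alignment as one comma-separated line
while (my $aln = $in->next_aln) {
     my @accs;
     for my $seq ($aln->each_seq) {
         push @accs, $seq->accession_number;
     }
     say join(',', @accs);
}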

chris

On Apr 16, 2008, at 11:12 AM, Jacob Keller wrote:

> Hello All,
>
> I am new to this list, so am not totally sure this is the right  
> forum, so please forgive if this is not the right place to asl the  
> following question: I am seeking to get all sequences that have a  
> given domain architecture, or at least that contain two given  
> domains. I have thought of a few ways to do this.
>
> 1. Blast/Psi-blast for each domain, then compare the results for  
> common sequences between the two lists, and fetch those. I would  
> need to write a (simple) script to do this, but would prefer not to  
> re-invent the wheel.
>
> 2. Search with a paradigm sequence of desired architecture/domain  
> composition, somehow tweaking the psiblast parameters to find only  
> matches over the whole search sequence, thereby finding both desired  
> domains. I am not sure how to tweak blast to do this, though.
>
> 3. Pfam has this capability, i.e. to show all domains with a given  
> architecture, but it is difficult to get at the actual sequences or  
> even a list of accession numbers.
>
> Does anybody have any suggestions as to how optimally to get these  
> seq's?
>
> Thanks for your consideration,
>
> Jacob
>
> *******************************************
> Jacob Pearson Keller
> Northwestern University
> Medical Scientist Training Program
> Dallos Laboratory
> F. Searle 1-240
> 2240 Campus Drive
> Evanston IL 60208
> lab: 847.491.2438
> cel: 773.608.9185
> email: j-keller2 at northwestern.edu
> *******************************************
>
> ----- Original Message ----- From: "Heikki Lehvaslaiho" <heikki at sanbi.ac.za 
> >
> To: <bioperl-l at lists.open-bio.org>
> Cc: <allenday at ucla.edu>; "Chris Fields" <cjfields at uiuc.edu>; "Jay  
> Hannah" <jay at jays.net>; <bioperl-l at bioperl.org>
> Sent: Wednesday, April 16, 2008 6:36 AM
> Subject: Re: [Bioperl-l] bioperl-microarray: status?
>
>
>> FYI,
>>
>> Christoper Jones has just published
>> [http://bioinformatics.oxfordjournals.org/cgi/content/short/ 
>> 24/8/1102 an
>> article in Bioinformatics] about his
>> [http://search.cpan.org/perldoc?Microarray Microarray perl module]  
>> in CPAN.
>>
>> (The text added into BioPerl wiki.)
>>
>> -Heikki
>>
>>
>> On Friday 26 January 2007 16:05:01 Chris Fields wrote:
>>> Don't know if it's worth it, but could the microarray package be
>>> modified so that it deals with data generated from or interacts
>>> directly with Bioconductor (i.e. maybe including some specialized
>>> bioperl-run set of classes to run Bioconductor tasks, return
>>> lightweight bioperl microarray classes)?  Allen pointed out in a
>>> previous post that Bioconductor is the best pick for certain tasks,
>>> while Perl excels at others:
>>>
>>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>>>
>>> Might be nice if we could merge both strengths together in some way.
>>>
>>> chris
>>>
>>> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
>>> > On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
>>> >> Eh, there is some discussion activity on the list, but not  
>>> much.  You
>>> >> are really better off moving to Bioconductor.
>>> >
>>> > Ok, thanks. I added that to the wiki page:
>>> >
>>> >     http://www.bioperl.org/wiki/Microarray_package
>>> >
>>> > j
>>> > seqlab.net
>>> > http://www.bioperl.org/wiki/User:Jhannah
>>> >
>>> > _______________________________________________
>>> > Bioperl-l mailing list
>>> > Bioperl-l at lists.open-bio.org
>>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>> Christopher Fields
>>> Postdoctoral Researcher
>>> Lab of Dr. Robert Switzer
>>> Dept of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>
>> -- 
>> ______ _/      _/ 
>> _____________________________________________________
>>     _/      _/
>>    _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>>   _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>>  _/  _/  _/  SANBI, South African National Bioinformatics Institute
>> _/  _/  _/  University of Western Cape, South Africa
>>    _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
>> ___ _/_/_/_/_/ 
>> ________________________________________________________
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From David.Messina at sbc.su.se  Wed Apr 16 14:23:27 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 16 Apr 2008 20:23:27 +0200
Subject: [Bioperl-l] Finding seqs of given domain architecture
In-Reply-To: <B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
	<B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
Message-ID: <628aabb70804161123s453bd96bqd2213b938dfdb3a2@mail.gmail.com>

Hey Jacob,

This forum is mostly geared toward the BioPerl software package rather than
general bioinformatics assistance.

That being said, I would recommend using Pfam's Sequence Search to determine
the domain content of your sequences and then simply looking at those which
have the same two domains of interest.

If there are more sequences matching this criterion than can be examined
manually, you could write up something (potentially using BioPerl) to then
look at the relative order and number of those domains in your sequences.
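
As a rough sketch of what that could look like in plain Perl: the record layout below (sequence ID, domain name, start coordinate) is an assumption for illustration, not the output format of any particular Pfam tool, and the helper name architectures is made up.

```perl
use strict;
use warnings;

# Hypothetical input: one record per domain hit, with sequence ID, domain
# name, and start coordinate (for example, parsed from a domain-search report).
sub architectures {
    my (@hits) = @_;

    # Group the hits by sequence.
    my %by_seq;
    push @{ $by_seq{ $_->{seq} } }, $_ for @hits;

    # Order each sequence's domains by start position and join their names.
    my %arch;
    for my $seq (keys %by_seq) {
        my @ordered = sort { $a->{start} <=> $b->{start} } @{ $by_seq{$seq} };
        $arch{$seq} = join '-', map { $_->{domain} } @ordered;
    }
    return \%arch;
}

my $arch = architectures(
    { seq => 'seqA', domain => 'dom2', start => 150 },
    { seq => 'seqA', domain => 'dom1', start => 10 },
    { seq => 'seqB', domain => 'dom1', start => 5 },
    { seq => 'seqB', domain => 'dom1', start => 90 },
    { seq => 'seqB', domain => 'dom2', start => 200 },
);
print "$_\t$arch->{$_}\n" for sort keys %$arch;
```

Sequences can then be filtered by comparing the resulting string against the architecture of interest, e.g. 'dom1-dom2'.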

However, if these sequences have UniProt IDs, you can start with the domains
and Pfam will hand you a list of all the UniProt seqs having those domains.
On the Pfam website's main page, click on "Help" (right side of menu at the
top of the page) and then "Tools and Services" (left side menu).


Dave

From Russell.Smithies at agresearch.co.nz  Wed Apr 16 16:49:49 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 17 Apr 2008 08:49:49 +1200
Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
In-Reply-To: <1208366718.19084.15.camel@kiss-laptop>
References: <1208366718.19084.15.camel@kiss-laptop>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>

Did you check the format of your input file?
i.e. DOS or UNIX line endings?
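
A quick way to check, as a sketch in plain Perl (the helper name line_ending_style is made up, and only the first 64 KB of the file is inspected):

```perl
use strict;
use warnings;

# Hypothetical helper: classify the line endings found in a chunk of text.
sub line_ending_style {
    my ($text) = @_;
    my $crlf = () = $text =~ /\r\n/g;        # count of CRLF pairs
    my $lf   = () = $text =~ /(?<!\r)\n/g;   # count of bare LF
    return 'mixed' if $crlf && $lf;
    return 'dos'   if $crlf;
    return 'unix'  if $lf;
    return 'none';
}

# Read in raw mode so Perl on Windows does not translate CRLF to LF.
my $file = shift @ARGV;
if (defined $file) {
    open my $fh, '<:raw', $file or die "Cannot open $file: $!\n";
    read $fh, my $chunk, 65536;
    $chunk = '' unless defined $chunk;
    close $fh;
    print "$file: ", line_ending_style($chunk), "\n";
}
```

Run it against both the sequence file and the file the IDs are read from.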

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org]
> On Behalf Of Frédéric Romagné
> Sent: Thursday, 17 April 2008 5:25 a.m.
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
> 
> Hello,
> I wrote a program which uses Bio::Index::GenBank and tested it under
> Unix, where it worked well.
> 
> But I have to run it under Windows, and there it does not seem to work.
> 
> Here is the problem:
> 
> my $dbobj = Bio::Index::Abstract->new("Data/$db");
> my $seq = $dbobj->get_Seq_by_acc($id);
> print $seq->display_id."\n";
> 
> does not print the same number as $id! So I am not working on the
> expected sequence...
> 
> I use the SVN sources on Unix and the Perl Package Manager for
> Windows...
> 
> Thanks.



From frederic.romagne at gmail.com  Wed Apr 16 17:39:07 2008
From: frederic.romagne at gmail.com (=?ISO-8859-1?Q?Fr=E9d=E9ric_Romagn=E9?=)
Date: Wed, 16 Apr 2008 16:39:07 -0500
Subject: [Bioperl-l] index::abstract on win and unix
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
References: <1208366718.19084.15.camel@kiss-laptop>
	<D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
Message-ID: <1208381947.16620.6.camel@kiss-laptop>

Well, if by input file you mean the database used, it was created
with Bio::Index::GenBank from a GenBank file from NCBI's FTP site.

$id is an accession number read from a file, but I chomp the line...

I am trying to install the svn version of bioperl under windows to see
if there is an improvement.
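
A minimal sketch of the line-ending check suggested above, in plain Perl. It assumes the accession list was saved with DOS (CRLF) line endings and is read where "\n" is the line terminator, in which case chomp leaves a "\r" attached to $id. Whether that is the cause here is not established in this thread. The helper name clean_id and the accession AB000001 are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: strip any trailing CR and/or LF, then surrounding
# whitespace, from a line read from an accession list.
sub clean_id {
    my ($line) = @_;
    $line =~ s/[\r\n]+\z//;      # handles "\n", "\r\n" and a bare "\r"
    $line =~ s/^\s+|\s+$//g;     # drop leading/trailing blanks
    return $line;
}

# A line as it looks when a CRLF file is read with "\n" as terminator.
my $raw = "AB000001\r\n";

(my $chomped = $raw) =~ s/\n\z//;   # what chomp removes when $/ is "\n"
my $id = clean_id($raw);

printf "after chomp:    %d characters\n", length $chomped;
printf "after clean_id: %d characters\n", length $id;
```

If the two lengths differ, the identifier passed to get_Seq_by_acc carried an invisible trailing character.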

On Thursday, 17 April 2008 at 08:49 +1200, Smithies, Russell wrote:
> Did you check the format of your input file?
> i.e. DOS or UNIX line endings?


From hubert.gaynor at yahoo.com  Thu Apr 17 02:19:11 2008
From: hubert.gaynor at yahoo.com (Hubert Gaynor)
Date: Wed, 16 Apr 2008 23:19:11 -0700 (PDT)
Subject: [Bioperl-l] Can I use BLAST against a database like MySQL
Message-ID: <657734.41592.qm@web46008.mail.sp1.yahoo.com>

Hi,

As far as I know, the first thing to do before using BLAST for alignment is to run formatdb to construct a database. But I was wondering whether it is possible to build the database in MySQL instead, which would probably make the BLAST search faster and the database management much easier?

Thanks!

Hubert.



From sdavis2 at mail.nih.gov  Thu Apr 17 06:36:32 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 17 Apr 2008 06:36:32 -0400
Subject: [Bioperl-l] Can I use BLAST against a database like MySQL
In-Reply-To: <657734.41592.qm@web46008.mail.sp1.yahoo.com>
References: <657734.41592.qm@web46008.mail.sp1.yahoo.com>
Message-ID: <264855a00804170336o2a2bcff9xfcb05a33bac4c8dc@mail.gmail.com>

On Thu, Apr 17, 2008 at 2:19 AM, Hubert Gaynor <hubert.gaynor at yahoo.com> wrote:
> Hi,
>
>  As far as I know, before using BLAST to do the alignment the first thing should be done is typing formatdb to construct a database. But I was wondering whether it is possible to construct a database with MySQL which probably will grant the BLAST search a higher speed and make the database management much easier?
>

formatdb is used to make a representation that can be used efficiently
by blast.  That representation already makes blast faster.  MySQL
can't be used for such things.  As for speeding up blast, if you have a
multiprocessor machine, you can take advantage of it by increasing the
number of processors blast uses.  Also, while blast is a very
versatile program, it is not the only alignment program available.
Depending on your needs, you could look at other programs such as blat
or gmap that can be 2-3 orders of magnitude faster than blast.
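
A small sketch of the two steps described above, assuming the legacy NCBI tools (formatdb and blastall) and a nucleotide database. The file names and the processor count are placeholders. The script only assembles and prints the command lines, so it runs without BLAST installed.

```perl
use strict;
use warnings;

# Hypothetical file names and CPU count; replace with your own.
my $db_fasta = 'mydb.fasta';
my $query    = 'query.fasta';
my $cpus     = 4;

# formatdb: -i is the input FASTA file, -p F means nucleotide (T = protein).
my @formatdb = ('formatdb', '-i', $db_fasta, '-p', 'F');

# blastall: -a sets the number of processors to use.
my @blastall = ('blastall', '-p', 'blastn', '-d', $db_fasta,
                '-i', $query, '-a', $cpus, '-o', 'query.blastn.out');

# Print the commands; to actually run them, pass each list to system().
print join(' ', @formatdb), "\n";
print join(' ', @blastall), "\n";
```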

Sean

From stefan.kirov at bms.com  Thu Apr 17 09:40:29 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 09:40:29 -0400
Subject: [Bioperl-l] bioperl-db woes
Message-ID: <4807534D.80105@bms.com>

I'm having problems passing all the tests for bioperl-db. There are 2
distinct errors. The first one:
Can't locate Bio/DB/BioSQL/RichSeqAdaptor.pm
   ***Which, by the way, is embedded deep inside several layers of eval, so
the actual error I get from the test is:
    ***t/04swiss.........ok 3/52Can't locate object method "get_dbxrefs"
via package "Bio::Ontology::Term" at
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
line 552, <GEN0> line 78.
       or
       ------------- EXCEPTION: Bio::Root::Exception -------------

    MSG: Annotation of class Bio::Annotation::Collection not
    type-mapped. Internal error?
    STACK: Error::throw
    STACK: Bio::Root::Root::throw
    /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
    STACK:
    Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
    STACK: Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
    STACK: Bio::DB::Persistent::PersistentObject::store
    Bio/DB/Persistent/PersistentObject.pm:271
    STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
    Bio/DB/BioSQL/SeqAdaptor.pm:224
    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
    STACK: Bio::DB::Persistent::PersistentObject::create
    Bio/DB/Persistent/PersistentObject.pm:244
    STACK: t/04swiss.t:36
    -----------------------------------------------------------

It turns out the adaptor is really not there???
My bioperl-db is from
dev.open-bio.org/home/svn-repositories/bioperl/bioperl-db/trunk
bioperl-db (revision 14661)
Is this module being deprecated (I am sure it is not), or is my download
incomplete?
The other problem was:
DBD::Oracle::st execute failed: ORA-02292: integrity constraint
(BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
ParamValues: :p1=9606] at
/home/kirovs/bioperl-db/blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
line 320.
not ok 76
# Test 76 got: <UNDEF> (t/02species.t at line 71)
I have not tried to debug this one....
Thanks!
Stefan

From stefan.kirov at bms.com  Thu Apr 17 10:18:30 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 10:18:30 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
Message-ID: <Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>


On Thu, 17 Apr 2008, Chris Fields wrote:

> The 'get_dbxrefs' problem looks related to recent changes I made when rolling 
> back the significant feature/annotation changes introduced just prior to the 
> 1.5 release, none which were fully implemented.  I can check that one out. 
> Odd though; these passed for me, but I'm using MySQL not oracle.
get_dbxrefs is not the problem; I think the error message is misleading:
kirovs at horta:~/bioperl-db> grep get_dbxrefs 
/home/kirovs/bioperl-live/Bio/Ontology/Term.pm
            get_dbxrefs() instead, which handles both strings and DBLink
                       "Use get_dbxrefs() instead");
     $self->get_dbxrefs($context);
=head2 get_dbxrefs
  Title   : get_dbxrefs()
  Usage   : @ds = $term->get_dbxrefs();
sub get_dbxrefs {
} # get_dbxrefs
     my @old = $self->get_dbxrefs($context);
sub each_dblink {shift->throw("use of each_dblink() is deprecated; use 
get_dbxrefs() instead")}

So it is there.
In any case I debugged and tracked that down to the RichSeq adaptor module 
missing. It is not in the distro I downloaded, so I think this is my 
problem. It is a different question why...
I looked at different repos (SVN, CVS, trunk, different tags) and I did 
not see RichSeq.pm. I am not sure what is going on. Perhaps Hilmar will be 
able to help when he is around.
Thanks for the help Chris.... 
Stefan

>
> You may want to make sure you are using bioperl-live and that there isn't an 
> older bioperl installation getting into the mix.
>
> chris

From cjfields at uiuc.edu  Thu Apr 17 09:59:57 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 17 Apr 2008 08:59:57 -0500
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <4807534D.80105@bms.com>
References: <4807534D.80105@bms.com>
Message-ID: <82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>

The 'get_dbxrefs' problem looks related to recent changes I made when
rolling back the significant feature/annotation changes introduced
just prior to the 1.5 release, none of which were fully implemented.  I
can check that one out.  Odd though; these passed for me, but I'm
using MySQL, not Oracle.

You may want to make sure you are using bioperl-live and that there  
isn't an older bioperl installation getting into the mix.
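
One way to check which copy of a module Perl actually picks up is to look at %INC after loading it and to ask the class whether it can do the method in question. A minimal sketch follows. The helper name module_info is made up, and a core module stands in for Bio::Ontology::Term so that the snippet runs without BioPerl installed.

```perl
use strict;
use warnings;

# Report where a module was loaded from and whether it provides a method.
# Returns (path, 1 or 0); path is undef if the module cannot be loaded.
sub module_info {
    my ($module, $method) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return (undef, 0) unless eval { require $file; 1 };
    return ($INC{$file}, $module->can($method) ? 1 : 0);
}

# Demonstrated with a core module so the sketch runs anywhere; for this
# thread one would pass ('Bio::Ontology::Term', 'get_dbxrefs') instead.
my ($path, $has) = module_info('File::Spec', 'catfile');
print "File::Spec loaded from $path; catfile available: $has\n";

my ($missing) = module_info('No::Such::Module::Here', 'foo');
print "missing module path is ", defined $missing ? $missing : 'undef', "\n";
```

If the reported path points into an installed BioPerl rather than the bioperl-live checkout, an older installation is shadowing the newer code.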

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From stefan.kirov at bms.com  Thu Apr 17 10:52:32 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 10:52:32 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <9ECDEB39-95F3-4A94-9AF7-FFEBBDEFF0FA@gmx.net>
References: <4807534D.80105@bms.com>
	<9ECDEB39-95F3-4A94-9AF7-FFEBBDEFF0FA@gmx.net>
Message-ID: <Pine.WNT.4.64.0804171052070.2732@A161887.one.ads.bms.com>

That is correct and I assumed I should not be concerned with this error.
Thanks
Stefan

On Thu, 17 Apr 2008, Hilmar Lapp wrote:

>
> On Apr 17, 2008, at 9:40 AM, Stefan Kirov wrote:
>> The other problem was:
>> DBD::Oracle::st execute failed: ORA-02292: integrity constraint
>> (BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
>> OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
>> ParamValues: :p1=9606] at
>
>
> This sounds like you are running the tests against a non-empty database?

From hlapp at gmx.net  Thu Apr 17 10:47:58 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 17 Apr 2008 10:47:58 -0400
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
Message-ID: <2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>


On Apr 17, 2008, at 10:18 AM, Stefan Kirov wrote:
> In any case I debugged and tracked that down to the RichSeq adaptor  
> module missing.


That almost can't be the problem. Every Bio::Seq::RichSeq is-a  
Bio::Seq and a SeqAdaptor is present.

I'm afraid it gets stuck somewhere else and frankly I didn't see the  
RichSeqAdaptor failing to load in your stack trace:

>        ------------- EXCEPTION: Bio::Root::Exception -------------
>
>     MSG: Annotation of class Bio::Annotation::Collection not
>     type-mapped. Internal error?
>     STACK: Error::throw
>     STACK: Bio::Root::Root::throw
>     /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>     STACK:
>     Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>     Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>     STACK:  
> Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>     Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>     STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>     Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>     STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>     Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>     STACK: Bio::DB::Persistent::PersistentObject::store
>     Bio/DB/Persistent/PersistentObject.pm:271
>     STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>     Bio/DB/BioSQL/SeqAdaptor.pm:224
>     STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>     Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>     STACK: Bio::DB::Persistent::PersistentObject::create
>     Bio/DB/Persistent/PersistentObject.pm:244
>     STACK: t/04swiss.t:36
>     -----------------------------------------------------------

What that tells me is that when bioperl-db tries to store the  
annotation bundle of the (SwissProt) sequence, one of the annotations  
that it encounters is of type Bio::Annotation::Collection. At present  
bioperl-db doesn't know what to do with it; i.e., bioperl-db can't  
yet handle hierarchical annotation collections (collections within  
collections).

I believe this is due to recent changes in how the GN line is parsed
in BioPerl - Chris, does this ring the right bell? I thought, though, that
you had built in a method that would allow flattening this out?

It's worth noting that BioSQL itself can't really represent nested  
annotation collections other than by using ontology terms and their  
hierarchy, which at present I think isn't really appropriate, but I  
have to think through the issue more. In other words, in BioSQL you  
can't directly tie together a bunch of qualifier value pairs into a  
"bag" and then nest this bag within another. The way to make this  
work with the current schema is to flatten out the nesting.
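
As a rough illustration of what flattening the nesting means, here is a sketch on plain Perl data. It is not how bioperl-db or the BioPerl annotation classes implement it. The function name flatten, the dotted key convention and the sample keys are all made up for the example.

```perl
use strict;
use warnings;

# Flatten a nested hash of annotations into a list of [key, value] pairs,
# joining the nesting levels into one key with a separator.
sub flatten {
    my ($node, $prefix, $sep) = @_;
    $sep = '.' unless defined $sep;
    my @pairs;
    for my $key (sort keys %$node) {
        my $full = defined $prefix ? "$prefix$sep$key" : $key;
        my $val  = $node->{$key};
        if (ref $val eq 'HASH') {
            push @pairs, flatten($val, $full, $sep);   # recurse into the nested "bag"
        }
        elsif (ref $val eq 'ARRAY') {
            push @pairs, map { [$full, $_] } @$val;    # one pair per value
        }
        else {
            push @pairs, [$full, $val];
        }
    }
    return @pairs;
}

# Made-up example loosely modelled on a gene-name annotation.
my %annotation = (
    gene_name => {
        Name     => 'abcA',
        Synonyms => ['abc1', 'abc2'],
    },
    comment => 'example only',
);

print "$_->[0]\t$_->[1]\n" for flatten(\%annotation);
```

Each resulting pair is a simple qualifier/value, which is the shape a flat schema can hold; the nesting survives only in the composed key.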

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Thu Apr 17 10:48:52 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 17 Apr 2008 10:48:52 -0400
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <4807534D.80105@bms.com>
References: <4807534D.80105@bms.com>
Message-ID: <9ECDEB39-95F3-4A94-9AF7-FFEBBDEFF0FA@gmx.net>


On Apr 17, 2008, at 9:40 AM, Stefan Kirov wrote:
> The other problem was:
> DBD::Oracle::st execute failed: ORA-02292: integrity constraint
> (BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
> OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
> ParamValues: :p1=9606] at


This sounds like you are running the tests against a non-empty database?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From stefan.kirov at bms.com  Thu Apr 17 11:28:42 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 11:28:42 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
Message-ID: <Pine.WNT.4.64.0804171052430.2732@A161887.one.ads.bms.com>

Hilmar,
I think I saw what happens with this adaptor:
In Bio::DB::BioSQL::DBAdaptor::_load_object_adaptor (called from
create_persistent) there is a request to load this module:
Bio/DB/BioSQL/RichSeqAdaptor.pm
There is no such module... This always fails, but since it is evaled,
no actual error is reported. Perhaps this is a leftover...?
This got me fooled...
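
To illustrate the pattern (this is not the actual bioperl-db code; the class names, the helper try_load and the DEBUG_LOAD switch are invented for the example): a require wrapped in eval fails silently unless the caller looks at $@.

```perl
use strict;
use warnings;

# Try to load a class by name. Returns (1, '') on success or (0, message)
# on failure, so the caller can decide whether to report the message.
sub try_load {
    my ($class) = @_;
    (my $file = "$class.pm") =~ s{::}{/}g;
    my $ok  = eval { require $file; 1 };
    my $err = $ok ? '' : ($@ || 'unknown error');
    return ($ok ? 1 : 0, $err);
}

# Hypothetical candidate list: the first name does not exist, the second
# is a core module standing in for an adaptor that is present.
my @candidates = ('My::Hypothetical::RichSeqAdaptor', 'File::Basename');

for my $class (@candidates) {
    my ($ok, $err) = try_load($class);
    if ($ok) {
        print "loaded $class\n";
        last;
    }
    # Without this line the failed attempt would be completely silent.
    warn "could not load $class: $err" if $ENV{DEBUG_LOAD};
}
```

Falling back from a more specific class to a more general one this way is harmless in itself, but the swallowed "Can't locate ..." message can look like the real problem when one steps through in a debugger.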

I guess Chris could be right-
  Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key is 
being passed Bio::Annotation::Collection as a value for $obj->obj(). Or 
recursing too far?
Anyway, I am just guessing here- I do not know the architecture of 
bioperl-db...
Thanks again for the help...
Stefan


From cjfields at uiuc.edu  Thu Apr 17 12:26:41 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 17 Apr 2008 11:26:41 -0500
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
Message-ID: <AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>


On Apr 17, 2008, at 9:47 AM, Hilmar Lapp wrote:

> What that tells me is that when bioperl-db tries to store the  
> annotation bundle of the (SwissProt) sequence, one of the  
> annotations that it encounters is of type  
> Bio::Annotation::Collection. At present bioperl-db doesn't know what  
> to do with it; i.e., bioperl-db can't yet handle hierarchical  
> annotation collections (collections within collections).
>
> I believe this is due to recent changes in how the GN line is parsed  
> in BioPerl - Chris does this ring the right bell? I thought though  
> you had built in a method would allow flattening out

This appears to be using an older bioperl-live checkout, one where
Heikki changed GN parsing to use a nested Annotation::Collection.  I
reverted that in a later commit to svn, specifically because of
bioperl-db problems.  bioperl-live's swiss.pm now uses a new subclass
of Bio::Annotation::SimpleValue (Bio::Annotation::TagTree) that
represents nested values via Data::Stag's itext output (we can change
that to alternatives if needed).

Here are the last few relevant revisions in bioperl-live's main trunk  
(mine is the latest):

------------------------------------------------------------------------
r14562 | cjfields | 2008-02-28 08:30:05 -0600 (Thu, 28 Feb 2008) | 1 line

bug 1825: updating swiss.pm/tests to try out TagTree (passes all  
tests).  Need to update Handler.t and related modules still...
------------------------------------------------------------------------
r14541 | heikki | 2008-02-25 00:10:48 -0600 (Mon, 25 Feb 2008) | 1 line

documentation for the GN line parsing and management
------------------------------------------------------------------------
r14538 | heikki | 2008-02-23 08:48:23 -0600 (Sat, 23 Feb 2008) | 1 line

GN (Gene Name) line parsing rewrite. Breaks backward compatibility.  
Can now deal with >1 gene per entry and four categories of names per  
gene. Parses old style syntax (...OR ... OR ... ) into one gene name  
and synonyms for each gene. Docs to follow.

....

I just updated all code from dev and reran bioperl-db tests w/o  
problems.  Maybe someone else could do the same to see what happens?

> It's worth noting that BioSQL itself can't really represent nested  
> annotation collections other than by using ontology terms and their  
> hierarchy, which at present I think isn't really appropriate, but I  
> have to think through the issue more. In other words, in BioSQL you  
> can't directly tie together a bunch of qualifier value pairs into a  
> "bag" and then nest this bag within another. The way to make this  
> work with the current schema is to flatten out the nesting.

Might be worth looking into for a future BioSQL release, but we have a  
decent workaround in place for now, as long as it works cross-platform  
and cross-RDB.

chris


From stefan.kirov at bms.com  Thu Apr 17 12:40:14 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 12:40:14 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
	<AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
Message-ID: <Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>

Hilmar,
Sorry, I missed the part after the stack trace. In any case, this is
still a problem for me after I updated bioperl-live.
I see this with a number of other tests:
t/04swiss.........ok 3/52Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 78.
t/04swiss.........dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 6-52
         Failed 47/52 tests, 9.62% okay
t/05seqfeature....ok 4/48Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 72.
t/05seqfeature....FAILED tests 9-48
         Failed 40/48 tests, 16.67% okay
t/06comment.......ok
t/07dblink........ok
t/08genbank.......ok
t/09fuzzy2........ok
t/10ensembl.......ok 1/15Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 1420.
t/10ensembl.......dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 3-15
         Failed 13/15 tests, 13.33% okay
t/11locuslink.....ok 4/110Can't locate object method "get_dbxrefs" via 
package "Bio::Annotation::OntologyTerm" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 1.
t/11locuslink.....dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 5-110
         Failed 106/110 tests, 3.64% okay
t/12ontology......ok 1/738Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::GOterm" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 98.
t/12ontology......dubious
         Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 5-738
         Failed 734/738 tests, 0.54% okay
t/13remove........ok 2/59Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 145.
t/13remove........FAILED tests 11-59
         Failed 49/59 tests, 16.95% okay
t/14query.........ok
t/15cluster.......ok 3/160Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 1.
t/15cluster.......dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 6-160
         Failed 155/160 tests, 3.12% okay
t/16obda..........ok


From cjfields at uiuc.edu  Thu Apr 17 13:06:39 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 17 Apr 2008 12:06:39 -0500
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
	<AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
	<Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>
Message-ID: <C7A53063-2126-40E2-8A79-BED49D7FE98A@uiuc.edu>

Stefan,

'get_dbxrefs' was introduced in bioperl-live a while back during the  
feature/annotation rollback detailed here:

http://www.bioperl.org/wiki/Feature_Annotation_rollback

I still think an old bioperl (and maybe bioperl-db) installation is
interfering and causing the problems; I had similar issues at one
point and had to find and remove the old installation.  It might be
worth (1) checking 'perldoc -l Bio::Root::Root', which will give the
location of the Bio::Root::Root found in the lib path being used, and
(2) using './Build install uninst=1' to remove any old
bioperl/bioperl-db installations.

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From stefan.kirov at bms.com  Thu Apr 17 13:52:22 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 13:52:22 -0400
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <C7A53063-2126-40E2-8A79-BED49D7FE98A@uiuc.edu>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
	<AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
	<Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>
	<C7A53063-2126-40E2-8A79-BED49D7FE98A@uiuc.edu>
Message-ID: <48078E56.9000404@bms.com>

Chris Fields wrote:
> Stefan,
>
> 'get_dbxrefs' was introduced in bioperl-live a while back during the
> feature/annotation rollback detailed here:
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback
>
Chris was right!
> I still think this is an interfering old bioperl (and maybe
> bioperl-db) installation causing the problems; I had similar issues at
> one point and had to find and remove the old installation.  It might
> be worth (1) checking 'perldoc -l Bio::Root::Root',
This is the first thing I did, and it seemed fine from the command line.
So I checked out a fresh copy (rather than updating) and set PERL5LIB to
the minimum necessary (Build changes @INC), which is
/home/kirovs/bioperl-db/bplive:/stf/sysdev/perl/newlib/perl/lib/5.8/ia64-linux-multi/
(/home/kirovs/bioperl-db/bplive being the fresh copy and the other
having Module::Build, etc., but definitely no bioperl).
This fixed the problem. I still do not see where the old module came
from, but that was a really good guess.
Thanks
Stefan
> which will give the location of the Bio::Root::Root in lib path being
> used, and (2) using './Build install uninst=1' to remove any old
> bioperl/bioperl-db installations.
Unfortunately this is not an option for me.
>
> chris


From hubert.gaynor at yahoo.com  Thu Apr 17 20:53:16 2008
From: hubert.gaynor at yahoo.com (Hubert Gaynor)
Date: Thu, 17 Apr 2008 17:53:16 -0700 (PDT)
Subject: [Bioperl-l] Can I use BLAST against a database like MySQL
Message-ID: <130971.67684.qm@web46007.mail.sp1.yahoo.com>

Hi Sean,

I got it. Thank you so much!

Hubert

----- Original Message ----
From: Sean Davis <sdavis2 at mail.nih.gov>
To: Hubert Gaynor <hubert.gaynor at yahoo.com>
Sent: Thursday, April 17, 2008 6:36:02 PM
Subject: Re: [Bioperl-l] Can I use BLAST against a database like MySQL

On Thu, Apr 17, 2008 at 2:19 AM, Hubert Gaynor <hubert.gaynor at yahoo.com> wrote:
> Hi,
>
>  As far as I know, before using BLAST to do an alignment, the first
>  step is to run formatdb to build a database. But I was wondering
>  whether it is possible to build the database in MySQL instead, which
>  would probably make the BLAST search faster and database management
>  much easier?
>

formatdb is used to make a representation that can be used efficiently
by blast.  That representation already makes blast faster.  MySQL
can't be used for such things.  As for speeding up blast, if you have a
multiprocessor machine, you can take advantage of it by increasing the
number of processors blast uses.  Also, while blast is a very
versatile program, it is not the only alignment program available.
Depending on your needs, you could look at other programs such as blat
or gmap that can be 2-3 orders of magnitude faster than blast.

Sean



From Russell.Smithies at agresearch.co.nz  Thu Apr 17 21:39:23 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Fri, 18 Apr 2008 13:39:23 +1200
Subject: [Bioperl-l] accessing params for custom glyphs?
In-Reply-To: <130971.67684.qm@web46007.mail.sp1.yahoo.com>
References: <130971.67684.qm@web46007.mail.sp1.yahoo.com>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06C75E14@imail.agresearch.co.nz>

This is probably more of a Perl OO problem I'm having, but can anyone
tell me how to access a parameter when I create a custom glyph?

I've created a panel in the usual way then I add a feature with
'my_glyph' and want to pass the value of -new_parameter to the glyph
drawing code.

    $panel->add_track( $feature,
                       -font          => gdSmallFont,
                       -glyph         => 'my_glyph',
                       -height        => 10,
                       -label         => 1,
                       -strand        => "forward",
                       -new_parameter => "test",
                     );


In my_glyph.pm, I have the usual draw_component sub:

sub draw_component {
  my $self = shift;
  my $gd = shift;
  my ($x1,$y1,$x2,$y2) = $self->bounds(@_);
  my $fg = $self->fgcolor;
  # <<--- how do I access the value of "new_parameter" set in the
  #       panel drawing code?
  my $params = $self->??????????;

  $gd->line($x1,$y1,$x2,$y2,$fg);
  $gd->line($x1,$y2,$x2,$y1,$fg);

}

Any ideas?

Thanx,

Russell	Smithies			


From David.Messina at sbc.su.se  Fri Apr 18 05:31:59 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 18 Apr 2008 11:31:59 +0200
Subject: [Bioperl-l]  Finding seqs of given domain architecture
In-Reply-To: <628aabb70804170155n4e5dfd81r7020c3e9e11094ff@mail.gmail.com>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
	<B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
	<628aabb70804161112o6610ee1fkfb4b08e74730237d@mail.gmail.com>
	<1208420674.23342.15.camel@razor.sbc.su.se>
	<628aabb70804170155n4e5dfd81r7020c3e9e11094ff@mail.gmail.com>
Message-ID: <628aabb70804180231p2b9cef9dwd5441e85c31531fd@mail.gmail.com>

Jacob,

I talked about your question with a colleague of mine who has been working
in this area. Below is his reply.

[I'm reposting this *without* the attachment mentioned since the mailing
list wouldn't accept it otherwise. If anyone wants a copy of the code, just
email me.]

Dave

-------

> 3. Pfam has this capability, i.e. to show all domains with a given
> architecture, but it is difficult to get at the actual sequences or
> even a list of accession numbers.

First, this should be available right away in PfamAlyzer:

http://pfamalyzer.sbc.su.se/pfamalyzer/index.html

although you might need to upgrade your browser to Java 1.6 to get it to
work.

If this does not work as suggested (an upgraded version is coming
eventually), have a look at the file:

ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/swisspfam.gz

which contains the Pfam architectures for all UniProt sequences. You can
parse that to get a file of <accession number>-<list of domains>
correspondences and just filter that to get the accession numbers.
(Please find attached a Perl script to do just that.)

Under UNIX, you can then just grep this for the domain IDs,

(like grep PF00008 domainArchitectureFile.txt | grep PF00456 >
resultFile.txt)

but I am sure there are solutions under other operating systems as well.
You could then write a script to parse out the corresponding sequences
from the UniProt FASTA flat file, if you wanted, or (again under UNIX) a
script to fetch them from the web page with wget.
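
As a rough sketch of that filtering step in Perl (this is not the
attached script, which is not reproduced here): it assumes the parsed
file has one line per sequence, holding the accession number, a tab, and
a space-separated list of Pfam IDs. That layout, the sub name
accessions_with_domains and the sample accessions are assumptions made
for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the accessions whose domain list contains every requested Pfam ID.
# Assumed (hypothetical) input format, one record per line:
#   <accession><TAB><space-separated Pfam domain IDs>
sub accessions_with_domains {
    my ($lines, @wanted) = @_;
    my @hits;
    for my $line (@$lines) {
        chomp(my $copy = $line);
        my ($acc, $domains) = split /\t/, $copy, 2;
        next unless defined $acc && defined $domains;

        # Record each domain ID, dropping a version suffix if present
        # (PF00008.12 -> PF00008) so that plain IDs still match.
        my %have;
        for my $id (split ' ', $domains) {
            (my $bare = $id) =~ s/\.\d+$//;
            $have{$bare} = 1;
        }

        # Keep the accession only if none of the wanted IDs is missing.
        push @hits, $acc unless grep { !$have{$_} } @wanted;
    }
    return @hits;
}

# Small inline sample; the accessions are made up for illustration.
my @sample = (
    "ACC0001\tPF00008 PF00456\n",
    "ACC0002\tPF00008\n",
    "ACC0003\tPF00456.3 PF00008.7 PF00069\n",
);
print "$_\n" for accessions_with_domains(\@sample, 'PF00008', 'PF00456');
```

Unlike the chained grep, this compares whole domain IDs per record, and
the list of wanted domains can be of any length.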

In case your sequences are not in UniProt, consider using HMMER and the
Pfam HMM files to assign domains to all sequences in your dataset. I
would then parse the HMMER output into the same format as the above, and
use the same approach following that.

Hope this helps,

Yours sincerely,

Kristoffer Forslund
krifo at sbc.su.se

From lincoln.stein at gmail.com  Fri Apr 18 15:16:19 2008
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Fri, 18 Apr 2008 15:16:19 -0400
Subject: [Bioperl-l] [Gmod-gbrowse] accessing params for custom glyphs?
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06C75E14@imail.agresearch.co.nz>
References: <130971.67684.qm@web46007.mail.sp1.yahoo.com>
	<D5DBA313349A4B458528BE63B387F36C06C75E14@imail.agresearch.co.nz>
Message-ID: <6dce9a0b0804181216q6564e580u8a805ae96c78df2e@mail.gmail.com>

Hi Russell,

It's very simple:

   my $params = $self->option('new_parameter');
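
Put into the glyph from the question, that could look roughly like the
sketch below. It assumes the glyph lives in a package under
Bio::Graphics::Glyph and inherits from the generic glyph; the package
layout and the fallback value are assumptions, not something taken from
the original code.

```perl
package Bio::Graphics::Glyph::my_glyph;

use strict;
use warnings;
use base 'Bio::Graphics::Glyph::generic';   # assumed base class

sub draw_component {
    my $self = shift;
    my $gd   = shift;
    my ($x1, $y1, $x2, $y2) = $self->bounds(@_);
    my $fg = $self->fgcolor;

    # Track options are looked up by name, without the leading dash
    # used in add_track().
    my $param = $self->option('new_parameter');
    $param = 'default' unless defined $param;   # hypothetical fallback

    # Draw a cross over the feature, as in the original question.
    $gd->line($x1, $y1, $x2, $y2, $fg);
    $gd->line($x1, $y2, $x2, $y1, $fg);
}

1;
```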

Lincoln

On Thu, Apr 17, 2008 at 9:39 PM, Smithies, Russell <
Russell.Smithies at agresearch.co.nz> wrote:

> This is probably more of a Perl OO problem I'm having, but can anyone
> tell me how to access a parameter when I create a custom glyph?
>
> I've created a panel in the usual way then I add a feature with
> 'my_glyph' and want to pass the value of -new_parameter to the glyph
> drawing code.
>
>    $panel->add_track( $feature,
>                        -font => gdSmallFont,
>                        -glyph => 'my_glyph',
>                        -height => 10,
>                        -label  => 1,
>                        -strand => "forward",
>                        -new_parameter => "test",
>                      );
>
>
> In my_glyph.pm, I have the usual draw_component sub:
>
> sub draw_component {
>  my $self = shift;
>  my $gd = shift;
>  my ($x1,$y1,$x2,$y2) = $self->bounds(@_);
>  my $fg = $self->fgcolor;
>  my $params = $self->??????????   <<--- how do I access the value of
> "new_parameter" set in the panel drawing code?
>
>  $gd->line($x1,$y1,$x2,$y2,$fg);
>  $gd->line($x1,$y2,$x2,$y1,$fg);
>
> }
>
> Any ideas?
>
> Thanx,
>
> Russell Smithies


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From jason at bioperl.org  Fri Apr 18 22:35:10 2008
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 18 Apr 2008 19:35:10 -0700
Subject: [Bioperl-l] index::abstract on win and unix
In-Reply-To: <1208381947.16620.6.camel@kiss-laptop>
References: <1208366718.19084.15.camel@kiss-laptop>
	<D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
	<1208381947.16620.6.camel@kiss-laptop>
Message-ID: <A30B8E06-131C-445F-B692-92CAB845B13B@bioperl.org>

Do you want the LOCUS or the ACCESSION?
Do you mean the result is the completely wrong record, or just the
wrong field?
The accession number is available from the seq's accession_number() method.
-jason
On Apr 16, 2008, at 2:39 PM, Frédéric Romagné wrote:

> Well, if by input file you mean the database used, it was created
> with Bio::Index::GenBank from a GenBank file from the NCBI FTP site.
>
> $id is an accession number read from a file, but I chomp the line...
>
> I am trying to install the svn version of bioperl under Windows to see
> if there is an improvement.
>
> On Thursday 17 April 2008 at 08:49 +1200, Smithies, Russell wrote:
>> Did you check the format of your input file?
>> i.e. DOS or UNIX line endings?
>>
>>> -----Original Message-----
>>> From: bioperl-l-bounces at lists.open-bio.org
>>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Frédéric Romagné
>>> Sent: Thursday, 17 April 2008 5:25 a.m.
>>> To: bioperl-l at lists.open-bio.org
>>> Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
>>>
>>> Hello,
>>> I wrote a program which uses Bio::Index::GenBank and tested it under
>>> Unix, where it worked well.
>>>
>>> But I have to run it under Windows, and there it does not seem to work.
>>>
>>> Here is the problem:
>>>
>>> my $dbobj = Bio::Index::Abstract->new("Data/$db");
>>> my $seq = $dbobj->get_Seq_by_acc($id);
>>> print $seq->display_id."\n";
>>>
>>> does not print the same identifier as $id! So I am not working on the
>>> expected sequence...
>>>
>>> I use the SVN sources on Unix and the Perl Package Manager on
>>> Windows...
>>>
>>> Thanks.
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From bioperlanand at yahoo.com  Mon Apr 21 03:44:00 2008
From: bioperlanand at yahoo.com (Anand Venkatraman)
Date: Mon, 21 Apr 2008 00:44:00 -0700 (PDT)
Subject: [Bioperl-l] a question on obtaining HTML formatted Blast output
	along with the Blast hits image
Message-ID: <372845.37134.qm@web36808.mail.mud.yahoo.com>


 Hi everybody,

I would like to obtain an HTML formatted BLAST report along with a picture of the blast hits, as shown on Slide 60 in this PDF: http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf

I have gotten the HTML output working using "Bio::SearchIO::Writer::HTMLResultWriter".

My question: How do I integrate it with Bio::Graphics to render the blast hits image at the correct position in my BioPerl-reformatted HTML file?

I ultimately want to be able to display my blast output files in a browser.

Here is my code so far:
----------------------------------------------------------------
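#!/usr/bin/perl -w
# usage: $0 <blast_report>
use strict;
use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;

# $! is not set when the argument is missing, so print a usage message
my $infile = shift or die "usage: $0 <blast_report>\n";

# parser for the BLAST report
my $searchio = Bio::SearchIO->new( -format => 'blast', -file => $infile );
# writer that renders a result as HTML
my $writerhtml = Bio::SearchIO::Writer::HTMLResultWriter->new();
my $outhtml = Bio::SearchIO->new( -writer => $writerhtml,
                                  -file   => ">${infile}.html" );

# write the first result of the report to <blast_report>.html
$outhtml->write_result($searchio->next_result);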
----------------------------------------------------------------

Thanks in advance,

Anand



From cjfields at uiuc.edu  Mon Apr 21 11:07:17 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 21 Apr 2008 10:07:17 -0500
Subject: [Bioperl-l] [Proposed change] HSP::frame()
Message-ID: <ACE26E05-7C02-46E3-B973-E0529C0A0DEA@uiuc.edu>

I have noticed (in relation to bug 2485, http://bugzilla.open-bio.org/show_bug.cgi?id=2485)
that the Bio::Search::HSP::GenericHSP frame() method is implemented
very differently from strand(), start(), end(), and most other HSP
methods.  The current behavior is to return the query frame if one
value is passed; if no value is passed, it returns the subject frame
under scalar context and an array of two values (query and hit frame)
under list context.  The latter behavior is unfortunately leading to
the aforementioned bug.  The method is also implied to be a
getter/setter, but the implementation doesn't allow that; it always
sets to the instantiated values (in fact, repeatedly so).

In order to fix that and make the interface more consistent I am
changing frame() to behave like strand(), etc., in that the first
argument is 'query/subject/hit/list' (default = 'query' if no arg
specified) and the remaining arguments are optional values for
setting, in query/subject order.

One issue: I can catch and imitate most of the older behavior with a  
few additional checks, the one exception being the old frame() default  
return value which is now 'query' (not context-dependent).  If needed  
we can change the default to 'hit', but I believe method consistency  
is probably the better route, and I can always add a warning under old  
API circumstances indicating the change.

I am also modifying HSPTableWriter to print frame_hit and frame_query  
(previously it was only printing 'frame', which implied the hit  
frame).  I can see this being an issue with anyone expecting 'frame'  
instead of 'frame_hit';  I could hack in a fix for that if needed.

If there aren't any objections or suggestions, I'll commit this in the  
next day or two.

chris

From cjfields at uiuc.edu  Mon Apr 21 11:32:59 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 21 Apr 2008 10:32:59 -0500
Subject: [Bioperl-l] Assembly.t test fails
Message-ID: <ABC6AB22-0AFD-4977-97DD-E2AE507E0330@uiuc.edu>

I'm getting some significant test failures in bioperl-live for  
Bio::Assembly:

t/Assembly......
1..35
ok 1 - use Bio::Assembly::IO;
ok 2 - The object isa Bio::Assembly::IO
ok 3 - The object isa Bio::Assembly::Scaffold
ok 4
not ok 5
ok 6 - The object isa Bio::AnnotationCollectionI
ok 7 - no annotations in Annotation collection?
ok 8

#   Failed test at t/Assembly.t line 35.
#          got: 'NoName'
#     expected: 'test'
Can't locate object method "get_contig_seq_ids" via package
"Bio::Assembly::Contig" at
/Users/cjfields/bioperl/bioperl-live/blib/lib/Bio/Assembly/Scaffold.pm line 189, <GEN0> line 733.
# Looks like you planned 35 tests but only ran 8.
# Looks like you failed 1 test of 8 run.
# Looks like your test died just after 8.
  Dubious, test returned 255 (wstat 65280, 0xff00)
  Failed 28/35 subtests

Test Summary Report
-------------------
t/Assembly.t (Wstat: 65280 Tests: 8 Failed: 1)
   Failed test:  5
   Non-zero exit status: 255
   Parse errors: Bad plan.  You planned 35 tests but ran 8.
Files=1, Tests=8,  0 wallclock secs ( 0.01 usr  0.00 sys +  0.22 cusr  0.04 csys =  0.27 CPU)
Result: FAIL
Failed 1/1 test programs. 1/8 subtests failed.


chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Apr 21 11:44:21 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 21 Apr 2008 10:44:21 -0500
Subject: [Bioperl-l] Assembly.t test fails
In-Reply-To: <ABC6AB22-0AFD-4977-97DD-E2AE507E0330@uiuc.edu>
References: <ABC6AB22-0AFD-4977-97DD-E2AE507E0330@uiuc.edu>
Message-ID: <2F199628-717E-4F88-85D7-408BD7BBE16D@uiuc.edu>

Scratch that, figured it out (easy fix).

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From frederic.romagne at gmail.com  Mon Apr 21 11:53:11 2008
From: frederic.romagne at gmail.com (=?ISO-8859-1?Q?Fr=E9d=E9ric_Romagn=E9?=)
Date: Mon, 21 Apr 2008 10:53:11 -0500
Subject: [Bioperl-l] index::abstract on win and unix
In-Reply-To: <A30B8E06-131C-445F-B692-92CAB845B13B@bioperl.org>
References: <1208366718.19084.15.camel@kiss-laptop>
	<D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
	<1208381947.16620.6.camel@kiss-laptop>
	<A30B8E06-131C-445F-B692-92CAB845B13B@bioperl.org>
Message-ID: <1208793191.25906.9.camel@kiss-laptop>

In fact, I want the whole Bio::Seq object, but I verified that the
ACCESSION and the LOCUS are the same in my GenBank files.
I saw that the program sometimes reports that it cannot find the entry:

    if( !defined $seq ) {
        warn("Sequence $id in Database $db is not present\n");
    }

I suspect that it is the make_index function that does not work
properly on Windows, rather than the get_Seq_by_acc function...



From ewijaya at gmail.com  Tue Apr 22 10:03:07 2008
From: ewijaya at gmail.com (Edward Wijaya)
Date: Tue, 22 Apr 2008 22:03:07 +0800
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
Message-ID: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>

Hi,

Is there any module that can parse the following output
of BLAT? This is taken from the UCSC browser.

The idea is to parse it and then extract the conserved block
of aligned sequences.


__DATA__
Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
B D   D. melanogaster
tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
B D       D. simulans
tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
B D      D. sechellia
tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
B D         D. yakuba
tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
            D. erecta
tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
         D. ananassae
taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
     D. pseudoobscura
tata----ccagtacac-cttatatg------------tttttaaata--------------------
B D     D. persimilis
tata----ccagtacac-attatatg------------tttttaaata--------------------
        D. willistoni
aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
           D. virilis
-------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
        D. mojavensis
-------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
         D. grimshawi
====================================================================
         T. castaneum
====================================================================

Inserts between block 3 and 4 in window
    D. pseudoobscura 2008bp
B D    D. persimilis 1421bp
          D. virilis 5bp
       D. mojavensis 4640bp

Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
B D   D. melanogaster
----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
B D       D. simulans
----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
B D      D. sechellia
----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
B D         D. yakuba
----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
            D. erecta
----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
     D. pseudoobscura
====================================================================
B D     D. persimilis
====================================================================
        D. willistoni
----aggattacgaagttcctttat-------------------aaag--------------------
           D. virilis
gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
        D. mojavensis
====================================================================
         D. grimshawi
====================================================================
         T. castaneum
====================================================================

__END__

From cjfields at uiuc.edu  Tue Apr 22 10:22:45 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 22 Apr 2008 09:22:45 -0500
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
In-Reply-To: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
References: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
Message-ID: <766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>

A quick grep of bioperl-live gets me Bio::SearchIO::blast,  
Bio::SearchIO::axt, Bio::SearchIO::psl, Bio::Tools::Blat, and  
Bio::Tools::WebBlat.  Haven't looked at the docs but it's a start!

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Tue Apr 22 14:49:32 2008
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Apr 2008 11:49:32 -0700
Subject: [Bioperl-l] Fwd: [blast-announce] New BLAST URL available at the
	NCBI
References: <EEEED756EF6626469B10653F745014389BAEAD@NIHCESMLBX15.nih.gov>
Message-ID: <F63EB743-F1FF-4612-B7D6-0EA1F73F487C@bioperl.org>

Does anyone want to take a look at how to use these URLs in the  
RemoteBlast module, if the interface is the same?

-jason

Begin forwarded message:

> From: "Mcginnis, Scott (NIH/NLM/NCBI) [E]" <mcginnis at ncbi.nlm.nih.gov>
> Date: April 22, 2008 11:35:04 AM PDT
> To: <blast-announce at ncbi.nlm.nih.gov>
> Subject: [blast-announce] New BLAST URL available at the NCBI
>
> New BLAST URL available at the NCBI
>
> The NCBI has activated a new URL for BLAST searches at the NCBI:
> http://blast.ncbi.nlm.nih.gov.
>
> Searches sent to this URL can take advantage of a larger number of
> machines for searches and the system has a better overall fault
> tolerance.
>
> We recommend migration of all BLAST links and bookmarks (e.g.,
> http://www.ncbi.nlm.nih.gov/BLAST/ and
> http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) to the new URL.
>
> Links on the NCBI and BLAST home pages will start to change in the
> coming weeks.
>
> At this point in time the plans are to also maintain the current BLAST
> URL.


From jason at bioperl.org  Tue Apr 22 14:51:08 2008
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Apr 2008 11:51:08 -0700
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
In-Reply-To: <766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>
References: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
	<766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>
Message-ID: <6C812413-B375-427B-9AF8-5A0AA6167CC8@bioperl.org>

If you get it as axt it should parse fine in SearchIO, but that is
pairwise. If you can get alignment blocks instead, I can't remember
what format this is from UCSC.
MSAs are going to be better handled through Bio::AlignIO, though, so
it might be better to build a parser on that.
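
For the pasted browser text itself, a plain-Perl sketch (not a BioPerl
module, and not an AlignIO parser) might look like the following. It
assumes each block starts with an "Alignment block N of M" header, that
species names have the "D. melanogaster" shape, optionally preceded by
the browser's "B D" letters, and that every name line is followed by
exactly one sequence line. parse_blocks is a made-up name.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: turn UCSC browser alignment-block
# text into a list of { number, start, end, seqs => { species => row } }.
sub parse_blocks {
    my ($text) = @_;
    my (@blocks, $block, $species);
    for my $line (split /\n/, $text) {
        $line =~ s/^[>\s]+//;      # drop mail quoting and indentation
        $line =~ s/\s+$//;
        if ($line =~ /^Alignment block (\d+) of \d+ in window, (\d+) - (\d+), \d+ bps/) {
            $block = { number => $1, start => $2, end => $3, seqs => {} };
            push @blocks, $block;
            undef $species;
        }
        elsif ($line =~ /^Inserts between block/) {
            undef $block;          # insert summaries are skipped in this sketch
        }
        elsif ($block && $line =~ /^(?:B D\s+)?([A-Z]\. [a-z]+)$/) {
            $species = $1;         # a species name line
        }
        elsif ($block && defined $species && $line =~ /^[acgtnACGTN=-]+$/) {
            $block->{seqs}{$species} = $line;   # rows of '=' are stored as-is
            undef $species;
        }
    }
    return @blocks;
}

# Small demo on a fragment shaped like the pasted data
my $demo = <<'END';
Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
B D   D. melanogaster
tgtg----tattt
           D. erecta
tgtg----tgttt
END
for my $b (parse_blocks($demo)) {
    print "block $b->{number}: ", join(', ', sort keys %{ $b->{seqs} }), "\n";
}
```

Each block's rows could then be handed to whatever alignment object
you prefer; that step is left out because it depends on the parser
design chosen.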

On Apr 22, 2008, at 7:22 AM, Chris Fields wrote:

> A quick grep of bioperl-live gets me Bio::SearchIO::blast,  
> Bio::SearchIO::axt, Bio::SearchIO::psl, Bio::Tools::Blat, and  
> Bio::Tools::WebBlat.  Haven't looked at the docs but it's a start!
>
> chris
>
> On Apr 22, 2008, at 9:03 AM, Edward Wijaya wrote:
>
>> Hi,
>>
>> Is there any module that can parse the following output
>> of BLAT. This is taken from UCSC browser.
>>
>> The idea is to parse it and then extract the conserved block
>> of aligned sequences.
>>
>>
>> __DATA__
>> Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
>> B D   D. melanogaster
>> tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
>> B D       D. simulans
>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
>> B D      D. sechellia
>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
>> B D         D. yakuba
>> tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
>>            D. erecta
>> tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
>>         D. ananassae
>> taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
>>     D. pseudoobscura
>> tata----ccagtacac-cttatatg------------tttttaaata--------------------
>> B D     D. persimilis
>> tata----ccagtacac-attatatg------------tttttaaata--------------------
>>        D. willistoni
>> aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
>>           D. virilis
>> -------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
>>        D. mojavensis
>> -------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
>>         D. grimshawi
>> ====================================================================
>>         T. castaneum
>> ====================================================================
>>
>> Inserts between block 3 and 4 in window
>>    D. pseudoobscura 2008bp
>> B D    D. persimilis 1421bp
>>          D. virilis 5bp
>>       D. mojavensis 4640bp
>>
>> Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
>> B D   D. melanogaster
>> ----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
>> B D       D. simulans
>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>> B D      D. sechellia
>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>> B D         D. yakuba
>> ----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
>>            D. erecta
>> ----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
>>     D. pseudoobscura
>> ====================================================================
>> B D     D. persimilis
>> ====================================================================
>>        D. willistoni
>> ----aggattacgaagttcctttat-------------------aaag--------------------
>>           D. virilis
>> gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
>>        D. mojavensis
>> ====================================================================
>>         D. grimshawi
>> ====================================================================
>>         T. castaneum
>> ====================================================================
>>
>> __ END__
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>


From cjfields at uiuc.edu  Tue Apr 22 15:02:14 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 22 Apr 2008 14:02:14 -0500
Subject: [Bioperl-l] Fwd: [blast-announce] New BLAST URL available at
	the NCBI
In-Reply-To: <F63EB743-F1FF-4612-B7D6-0EA1F73F487C@bioperl.org>
References: <EEEED756EF6626469B10653F745014389BAEAD@NIHCESMLBX15.nih.gov>
	<F63EB743-F1FF-4612-B7D6-0EA1F73F487C@bioperl.org>
Message-ID: <13C2AD96-8297-40DD-ADCC-B2BEC923B9E0@uiuc.edu>

They work exactly the same as the old URL, at least on the surface; I
haven't tried changing many URLAPI parameters.  I went ahead and
changed the URL in RemoteBlast to
http://blast.ncbi.nlm.nih.gov/Blast.cgi
as it works with RemoteBlast.t.

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Apr 22 14:58:40 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 22 Apr 2008 13:58:40 -0500
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
In-Reply-To: <6C812413-B375-427B-9AF8-5A0AA6167CC8@bioperl.org>
References: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
	<766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>
	<6C812413-B375-427B-9AF8-5A0AA6167CC8@bioperl.org>
Message-ID: <43344C89-6B4D-4360-AF56-A6FDD065FFF3@uiuc.edu>

Related to that, I have thought about building a parser for some of  
the query-anchored alignments produced by blastall, just haven't had  
time to devote to it.  One of these days...

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bioperlanand at yahoo.com  Wed Apr 23 02:02:30 2008
From: bioperlanand at yahoo.com (Anand Venkatraman)
Date: Tue, 22 Apr 2008 23:02:30 -0700 (PDT)
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
Message-ID: <946658.12337.qm@web36802.mail.mud.yahoo.com>

Hi everybody,

I would like to use Bio::Graphics in conjunction with Bio::SearchIO::Writer::HTMLResultWriter to obtain a HTML formatted blast report output along with an image of the blast hits as shown on Slide 60 in this pdf: http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf

I am able to get the HTML output using  "Bio::SearchIO::Writer::HTMLResultWriter" and I am able to get the image using the examples outlined in the Bio::Graphics HOWTO: http://www.bioperl.org/wiki/HOWTO:Graphics

My question: how do I integrate Bio::Graphics with Bio::SearchIO::Writer::HTMLResultWriter so that the image of the blast hits is rendered at the correct position in my BioPerl-reformatted HTML file?

I also found that someone else asked a similar question; it is listed under the "Orphans, Leftovers" category in the ListSummary:April 26-May 9,2006 document: 
http://www.bioperl.org/wiki/ListSummary:April_26-May_9%2C2006#Orphans.2C_Leftovers

Here is my code so far:
----------------------------------------------------------------
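#!/usr/bin/perl
# usage: $0 <blast_report>
use strict;
use warnings;
use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;

# The BLAST report to convert is the first command-line argument
my $infile = shift or die "usage: $0 <blast_report>\n";

# Reader for the BLAST report, and a writer that renders results as HTML
my $searchio   = Bio::SearchIO->new(-format => 'blast', -file => $infile);
my $writerhtml = Bio::SearchIO::Writer::HTMLResultWriter->new();
my $outhtml    = Bio::SearchIO->new(-writer => $writerhtml,
                                    -file   => ">${infile}.html");

# Write the first result in the report to <blast_report>.html
$outhtml->write_result($searchio->next_result);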
----------------------------------------------------------------

Thanks in advance,

Anand

       

From jason at bioperl.org  Wed Apr 23 02:15:28 2008
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Apr 2008 23:15:28 -0700
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
In-Reply-To: <946658.12337.qm@web36802.mail.mud.yahoo.com>
References: <946658.12337.qm@web36802.mail.mud.yahoo.com>
Message-ID: <952B0A4E-8A14-4E8E-B36D-14596B20E330@bioperl.org>


Basically you want to inject your own IMG tags into the file with  
these routines:

     $writerhtml->start_report(\&my_start_report);
     $writerhtml->title(\&my_title);
     $writerhtml->hit_link_align(\&my_hit_link_align);
     $writerhtml->hit_link_desc(\&my_hit_link_desc);

fgblast shows a way to do this in part. It relies on Gbrowse to
generate the image, but you can replace the gbrowse_img reference
with a call to your own image-generating software.

http://people.genome.duke.edu/~jes12/software/scripts/fgblast
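
As a rough illustration of such a routine, here is a sketch of a title
callback that returns an IMG tag along with the heading. It assumes the
callback is handed the result object (as the default title routine
appears to be) and that an image named after the query has already been
written next to the HTML file; the naming scheme and the sub name
my_title are made up for this example.

```perl
use strict;
use warnings;

my %HTML_ESCAPE = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');

# Hypothetical callback for $writerhtml->title(\&my_title).
# Takes the result object and returns the HTML placed at the top of the report.
sub my_title {
    my ($result) = @_;
    my $query = $result->query_name;

    # Escape the query name for use inside HTML text and attributes
    (my $safe = $query) =~ s/([&<>"])/$HTML_ESCAPE{$1}/g;

    # File-system friendly image name derived from the query (assumed scheme)
    (my $img = $query) =~ s/[^\w.-]+/_/g;

    return qq{<h1>BLAST report for $safe</h1>\n}
         . qq{<img src="$img.png" alt="Overview of hits for $safe">\n};
}

# Registration, as listed above:
#   $writerhtml->title(\&my_title);
```

The same pattern should apply to the other hooks; only the place where
the returned HTML ends up in the page differs.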

-jason


From bamboowarrior at gmail.com  Wed Apr 23 15:39:21 2008
From: bamboowarrior at gmail.com (Arkady)
Date: Wed, 23 Apr 2008 14:39:21 -0500
Subject: [Bioperl-l] WebBlat, where'd it go?
Message-ID: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>

Hi folks,

I'm trying to use BioPerl to run a BLAT search on the four primate
genomes on UCSC. I understand that the proper tool for this is
Bio::Tools::WebBlat. Unfortunately, it doesn't appear to be in my
bioperl distribution (nor do I even know how to figure out what
version that is, unfortunately, though it's a very recent install -- a
month ago?). I also can't find it on CPAN. Is this deprecated? Has
something else replaced it? Or are we always supposed to run local
BLAT?

Thanks.

John Woods

Institute for Cellular and Molecular Biology
The University of Texas at Austin

From spiros at lokku.com  Wed Apr 23 15:48:12 2008
From: spiros at lokku.com (Spiros Denaxas)
Date: Wed, 23 Apr 2008 20:48:12 +0100
Subject: [Bioperl-l] WebBlat, where'd it go?
In-Reply-To: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
References: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
Message-ID: <bba689ec0804231248s47034503y3cbf0512e4344843@mail.gmail.com>

Hey,

a quick look at the list of deprecated modules reveals that it has
indeed been removed,

http://www.bioperl.org/wiki/Deprecated_modules

Spiros


From cjfields at uiuc.edu  Wed Apr 23 15:56:14 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 23 Apr 2008 14:56:14 -0500
Subject: [Bioperl-l] WebBlat, where'd it go?
In-Reply-To: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
References: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
Message-ID: <AF7BBBC2-6A6E-486A-872C-8BB8B0A7FC0C@uiuc.edu>

It's no longer maintained (deprecated); see the following for an  
explanation:

http://article.gmane.org/gmane.comp.lang.perl.bio.general/13545

Basically, only local BLAT searches are supported through BioPerl.
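
On the side question of which version is installed, a small sketch
like the one below reports the version of any module that can be
loaded. It assumes Bio::Root::Version is part of the installed
distribution and carries the distribution version; module_version is a
made-up helper name.

```perl
use strict;
use warnings;

# Return the version of an installed module, 'unknown' if it loads but
# declares no version, or nothing at all if it cannot be loaded.
sub module_version {
    my ($module) = @_;
    return unless $module =~ /^\w+(?:::\w+)*$/;    # accept plain module names only
    (my $file = "$module.pm") =~ s{::}{/}g;
    return unless eval { require $file; 1 };
    my $version = $module->VERSION;
    return defined $version ? $version : 'unknown';
}

for my $module (qw(Bio::Root::Version Bio::Tools::WebBlat)) {
    my $version = module_version($module);
    print "$module: ", defined $version ? $version : 'not installed', "\n";
}
```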

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bioperlanand at yahoo.com  Wed Apr 23 19:05:27 2008
From: bioperlanand at yahoo.com (Anand Venkatraman)
Date: Wed, 23 Apr 2008 16:05:27 -0700 (PDT)
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
In-Reply-To: <952B0A4E-8A14-4E8E-B36D-14596B20E330@bioperl.org>
Message-ID: <795696.39415.qm@web36804.mail.mud.yahoo.com>

Hi Jason,

Thanks for the reply.

I am a little lost with the suggested solution. Is that how slide 60 in the pdf was obtained? http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf

I guess I am missing something quite obvious; I apologize.

What I have and want is this: I have a directory with, say, 100 different blast reports, and I would like to obtain 100 BioPerl-formatted HTML outputs, each with its respective image, just as it would appear in the blast report.

Thanks,

Anand


From jason at bioperl.org  Thu Apr 24 14:06:41 2008
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 24 Apr 2008 11:06:41 -0700
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
In-Reply-To: <795696.39415.qm@web36804.mail.mud.yahoo.com>
References: <795696.39415.qm@web36804.mail.mud.yahoo.com>
Message-ID: <D47EBDB9-C15C-44A7-9376-89FA946270DD@bioperl.org>

The overview graphic is generated basically from the script in  
scripts/graphics/search_overview.PLS

So you'd have to run that on each report to generate the graphic,
then use the other methods to insert <img src="NAME"> tags into
each rendered HTML report.
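
For a whole directory of reports, the bookkeeping could be sketched
as below. This only works out matching file names per report; the two
rendering steps are left as comments because the options of the
overview script are not covered here. The ".bls" extension, the naming
scheme and plan_outputs are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: for every report in $dir, derive the image and HTML
# names so the <img> tag and the graphic written for that report agree.
sub plan_outputs {
    my ($dir, $ext) = @_;
    my @plan;
    for my $report (sort glob("$dir/*.$ext")) {
        my $base = basename($report, ".$ext");
        push @plan, {
            report => $report,
            image  => "$dir/$base.png",
            html   => "$dir/$base.html",
            tag    => qq{<img src="$base.png">},
        };
    }
    return @plan;
}

my $dir = @ARGV ? $ARGV[0] : '.';
for my $job (plan_outputs($dir, 'bls')) {    # '.bls' is an assumed extension
    # 1. render $job->{image} from $job->{report} with the overview script
    # 2. write $job->{html} with HTMLResultWriter, returning $job->{tag}
    #    from one of the callbacks
    print "$job->{report} -> $job->{html} + $job->{image}\n";
}
```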

-jason



From 1zoujing at 163.com  Wed Apr 16 22:53:16 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 16 Apr 2008 19:53:16 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <Pine.WNT.4.64.0804111600310.2384@A161887.one.ads.bms.com>
References: <16602770.post@talk.nabble.com> <16603225.post@talk.nabble.com>
	<Pine.WNT.4.64.0804111600310.2384@A161887.one.ads.bms.com>
Message-ID: <16737795.post@talk.nabble.com>


    Thank you very much!
I split the file on \t directly.
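
For anyone finding this later, a minimal sketch of that approach is
below. It assumes the first three columns are tax_id, GeneID and
Symbol and that header/comment lines start with '#'; each_gene and the
sample record are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback once per data line of a tab-delimited
# gene_info-style file, without loading the whole file into memory.
sub each_gene {
    my ($fh, $callback) = @_;
    my $count = 0;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^#/ || $line !~ /\S/;    # skip header/comment and blank lines
        my @fields = split /\t/, $line, -1;        # -1 keeps trailing empty columns
        $callback->(@fields);
        $count++;
    }
    return $count;
}

# Made-up sample; for the real file, open it with something like
#   open my $fh, '-|', 'gzip', '-dc', 'gene_info.gz' or die $!;
my $sample = "#Format: tax_id GeneID Symbol\n9823\t1\tGENE1\t-\n";
open my $fh, '<', \$sample or die "cannot open sample: $!";
each_gene($fh, sub {
    my ($tax_id, $gene_id, $symbol) = @_;
    print "$gene_id\t$symbol\n";
});
close $fh;
```

Reading line by line keeps memory use flat even for a file of several
hundred MB.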

   Zou Jing


Stefan Kirov-2 wrote:
> 
> It is not. If you use this file, why would you need a parser for it 
> anyway? Just split on \t or read with OpenOffice or equiv.
> Stefan
> 
> On Thu, 10 Apr 2008, zoujing wrote:
> 
>>
>> Searched the web and found the answer now; quoting it below:
>>   The error was thrown by my Bio::ASN1::EntrezGene module because it
>> expects a text file, while you fed it with a binary file.  To use
>> gzipped ASN binary file from NCBI, download the NCBI gene2xml
>> (ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml),
>> then use this syntax to run my parser on the binary files:
>>
>> my $parser = Bio::ASN1::EntrezGene->new('file' => "gene2xml -i
>> Homo_sapiens.ags.gz -c -x -b | "); # Homo_sapiens.ags.gz is the gzipped
>> binary file directly downloaded from NCBI
>>
>> Same syntax should be used when you're using SeqIO (thus
>> SeqIO::entrezgene).
>> Mingyi
>>
>>   But there is still one thing: I want to parse "gene_info.gz" from NCBI
>> Gene, and it doesn't work. Does that mean "gene_info.gz" (tab-delimited,
>> one line per GeneID, with the column header as the first line of the file)
>> is not the right format for Bio::ASN1::EntrezGene?
>>
>>
>>
>> zoujing wrote:
>>>
>>>    I am new to BioPerl. When I ran
>>> "parse_entrez_gene_example.pl Sus_scrofa.ags", it gave this error
>>> message:
>>>      Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>>>
>>>    But Sus_scrofa.ags was downloaded from NCBI in ASN.1 format, so it
>>> should be the same as the Homo_sapiens file in the example, and there
>>> should be no error since the code is Mingyi's example.
>>>    I wonder why this happens. Should I change something about the
>>> file?
>>>
>>>
>>
>> -- 
>> View this message in context:
>> http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16603225.html
>> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16737795.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From 1zoujing at 163.com  Wed Apr 16 22:55:47 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 16 Apr 2008 19:55:47 -0700 (PDT)
Subject: [Bioperl-l] Bio::ASN1::EntrezGene parse so slowly?
In-Reply-To: <264855a00804112050gf785c2ei66d9c7463597eccd@mail.gmail.com>
References: <16602210.post@talk.nabble.com>
	<264855a00804112050gf785c2ei66d9c7463597eccd@mail.gmail.com>
Message-ID: <16737804.post@talk.nabble.com>


Thank you very much!
  Solved the problem now.

   Jing

Sean Davis-3 wrote:
> 
> gene_info is a tab-delimited text file, if I recall correctly.  Have
> you looked at it?  If it is, you should be able to parse it in a few
> seconds with just a couple lines of code.
> 
> Sean
> 
> 
> On Thu, Apr 10, 2008 at 1:08 AM, zoujing <1zoujing at 163.com> wrote:
>>
>>   I want to parse the file "gene_info" from NCBI. The format of Gene in
>> NCBI is ASN.1, right? So I used Bio::ASN1::EntrezGene, but it did not
>>  work properly or was too slow. The file is about 500M.
>>   The code is as follows:
>>   use Bio::ASN1::EntrezGene;
>>   my $parser = Bio::ASN1::EntrezGene->new('file' => $ARGV[0]);
>>   my $i = 0;
>>   while(my $result = $parser->next_seq)
>>   { last; } # something to do there; "last" is used here as a test
>>
>>   When it reaches the "while" part, it keeps processing on and on and
>>  never exits, even though I used "last" in the "while" part.
>>    So I wonder whether it is too slow, whether the module is not suited
>>  to this job, or whether I did something wrong?
>>
>>   Thank you!
>>  --
>>  View this message in context:
>> http://www.nabble.com/Bio%3A%3AASN1%3A%3AEntrezGene-parse-so-slowly--tp16602210p16602210.html
>>  Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>>
>>  _______________________________________________
>>  Bioperl-l mailing list
>>  Bioperl-l at lists.open-bio.org
>>  http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 

-- 
View this message in context: http://www.nabble.com/Bio%3A%3AASN1%3A%3AEntrezGene-parse-so-slowly--tp16602210p16737804.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
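
For readers of the archive, Sean's suggestion quoted above (treating
gene_info as plain tab-delimited text) could be sketched roughly as
follows. This is only an illustration, not code from the thread: the
column names and the sample line are assumptions, so check the header
line of your own copy of the file before relying on them.

```perl
use strict;
use warnings;

# Parse one tab-delimited line into a hash keyed by column name.
# The column names are supplied by the caller (assumed, not read from the file).
sub parse_tab_line {
    my ($line, @columns) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;    # -1 keeps trailing empty fields
    my %record;
    @record{@columns} = @fields[0 .. $#columns];
    return \%record;
}

# Hypothetical names for the first three columns.
my @columns = qw(tax_id GeneID Symbol);

# Hypothetical sample line; a real run would loop over the file instead:
#   while (my $line = <$fh>) { next if $line =~ /^#/; ... }
my $sample = "9823\t100037293\tEXAMPLE1\n";
my $rec    = parse_tab_line($sample, @columns);
print "$rec->{GeneID}\t$rec->{Symbol}\n";
```

Because each line is handled independently with split, nothing has to
be held in memory beyond the current record.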


From sbassi at clubdelarazon.org  Sat Apr 26 13:49:20 2008
From: sbassi at clubdelarazon.org (Sebastian Bassi)
Date: Sat, 26 Apr 2008 14:49:20 -0300
Subject: [Bioperl-l] bioperl installation problem
Message-ID: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>

I tried to install bioperl because I need to install cviewer.
Here (http://www.pastecode.com.ar/f37c1cd60) are both stdout and stderr outputs.

Here is one of the errors I get:

set_attribute: not a compat02 graph at
/usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN0> line 10.
sleeping for 3 seconds
set_attribute: not a compat02 graph at
/usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN1> line 14.

But I have GD::Graph, so I don't know what is going on:

sbassi at ubuntuMAP:~$ sudo perl -MCPAN -e 'install GD::Graph'
CPAN: Storable loaded ok
Going to read /home/sbassi/.cpan/Metadata
  Database was generated on Fri, 25 Apr 2008 09:29:45 GMT
GD::Graph is up to date.

Any help regarding this: http://www.pastecode.com.ar/f37c1cd60
would be appreciated.

Best,
SB.

-- 
Sebastián Bassi. Diploma in Science and Technology.
Course: Molecular biology for programmers: http://tinyurl.com/2vv8w6
Show your code: http://www.pastecode.com.ar
GPG Fingerprint: 9470 0980 620D ABFC BE63 A4A4 A3DE C97D 8422 D43D


From jason at bioperl.org  Sat Apr 26 15:23:37 2008
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 26 Apr 2008 12:23:37 -0700
Subject: [Bioperl-l] bioperl installation problem
In-Reply-To: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
References: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
Message-ID: <B07E3ABC-FA71-4AEA-8802-29F1C3023BAE@bioperl.org>

The error refers to the 'Graph' module, not 'GD::Graph'.

-jason
On Apr 26, 2008, at 10:49 AM, Sebastian Bassi wrote:

> I tried to install bioperl because I need to install cviewer.
> Here (http://www.pastecode.com.ar/f37c1cd60) are both stdout and
> stderr outputs.
>
> Here is one of the errors I get:
>
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN0> line 10.
> sleeping for 3 seconds
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN1> line 14.
>
> But I have GD::Graph, so I don't know what is going on:
>
> sbassi at ubuntuMAP:~$ sudo perl -MCPAN -e 'install GD::Graph'
> CPAN: Storable loaded ok
> Going to read /home/sbassi/.cpan/Metadata
>   Database was generated on Fri, 25 Apr 2008 09:29:45 GMT
> GD::Graph is up to date.
>
> Any help regarding this: http://www.pastecode.com.ar/f37c1cd60
> would be appreciated.
>
> Best,
> SB.
>
> -- 
> Sebastián Bassi. Diploma in Science and Technology.
> Course: Molecular biology for programmers: http://tinyurl.com/2vv8w6
> Show your code: http://www.pastecode.com.ar
> GPG Fingerprint: 9470 0980 620D ABFC BE63 A4A4 A3DE C97D 8422 D43D
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From sbassi at clubdelarazon.org  Sat Apr 26 17:08:13 2008
From: sbassi at clubdelarazon.org (Sebastian Bassi)
Date: Sat, 26 Apr 2008 18:08:13 -0300
Subject: [Bioperl-l] bioperl installation problem
In-Reply-To: <B07E3ABC-FA71-4AEA-8802-29F1C3023BAE@bioperl.org>
References: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
	<B07E3ABC-FA71-4AEA-8802-29F1C3023BAE@bioperl.org>
Message-ID: <9e2f512b0804261408l45ff9f91j94f44065d21cd65f@mail.gmail.com>

On Sat, Apr 26, 2008 at 4:23 PM, Jason Stajich <jason at bioperl.org> wrote:
> the error refers to the 'Graph' module not 'GD::Graph';

You are right, but I also have it installed:

sbassi at ubuntuMAP:~$ sudo perl -MCPAN -e 'install Graph'
Password:
CPAN: Storable loaded ok
Going to read /home/sbassi/.cpan/Metadata
  Database was generated on Fri, 25 Apr 2008 09:29:45 GMT
Graph is up to date.


-- 
Sebastián Bassi. Diploma in Science and Technology.
Course: Molecular biology for programmers: http://tinyurl.com/2vv8w6
Show your code: http://www.pastecode.com.ar
GPG Fingerprint: 9470 0980 620D ABFC BE63 A4A4 A3DE C97D 8422 D43D


From bix at sendu.me.uk  Sat Apr 26 19:30:56 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Sun, 27 Apr 2008 00:30:56 +0100
Subject: [Bioperl-l] bioperl installation problem
In-Reply-To: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
References: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
Message-ID: <4813BB30.6060703@sendu.me.uk>

Sebastian Bassi wrote:
> I tried to install bioperl because I need to install cviewer.
> Here (http://www.pastecode.com.ar/f37c1cd60) are both stdout and stderr outputs.
> 
> Here is one of the errors I get:
> 
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN0> line 10.
> sleeping for 3 seconds
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN1> line 14.

You're trying to install a very old version of Bioperl, which apparently 
relies on behaviour of the Graph module that is no longer supported:
http://search.cpan.org/~jhi/Graph-0.84/lib/Graph.pod#Backward_compatibility_with_Graph_0.2

Your options are to force-install your desired version of Bioperl (if 
you don't need the modules that are causing the errors), downgrade 
Graph to the old 0.2x series, or install the latest version of Bioperl 
(1.5.2 or from svn).
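
Before choosing between these options it may help to see which version
of each module Perl actually loads, since Graph and GD::Graph are
unrelated distributions. The following is a minimal sketch; the helper
name module_version is made up for illustration.

```perl
use strict;
use warnings;

# Return the version of an installed module, or undef if it cannot be loaded.
# (module_version is a hypothetical helper, not part of any library.)
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return unless eval { require $file; 1 };
    return $module->VERSION;
}

for my $module (qw(Graph GD::Graph)) {
    my $version = module_version($module);
    printf "%-10s %s\n", $module,
        defined $version ? $version : 'not installed';
}
```

The version printed for Graph can then be compared against the
backward-compatibility notes linked above.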

From dr.hogart at gmail.com  Sun Apr 27 10:05:20 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Sun, 27 Apr 2008 18:05:20 +0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
Message-ID: <op.t99vyoejavnppr@hogart.hackers>

Hi all,

Is it possible to add a GD::Graph object (chart) to a Bio::Graphics panel, 
to obtain a file with an image of both the chart and the bioseq object?


From Russell.Smithies at agresearch.co.nz  Sun Apr 27 17:27:23 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 28 Apr 2008 09:27:23 +1200
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <op.t99vyoejavnppr@hogart.hackers>
References: <op.t99vyoejavnppr@hogart.hackers>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>

You can get the GD object back from the Bio::Graphics::Panel  then draw
on it using GD methods

E.g.:

use Bio::Graphics;
use Bio::SeqFeature::Generic;
use GD;   # exports gdSmallFont

# create a BioPerl panel
my $panel = Bio::Graphics::Panel->new(
    -length  => 600,
    -width   => 800,
    -bgcolor => 'white',
);

# add your features
my $feature = Bio::SeqFeature::Generic->new(-start => 1, -end => 200);
$panel->add_track($feature,
    -glyph   => 'segments',
    -label   => 0,
    -height  => 30,
    -bgcolor => 'red',
    -fgcolor => 'red',
);

# grab the GD thingy
my $gd = $panel->gd;

# create a color - not sure if there's a better way?
my $black = $gd->colorAllocate(0, 0, 0);

# draw on your GD thingy
$gd->line(10, 10, $panel->width - 10, 10, $black);
$gd->string(gdSmallFont, 20, 10, 'test', $black);

# print it as normal (PNG is binary, so switch STDOUT to binary mode)
binmode STDOUT;
print $panel->png;


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From dr.hogart at gmail.com  Sun Apr 27 20:25:18 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Mon, 28 Apr 2008 04:25:18 +0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
Message-ID: <op.uaaosgoeavnppr@hogart.hackers>

Thanks for the answer!
Your script works fine; nevertheless, as far as I understand, the 'gd'  
method returns a GD::Image object. But I need to merge the bioseq object  
with a GD::Graph object (GD::Graph::area). Is that possible? Or maybe I  
misunderstood something in your example?


On Mon, 28 Apr 2008 01:27:23 +0400, Smithies, Russell  
<Russell.Smithies at agresearch.co.nz> wrote:

> You can get the GD object back from the Bio::Graphics::Panel  then draw
> on it using GD methods
>
> E.g.:
>
> use Bio::Graphics;
> use Bio::SeqFeature::Generic;
> use GD;   # exports gdSmallFont
>
> # create a BioPerl panel
> my $panel = Bio::Graphics::Panel->new(
>     -length  => 600,
>     -width   => 800,
>     -bgcolor => 'white',
> );
>
> # add your features
> my $feature = Bio::SeqFeature::Generic->new(-start => 1, -end => 200);
> $panel->add_track($feature,
>     -glyph   => 'segments',
>     -label   => 0,
>     -height  => 30,
>     -bgcolor => 'red',
>     -fgcolor => 'red',
> );
>
> # grab the GD thingy
> my $gd = $panel->gd;
>
> # create a color - not sure if there's a better way?
> my $black = $gd->colorAllocate(0, 0, 0);
>
> # draw on your GD thingy
> $gd->line(10, 10, $panel->width - 10, 10, $black);
> $gd->string(gdSmallFont, 20, 10, 'test', $black);
>
> # print it as normal (PNG is binary, so switch STDOUT to binary mode)
> binmode STDOUT;
> print $panel->png;


From Bank.Beszteri at awi.de  Mon Apr 28 08:18:20 2008
From: Bank.Beszteri at awi.de (=?UTF-8?B?QsOhbmsgQmVzenRlcmk=?=)
Date: Mon, 28 Apr 2008 14:18:20 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47FB204F.90405@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de>
Message-ID: <4815C08C.1060305@awi.de>

Dear BioSQL / bioperl-db-ists,

I would like  to share my experiences with trying to load uniprot_trembl 
into a BioSQL db, and also to ask a couple of questions; perhaps some of 
you know the problems I encountered. I used bioperl-live and 
bioperl-db-live as of 2008-04-03 and uniprot_trembl.dat as of 
2008-04-04. The command was like

load_seqdatabase.pl --safe --logchunk 1000 --host dbserv --dbname abc 
--dbuser efg --dbpass xyz --driver mysql --namespace uniprot_trembl 
--format embl uniprot_trembl.dat

although I split the dat file into 10 chunks and started them parallel 
to make it faster. This did not go quite as smoothly as Swissprot did. 
In the end, it seems to have loaded 5022284 entries of the 5443284 which 
appear to be there in the input file (when counting with grep -c "ID   ").

Besides the harmless taxonomy warnings which also appear with Swissprot 
(and were discussed here a couple of weeks ago, and also earlier), 
there were a couple of more serious errors. Perhaps some of 
you know them already:

First of all, the below error seems to lead to a crash, in spite of --safe:

 >>>
------------- EXCEPTION -------------
MSG: A1XDT7 seems to have an invalid species classification.
STACK Bio::SeqIO::embl::_read_EMBL_Species 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:1087
STACK Bio::SeqIO::embl::next_seq 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:320
STACK toplevel 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:634
-------------------------------------

Command exited with non-zero status 255
<<<

What this is about is NCBI Tax_ID:435 (Acetobacter aceti; it has some 30 
synonyms in my DB, too), which, to me, looks like a completely normal 
taxon: I could follow its taxonomy up to the root in the NCBI taxonomy 
loaded in the BioSQL DB I used. I don't know whether someone else has 
seen or can reproduce the problem, or whether I should suspect a problem 
with my taxonomy DB. Besides, is it the expected behaviour of 
load_seqdatabase.pl to die upon this error?

###################

The other problems did not lead to a crash, only to a failure to load 
the sequence, which is what I'd expect with --safe. The first type 
of error looks like

 >>>
Could not store Q49I36:
------------- EXCEPTION -------------
MSG: Unique key query in Bio::DB::BioSQL::SpeciesAdaptor returned 2 rows 
instead of 1. Query was [name_class="scientific 
name",binomial="Onchocerca volvulus"]
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:958
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:854
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182
STACK Bio::DB::Persistent::PersistentObject::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:244
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
STACK Bio::DB::Persistent::PersistentObject::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:271
STACK (eval) 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:630
STACK toplevel 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:612
-------------------------------------
<<<

In this particular case, "Onchocerca volvulus" does indeed have two 
taxon_ids in my DB (6282 and 563188, of which only the first one is 
returned by a web search at NCBI taxonomy); but the same thing happened 
with a number of other taxa (followed by how many times the above error 
was caused by the particular taxa):

Wolbachia pipientis     64
Hemerocallis sp.        1
Hypsiglena torquata     3
Salmonella enterica     1211
Burkholderia sp.        31
Streptococcus sp.       4
Rhizobium sp.   600
Nostoc sp.      19
Drosophila sp.  18
Onchocerca volvulus     62
Atlapetes schistaceus   4
Symbiodinium sp.        3
Escherichia coli        7421
Hieraaetus fasciatus    4
Borrelia burgdorferi group      1
Pseudomonas sp. 29
Rotavirus A     1076
Gorilla gorilla 746
Rana plancyi    14
unclassified sequences  1

(This should be 11312 cases altogether, but the list might be incomplete 
because I accidentally removed one of my logs, which contained STDOUT 
and STDERR for ~ 10 % of the entries.)

Again, is this a known problem for some of you, or could there be a 
problem with my copy of the NCBI taxonomy? I don't remember having 
updated it after the initial upload, so I'm quite surprised by such 
duplicate entries...

###################

Type 2 error w/o crash:

 >>>
Could not store A5HU09:
------------- EXCEPTION -------------
MSG: create: object (Bio::Species) failed to insert or to be found by 
unique key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:206
STACK Bio::DB::Persistent::PersistentObject::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:244
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
STACK Bio::DB::Persistent::PersistentObject::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:271
STACK (eval) 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:630
STACK toplevel 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:612
<<<

This particular record has the NCBI_TaxID 44271, which looks completely 
normal in the NCBI taxonomy loaded in my BioSQL DB, but the same problem 
appeared in 53 further cases (I have not yet been able to look into them 
in detail to see whether they were all the same species). On the other 
hand, 7 records which were successfully loaded have this taxonomy ID in 
the DB (44271).

###################

Nr 3 no crash:

 >>>
Could not store Q6T859: Unmatched ( in regex; marked by <-- HERE in 
m/Camelina microcarpa (Littlepod false flax) ( <-- HERE microcarpa 
subsp.\s+/ at 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/Species.pm line 
466, <GEN0> line 357048.
<<<

This happens in the sub binomial in Species.pm using the option "FULL", 
which requests to also return subspecies. I have not looked much deeper 
into this yet, but is it possible that there is a parsing problem with 
multi-line species strings? In the above case the OS field in 
uniprot_trembl.dat looks like

OS   Camelina microcarpa (Littlepod false flax) (Camelina microcarpa subsp.
OS   sylvestris).
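
For illustration: "Unmatched ( in regex" is the error Perl gives when a
string containing an unbalanced parenthesis is interpolated into a
pattern without escaping. The sketch below reproduces that situation
and shows the usual \Q...\E remedy. It is not the actual code from
Species.pm; the helper strip_prefix and the way the OS string is cut
here are assumptions made only for the example.

```perl
use strict;
use warnings;

# Remove a leading name from a longer string. The name is data, not a
# pattern, so \Q...\E escapes any regex metacharacters it contains.
# (strip_prefix is a hypothetical helper for this example.)
sub strip_prefix {
    my ($text, $prefix) = @_;
    $text =~ s/^\Q$prefix\E\s*//;
    return $text;
}

# An assumed split of the OS field; note the unbalanced "(" at the end.
my $species = 'Camelina microcarpa (Littlepod false flax) (Camelina';
my $full    = "$species microcarpa subsp. sylvestris)";

# Interpolating $species unescaped makes the pattern fail to compile;
# the eval captures the resulting error so it can be displayed.
my $ok = eval { my $copy = $full; $copy =~ s/^$species\s*//; 1 };
print "unescaped: ", ($ok ? "compiled\n" : "failed: $@");
print "escaped:   ", strip_prefix($full, $species), "\n";
```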

###################

I'm still looking for where the remaining records disappeared: of the 
421000 records not showing up in the DB, I could account for these:

crasher (Tax_ID=435):   45 entries
problem 1 ("MSG: Unique key query in Bio::DB::BioSQL::SpeciesAdaptor 
returned 2 rows instead of 1."): 11312 entries
problem 2 ("MSG: create: object (Bio::Species) failed to insert or to be 
found by unique key"): 54 entries
problem 3 ("Unmatched ( in regex"): 28241 entries

381348 still remain... These could in principle come from the 
first 10 %, for which I don't have the output, but they don't seem to: 
after restarting that chunk, I get only ~ 30 "Could not store" errors.

So the last question: are there any error messages I can expect which 
don't contain "Could not store" and which I thus missed here?
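
One way to look for such messages would be to tally every "MSG:" line
in the logs by its leading text, so that unexpected kinds of errors
stand out. This is a rough sketch only: the helper tally_messages and
the 40-character cut-off are arbitrary choices for the example, and the
sample lines are taken from the exceptions shown above.

```perl
use strict;
use warnings;

# Count log lines by the first 40 characters of their "MSG:" text.
# Works on any list of lines, e.g. the contents of a log file.
# (tally_messages is a hypothetical helper for this example.)
sub tally_messages {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        next unless $line =~ /^MSG:\s*(.{1,40})/;
        $count{$1}++;
    }
    return \%count;
}

my $counts = tally_messages(
    "MSG: A1XDT7 seems to have an invalid species classification.\n",
    "MSG: create: object (Bio::Species) failed to insert or to be found by\n",
    "MSG: create: object (Bio::Species) failed to insert or to be found by\n",
    "Command exited with non-zero status 255\n",
);
print "$counts->{$_}\t$_\n" for sort keys %$counts;
```

Messages that carry an accession, such as the first sample line, end
up in separate buckets, so a real tally would probably want to strip
the accession before counting.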


Bank Beszteri


Bioinformatics
Alfred Wegener Institute for Polar and Marine Research
Am Handelshafen 12
27570 Bremerhaven

From cjfields at uiuc.edu  Mon Apr 28 09:20:39 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 28 Apr 2008 08:20:39 -0500
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <4815C08C.1060305@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
Message-ID: <5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>

On Apr 28, 2008, at 7:18 AM, Bánk Beszteri wrote:

> Dear BioSQL / bioperl-db-ists,
>
> I would like  to share my experiences with trying to load  
> uniprot_trembl into a BioSQL db, and also to ask a couple of  
> questions; perhaps some of you know the problems I encountered. I  
> used bioperl-live and bioperl-db-live as of 2008-04-03 and  
> uniprot_trembl.dat as of 2008-04-04. The command was like
>
> load_seqdatabase.pl --safe --logchunk 1000 --host dbserv --dbname  
> abc --dbuser efg --dbpass xyz --driver mysql --namespace  
> uniprot_trembl --format embl uniprot_trembl.dat
>
> ....
>
> First of all, the below error seems to lead to a crash, in spite of  
> --safe:
>
> >>>
> ------------- EXCEPTION -------------
> MSG: A1XDT7 seems to have an invalid species classification.
> STACK Bio::SeqIO::embl::_read_EMBL_Species /home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:1087
> STACK Bio::SeqIO::embl::next_seq /home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:320
> STACK toplevel /home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:634
> -------------------------------------
>
> Command exited with non-zero status 255
> <<<
>
> What this is about is NCBI Tax_ID:435 (Acetobacter aceti; it has  
> some 30 synonyms in my DB, too), which, to me, looks like a  
> completely normal taxon: I could follow its taxonomy up to the root  
> in my NCBI taxonomy in the BioSQL DB I used. I don't know if someone  
> else has seen / can reproduce the problem, or should I think about  
> some problem with my taxonomy db? Besides, is it the expected  
> behaviour from load_seqdatabase.pl to die upon this error?

...

You should use 'swiss' format instead of 'embl' when loading  
Uniprot/SwissProt sequences.  Though on the surface they're similar,  
the feature table (among other things) is completely different.  I'm  
not sure if that's causing all of the issues here but it certainly  
could contribute to them.

In the meantime, it's much easier for us to track these problems if  
you file a bug (BioPerl, file for bioperl-db):

http://bugzilla.open-bio.org/

chris

From cjfields at uiuc.edu  Sun Apr 27 17:54:03 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 27 Apr 2008 16:54:03 -0500
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
Message-ID: <FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>

I think this is how some of the synteny mapping is done using  
SynBrowse (the trapezoids connecting syntenous genes on different  
tracks).

http://www.gmod.org/wiki/index.php/SynView

chris

On Apr 27, 2008, at 4:27 PM, Smithies, Russell wrote:

> You can get the GD object back from the Bio::Graphics::Panel  then  
> draw
> on it using GD methods
>
> E.g.:
>
> use Bio::Graphics;
> use Bio::SeqFeature::Generic;
> use GD;   # exports gdSmallFont
>
> # create a BioPerl panel
> my $panel = Bio::Graphics::Panel->new(
>     -length  => 600,
>     -width   => 800,
>     -bgcolor => 'white',
> );
>
> # add your features
> my $feature = Bio::SeqFeature::Generic->new(-start => 1, -end => 200);
> $panel->add_track($feature,
>     -glyph   => 'segments',
>     -label   => 0,
>     -height  => 30,
>     -bgcolor => 'red',
>     -fgcolor => 'red',
> );
>
> # grab the GD thingy
> my $gd = $panel->gd;
>
> # create a color - not sure if there's a better way?
> my $black = $gd->colorAllocate(0, 0, 0);
>
> # draw on your GD thingy
> $gd->line(10, 10, $panel->width - 10, 10, $black);
> $gd->string(gdSmallFont, 20, 10, 'test', $black);
>
> # print it as normal (PNG is binary, so switch STDOUT to binary mode)
> binmode STDOUT;
> print $panel->png;

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Bank.Beszteri at awi.de  Mon Apr 28 09:51:53 2008
From: Bank.Beszteri at awi.de (=?ISO-8859-1?Q?B=E1nk_Beszteri?=)
Date: Mon, 28 Apr 2008 15:51:53 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
	<5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
Message-ID: <4815D679.3070307@awi.de>

Chris Fields schrieb:
>
> ...
>
> You should use 'swiss' format instead of 'embl' when loading 
> Uniprot/SwissProt sequences.  Though on the surface they're similar 
> the feature table (among other things) is completely different.  I'm 
> not sure if that's causing all of the issues here but it certainly 
> could contribute to them.
>
> In the meantime, it's much easier for us to track these problems if 
> you file a bug (BioPerl, file for bioperl-db):
>
> http://bugzilla.open-bio.org/
>
Hi Chris,

I will do so; in the meantime: I'm not loading Swissprot, but TrEMBL. 
Is swiss also the appropriate format here? By reading 
http://expasy.org/sprot/userman.html#diffEMBL, I concluded that embl 
should be the one I'd need for TrEMBL.

Bank

From cjfields at uiuc.edu  Mon Apr 28 12:24:39 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 28 Apr 2008 11:24:39 -0500
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <4815D679.3070307@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
	<5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
	<4815D679.3070307@awi.de>
Message-ID: <B7918B56-536D-497F-A59D-D48A61085339@uiuc.edu>


On Apr 28, 2008, at 8:51 AM, Bánk Beszteri wrote:

> Chris Fields schrieb:
>>
>> ...
>>
>> You should use 'swiss' format instead of 'embl' when loading  
>> Uniprot/SwissProt sequences.  Though on the surface they're similar  
>> the feature table (among other things) is completely different.   
>> I'm not sure if that's causing all of the issues here but it  
>> certainly could contribute to them.
>>
>> In the meantime, it's much easier for us to track these problems if  
>> you file a bug (BioPerl, file for bioperl-db):
>>
>> http://bugzilla.open-bio.org/
>>
> Hi Chris,
>
> I will do so; in the meantime: I'm not loading Swissprot, but  
> TrEMBL. Is swiss also the appropriate format here? By reading  
> http://expasy.org/sprot/userman.html#diffEMBL, I concluded that embl  
> should be the one I'd need for TrEMBL.
>
> Bank

The section you link to describes several important differences  
between EMBL and SwissProt/UniProt format (i.e. how each indicated  
line type differs between SwissProt and EMBL formats, including ID,  
AC, OS/OC, FT, etc).  I'm unsure how you concluded from that that  
'embl' would work: the formats are close, but there are enough  
significant differences that using 'embl' for SwissProt (or vice  
versa) will not work as intended, if at all.

chris

From hlapp at gmx.net  Mon Apr 28 15:46:07 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 28 Apr 2008 15:46:07 -0400
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <4815D679.3070307@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
	<5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
	<4815D679.3070307@awi.de>
Message-ID: <3BD6A261-D023-4A5F-9CBC-C3216B0145F0@gmx.net>


On Apr 28, 2008, at 9:51 AM, Bánk Beszteri wrote:
>  I'm not loading Swissprot, but TrEMBL. Is swiss also the  
> appropriate format here?


Yes, though I guess it can be confusing.

Maybe we should create a symlink uniprot.pm to swiss.pm, or in fact  
fork them if UniProt starts accumulating enough differences from the  
traditional Swissprot format.

BTW as you had noticed, the --safe switch only protects the script  
from crashing due to a db loading error. A parsing error will still  
cause a crash.

I guess you can argue that that's not nice, and having a chance to  
skip over the record that offends the (BioPerl) parser would be  
useful. The problem is that if the parser errors out, it's not  
guaranteed where we are in the file and whether the parser module is  
in a state that it can recover itself from. For the database it's a  
bit easier as one just needs to rollback() the transaction (each  
sequence is its own transaction).

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================
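
To illustrate the distinction drawn above, here is a minimal sketch of
the "one eval (transaction) per record" pattern. It is not the actual
code of load_seqdatabase.pl; store_record and load_all are hypothetical
stand-ins, and the record names are invented. Only the store step is
protected here, which is why an error raised earlier, while the parser
is still reading a record, would not be caught by this kind of guard.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a database store; dies for one bad record.
sub store_record {
    my ($record) = @_;
    die "could not store $record\n" if $record eq 'bad';
    return 1;
}

# Store each record inside its own eval block, mimicking one transaction
# per sequence: a failure is reported and skipped, and the loop carries on.
sub load_all {
    my (@records) = @_;
    my (@stored, @failed);
    for my $record (@records) {
        if (eval { store_record($record); 1 }) {
            push @stored, $record;    # a real loader would commit here
        }
        else {
            warn "Could not store $record: $@";
            push @failed, $record;    # a real loader would roll back here
        }
    }
    return (\@stored, \@failed);
}

my ($stored, $failed) = load_all(qw(seq1 bad seq2));
printf "stored %d, failed %d\n", scalar @$stored, scalar @$failed;
```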


From Russell.Smithies at agresearch.co.nz  Mon Apr 28 17:15:16 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 29 Apr 2008 09:15:16 +1200
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
	<FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>

I thought it was a bit of a hack but I guess if someone else is doing it
too, it can't be all bad  :-)

It looks like you can combine your drawing methods like this:
(I'm sure Lincoln will tell us this is bad but it seems to work ok)
------------------------------------------------------------------------

#!perl -w
use strict;
use GD::Graph::lines;
use GD::Graph::colour;
use GD::Graph::Data;

use Bio::Graphics;
use Bio::SeqFeature::Generic;

# create a graphics panel with a single track
my $panel = Bio::Graphics::Panel->new(
                                      -length => 500,
                                      -width  => 500
                                     );
my $track = $panel->add_track(
                              -glyph => 'generic',
                              -label => 1
                             );

# create and add a few features
for (my $i = 100; $i < 500; $i += 100) {
  my $feature = Bio::SeqFeature::Generic->new(
                                              -display_name => "feature: $i",
                                              -score        => $i,
                                              -start        => $i,
                                              -end          => $i + 100
                                             );
  $track->add_feature($feature);
}


# create and draw the graph
my @data = (
    ["1st","2nd","3rd","4th","5th","6th","7th", "8th", "9th"],
    [    1,    2,    5,    6,    3,  1.5,    1,     3,     4],
    [ sort { $a <=> $b } (1, 2, 5, 6, 3, 1.5, 1, 3, 4) ]
);
my $graph = GD::Graph::lines->new(500, 300);

$graph->set(
      x_label           => 'X Label',
      y_label           => 'Y label',
      title             => 'Some simple graph',
      y_max_value       => 8,
      y_tick_number     => 8,
      y_label_skip      => 2
) or die $graph->error;

$graph->set( dclrs => [ qw( green blue black red pink) ] );

my $gd = $graph->plot(\@data) or die $graph->error;

# combine the two images: the panel is drawn onto the graph's GD object
my $combined = $panel->gd($gd);

# write the combined image to a PNG file
open(my $img, '>', 'file.png') or die $!;
binmode $img;
print $img $combined->png;
close $img;

------------------------------------------------------------------------

> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu]
> Sent: Monday, 28 April 2008 9:54 a.m.
> To: Smithies, Russell
> Cc: sergei ryazansky; bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
> 
> I think this is how some of the synteny mapping is done using
> SynBrowse (the trapezoids connecting syntenous genes on different
> tracks).
> 
> http://www.gmod.org/wiki/index.php/SynView
> 
> chris
> 
> On Apr 27, 2008, at 4:27 PM, Smithies, Russell wrote:
> 
> > You can get the GD object back from the Bio::Graphics::Panel, then
> > draw on it using GD methods
> >
> > Eg:
> >
> > use GD;   # exports gdSmallFont
> >
> > #create a BioPerl panel
> > my $panel = Bio::Graphics::Panel->new(
> >                                       -length   => 600,
> >                                       -width    => 800,
> >                                       -bgcolor  => 'white'
> >                                      );
> > # add your features
> > my $feature = Bio::SeqFeature::Generic->new( -start => 1, -end => 200 );
> > $panel->add_track($feature,
> >                   -glyph    => 'segments',
> >                   -label    => 0,
> >                   -height   => 30,
> >                   -bgcolor  => 'red',
> >                   -fgcolor  => 'red'
> >                  );
> >
> > # grab the GD thingy
> > my $gd = $panel->gd;
> >
> > #create a color - not sure if there's a better way?
> > my $black = $gd->colorAllocate(0,0,0);
> >
> > #draw on your GD thingy
> > $gd->line(10, 10, $panel->width - 10, 10, $black);
> > $gd->string(gdSmallFont, 20, 10, 'test', $black);
> >
> > # print it as normal
> > print $panel->png;
> >
> >
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org
> >> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of sergei ryazansky
> >> Sent: Monday, 28 April 2008 2:05 a.m.
> >> To: bioperl-l at bioperl.org
> >> Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
> >>
> >> Hi all,
> >>
> >> is it possible to add a GD::graphic object (chart) to Bio::Graphics
> >> panel to obtain a file with image of both the chart and bioseq object?
> >>
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 

=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From lincoln.stein at gmail.com  Mon Apr 28 17:33:19 2008
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 28 Apr 2008 17:33:19 -0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
	<FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
	<D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>
Message-ID: <6dce9a0b0804281433i697cda2fo2c47ce59010d0858@mail.gmail.com>

Hi,

No, I'm perfectly happy with combining images like this. It is part of what
I intended.

Another idea would be to use the Image glyph to embed graphs at particular
genomic locations in the panel. Right now the glyph is designed in the
expectation that the image passed to it is sitting on the file system (or a
web URL), but it would be easy to modify it so that a callback can generate
the GD on the fly, by using, for example GD::Graph.

Lincoln



-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu

From dr.hogart at gmail.com  Tue Apr 29 03:56:24 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Tue, 29 Apr 2008 11:56:24 +0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
	<FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
	<D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>
Message-ID: <op.uac4caojavnppr@hogart.img.ras.ru>

Thank you very much! It is exactly what I was looking for.





From d.gatherer at mrcvu.gla.ac.uk  Tue Apr 29 08:21:05 2008
From: d.gatherer at mrcvu.gla.ac.uk (Derek Gatherer)
Date: Tue, 29 Apr 2008 13:21:05 +0100
Subject: [Bioperl-l] translate() oddities
Message-ID: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>

Hi

I thought I'd better run this by the community before I embarrass 
myself on Bugzilla.  It seems like a clear bug to me.  I'm running 
Bioperl 1.5.0 on RedHat.

For a test input:

 >test
ATGATGATGATGATGTGA

the following code is fine.

while((my $seqobj = $seq_in->next_seq()))
{
     print "\n".$seqobj->display_id;
     my $len  = $seqobj->length();
     print " length: $len";
     my $frame1_obj = $seqobj->translate();
     my $f1_prot = $frame1_obj->seq();
     print "\n$f1_prot";
}

Output:

test length: 18
MMMMM*

But if I want to change the frame as specified in the BioPerl 
tutorial, by using:

my $frame1_obj = $seqobj->translate(frame => 1); # should now give frame 2

I get:

test length: 18
MMMMM-frame

The frame is unchanged and the text "-frame" is tacked on the end of 
the output.  The same occurs with translate(frame => 2).

Any ideas?  Can something as fundamental as translate() really be 
bugged?  or am I guilty of some particularly heinous syntax error?

Cheers
Derek


From tristan.lefebure at gmail.com  Tue Apr 29 09:58:21 2008
From: tristan.lefebure at gmail.com (Tristan Lefebure)
Date: Tue, 29 Apr 2008 09:58:21 -0400
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
Message-ID: <200804290958.21548.tristan.lefebure@gmail.com>

Aren't you forgetting the dash?

my $frame1_obj = $seqobj->translate(-frame => 1)


On Tuesday 29 April 2008 08:21:05 Derek Gatherer wrote:
> my $frame1_obj = $seqobj->translate(frame => 1)


-Tristan

From d.gatherer at mrcvu.gla.ac.uk  Tue Apr 29 10:05:03 2008
From: d.gatherer at mrcvu.gla.ac.uk (Derek Gatherer)
Date: Tue, 29 Apr 2008 15:05:03 +0100
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <481726BF.1060609@bms.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
	<481726BF.1060609@bms.com>
Message-ID: <E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>

Thanks Stefan

Actually, there was a typo in my message, I did use -frame => 
1.  However, the problem disappears on upgrading from 1.5.0 to 1.5.2.

So not a bug anymore.

Cheers
Derek

At 14:46 29/04/2008, Stefan Kirov wrote:
>my $frame1_obj = $seqobj->translate(-frame => 1);
>not
>my $frame1_obj = $seqobj->translate(frame => 1);
>Stefan
>


From l.douchy at gmail.com  Tue Apr 29 10:16:40 2008
From: l.douchy at gmail.com (Laurent DOUCHY)
Date: Tue, 29 Apr 2008 16:16:40 +0200
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <200804290958.21548.tristan.lefebure@gmail.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
	<200804290958.21548.tristan.lefebure@gmail.com>
Message-ID: <2fb209dd0804290716x36e403dek55978dc4f54e34ff@mail.gmail.com>

Hello,

I resolved this issue in Bio::SeqIO with the following line:

my $sequence = $seq->translate('*', 'X', '0', '1', '0', '0', '0', '0')->seq;
The third parameter sets the frame.
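
For readers unsure what the frame value changes, here is a small self-contained sketch in plain Perl. It does not call BioPerl; toy_translate_frame and its five-entry codon table are made up for illustration and cover only the codons that occur in the test sequence from this thread. It also simply drops a trailing partial codon, which may differ from what translate() does.

```perl
use strict;
use warnings;

# Minimal codon table: only the codons found in ATGATGATGATGATGTGA
# when read in frames 0, 1 and 2 (standard genetic code).
my %codon = (ATG => 'M', TGA => '*', GAT => 'D', TGT => 'C', GTG => 'V');

# Hypothetical helper: translate $dna starting at offset $frame (0, 1 or 2).
# Codons missing from the table become 'X'; a trailing partial codon is ignored.
sub toy_translate_frame {
    my ($dna, $frame) = @_;
    my $protein = '';
    for (my $i = $frame; $i + 3 <= length($dna); $i += 3) {
        my $c = substr($dna, $i, 3);
        $protein .= exists $codon{$c} ? $codon{$c} : 'X';
    }
    return $protein;
}

for my $frame (0, 1, 2) {
    my $protein = toy_translate_frame('ATGATGATGATGATGTGA', $frame);
    print "frame $frame: $protein\n";
}
# frame 0: MMMMM*
# frame 1: ****C
# frame 2: DDDDV
```

So a frame value of 1 skips the first base and 2 skips the first two, which is why "frame => 1" is described as giving frame 2 in one-based terms.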

I hope this helps.

laurent.

On Tue, Apr 29, 2008 at 3:58 PM, Tristan Lefebure <
tristan.lefebure at gmail.com> wrote:

> Aren't you forgetting the dash?
>
> my $frame1_obj = $seqobj->translate(-frame => 1)
>
>
> On Tuesday 29 April 2008 08:21:05 Derek Gatherer wrote:
> > my $frame1_obj = $seqobj->translate(frame => 1)
>
>
>
> -Tristan

From roy.chaudhuri at gmail.com  Tue Apr 29 10:27:10 2008
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 29 Apr 2008 15:27:10 +0100
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>	<481726BF.1060609@bms.com>
	<E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>
Message-ID: <4817303E.1040903@gmail.com>

Spent two minutes looking at this, so may as well chip in with what I 
discovered even though you solved your problem.

This "bug" comes about because in version 1.5.1 and earlier, the 
arguments to translate were a simple list, with the first argument the 
terminator (defaults to "*"). Your old version therefore assumed that 
you wanted to translate the stop codon to "-frame". Amusingly given your 
typo, if you miss the hyphen off the frame argument in version 1.5.2 it 
reverts to the old interface and you end up with the output 
"MMMMMframe". The moral of the story is of course to read the docs 
relevant to the version you are using.
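
The behaviour described above can be mimicked with a toy in plain Perl. This is a sketch of the argument handling only, not BioPerl's source: old_style_args and new_style_args are hypothetical names, the protein is not computed at all, and the "first argument starts with a hyphen" test is an assumption about how the newer version tells the two calling styles apart.

```perl
use strict;
use warnings;

# Old interface as described above: a plain positional list
# (terminator, unknown, frame, ...). The first argument is the terminator.
sub old_style_args {
    my ($terminator, $unknown, $frame) = @_;
    $terminator = '*' unless defined $terminator;
    $frame      = 0   unless defined $frame;
    return "terminator=$terminator frame=$frame";
}

# Newer interface as described above: named parameters when the first
# argument looks like "-name" (assumption), otherwise fall back to the
# old positional handling.
sub new_style_args {
    my @args = @_;
    return old_style_args(@args) unless @args && $args[0] =~ /^-/;
    my %param      = @args;
    my $terminator = defined $param{-terminator} ? $param{-terminator} : '*';
    my $frame      = defined $param{-frame}      ? $param{-frame}      : 0;
    return "terminator=$terminator frame=$frame";
}

my @results = (
    old_style_args(-frame => 1),   # terminator=-frame frame=0
    new_style_args(-frame => 1),   # terminator=* frame=1
    new_style_args(frame  => 1),   # terminator=frame frame=0
);
print "$_\n" for @results;
```

The first and third lines correspond to the "MMMMM-frame" and "MMMMMframe" outputs: in both cases the word meant as a parameter name ends up as the stop-codon character and the frame stays at its default.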

Roy.
--
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.



From stefan.kirov at bms.com  Tue Apr 29 09:46:39 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Tue, 29 Apr 2008 09:46:39 -0400
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
Message-ID: <481726BF.1060609@bms.com>

my $frame1_obj = $seqobj->translate(-frame => 1);
not
my $frame1_obj = $seqobj->translate(frame => 1);
Stefan



From cjfields at uiuc.edu  Tue Apr 29 11:00:00 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 29 Apr 2008 10:00:00 -0500
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <4817303E.1040903@gmail.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>	<481726BF.1060609@bms.com>
	<E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>
	<4817303E.1040903@gmail.com>
Message-ID: <36045A08-AEA8-4639-A384-1DC53B5DC129@uiuc.edu>

Yes, the interface changed somewhat after 1.5.1, mainly to accept named  
parameters.  I think a few methods do this now, as passing in lists of  
more than two arguments, and undef'ing those one doesn't want set, gets  
confusing.

chris

On Apr 29, 2008, at 9:27 AM, Roy Chaudhuri wrote:

> Spent two minutes looking at this, so may as well chip in with what  
> I discovered even though you solved your problem.
>
> This "bug" comes about because in version 1.5.1 and earlier, the  
> arguments to translate were a simple list, with the first argument  
> the terminator (defaults to "*"). Your old version therefore assumed  
> that you wanted to translate the stop codon to "-frame". Amusingly  
> given your typo, if you miss the hyphen off the frame argument in  
> version 1.5.2 it reverts to the old interface and you end up with  
> the output "MMMMMframe". The moral of the story is of course to read  
> the docs relevant to the version you are using.
>
> Roy.
> --
> Dr. Roy Chaudhuri
> Department of Veterinary Medicine
> University of Cambridge, U.K.
>
> Derek Gatherer wrote:
>> Thanks Stefan
>> Actually, there was a typo in my message, I did use -frame => 1.   
>> However, the problem disappears on upgrading from 1.5.0 to 1.5.2.
>> So not a bug anymore.
>> Cheers
>> Derek
>> At 14:46 29/04/2008, Stefan Kirov wrote:
>>> my $frame1_obj = $seqobj->translate(-frame => 1);
>>> not
>>> my $frame1_obj = $seqobj->translate(frame => 1);
>>> Stefan

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Apr 29 11:07:30 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 29 Apr 2008 10:07:30 -0500
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <481726BF.1060609@bms.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
	<481726BF.1060609@bms.com>
Message-ID: <18DB95FB-52B9-4091-ACEE-996891F8A5AE@uiuc.edu>

As an aside, I've been playing around with perl6 (Rakudo) for a bit  
now.  Parameter-like passing (using autoaccessors and other means)  
will be added in soon, so you will be able to do this:

my $seqobj = Seq.new(seq => 'ATGATGATGATGATGTGA', alphabet => 'dna');
my $protobj = $seqobj.translate(frame => 1);

Yes, I'm a geek. ; >

chris

On Apr 29, 2008, at 8:46 AM, Stefan Kirov wrote:

> my $frame1_obj = $seqobj->translate(-frame => 1);
> not
> my $frame1_obj = $seqobj->translate(frame => 1);
> Stefan

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From dr.hogart at gmail.com  Tue Apr 29 11:57:51 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Tue, 29 Apr 2008 19:57:51 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
Message-ID: <op.uadqmpg8avnppr@hogart.img.ras.ru>

Hi all!

I am trying to run a T-Coffee alignment through the
Bio::Tools::Run::Alignment::TCoffee wrapper, called from a subroutine in my
script. This subroutine works fine, but it is not the only subroutine: there
are a lot of others in the script. The problem is that when the script
finishes executing the tcoffee subroutine (nb! successfully), compilation of
the rest of the script is also interrupted. It seems that the tcoffee
program itself interrupts the Perl compilation. Is it possible to get around
this problem?

-- 


From darin.london at duke.edu  Tue Apr 29 12:49:53 2008
From: darin.london at duke.edu (darin.london at duke.edu)
Date: Tue, 29 Apr 2008 12:49:53 -0400
Subject: [Bioperl-l] BOSC 2008 Announcement and Call For Submissions
Message-ID: <200804291650.m3TGnr0H020814@tenero.duhs.duke.edu>


BOSC 2008 Call for Abstracts Reminder

The 9th annual Bioinformatics Open Source Conference (BOSC 2008) will take place in Toronto, Ontario, Canada, as one of several Special Interest Group (SIG) meetings occurring in conjunction with the 16th annual Intelligent Systems for Molecular Biology Conference (ISMB 2008).

This is a reminder to submit your proposals for talks to the BOSC submission system before May 11.

Submission Process:
All abstracts must be submitted through our Open Conference Systems site (http://events.open-bio.org/BOSC2008/openconf.php).
The form will ask for a short abstract text to be pasted into it, and a full paper.  The short abstract text should be a summary, while the longer abstract should provide more details, including the open-source license requirement details.
Full-length abstracts are limited to one page with one inch (2.5 cm) margins on the top, sides, and bottom.  The full-length abstract should include the title, authors, and affiliations.  We prefer your abstract to be in PDF format, although plain t

Important Dates:
May 11: Abstract submission deadline.
June 2: Notification of accepted talks.
June 4: Early registration discount cut-off.
July 18-19: BOSC 2008!

We hope to see you at BOSC 2008!

Kam Dahlquist and Darin London
BOSC 2008 Co-organizers

			 
From bix at sendu.me.uk  Tue Apr 29 12:54:41 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 29 Apr 2008 17:54:41 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uadqmpg8avnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
Message-ID: <481752D1.7010904@sendu.me.uk>

sergei ryazansky wrote:
> I am trying to run a T-Coffee alignment through the
> Bio::Tools::Run::Alignment::TCoffee wrapper, called from a subroutine in
> my script. This subroutine works fine, but it is not the only subroutine:
> there are a lot of others in the script. The problem is that when the
> script finishes executing the tcoffee subroutine (nb! successfully),
> compilation of the rest of the script is also interrupted. It seems that
> the tcoffee program itself interrupts the Perl compilation. Is it
> possible to get around this problem?

You'll have to supply us with a minimal version of the script and the 
complete error message.

From dr.hogart at gmail.com  Wed Apr 30 07:24:35 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 15:24:35 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
Message-ID: <op.uae8m9tzavnppr@hogart.img.ras.ru>

My subroutine is as follows:

sub align {
	my $file=shift @_;
	my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' => 'fasta',  
'outfile' => 'temp_align.out');
	my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
	my $aln=$factory->align ($file);
	open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
	return @temp_file;
}

This subroutine is called by the following command:

my @align_fa = align($inputfile_align);

After this subroutine executes successfully (accompanied by the
corresponding messages in the terminal window), execution of the remainder
of the script terminates without any error messages.

-- 

From bix at sendu.me.uk  Wed Apr 30 08:47:17 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 30 Apr 2008 13:47:17 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uae8m9tzavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
Message-ID: <48186A55.4030406@sendu.me.uk>

sergei ryazansky wrote:
> My subroutine is as follows:
> 
> sub align {
>     my $file=shift @_;
>     my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' => 
> 'fasta', 'outfile' => 'temp_align.out');
>     my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
>     my $aln=$factory->align ($file);
>     open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
>     return @temp_file;
> }
> 
> This subroutine is called by the following command:
> 
> my @align_fa = align($inputfile_align);
> 
> After this subroutine executes successfully (accompanied by the
> corresponding messages in the terminal window), execution of the
> remainder of the script terminates without any error messages.

The problem lies somewhere within the rest of your script, so we have to 
see it if you want help.

Why are you using Bio::Tools::Run::Alignment::TCoffee at all if you 
don't make use of the resulting alignment object? A system call might 
make more sense given what you're doing. The beauty of 
Bio::Tools::Run::Alignment::TCoffee is that you don't have to parse the 
result file (temp_align.out) yourself.

From dr.hogart at gmail.com  Wed Apr 30 09:36:58 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 17:36:58 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
Message-ID: <op.uaferwytavnppr@hogart.img.ras.ru>

On Wed, 30 Apr 2008 16:47:17 +0400, Sendu Bala <bix at sendu.me.uk> wrote:

> The problem lies somewhere within the rest of your script, so we have to  
> see it if you want help.
>
> Why are you using Bio::Tools::Run::Alignment::TCoffee at all if you  
> don't make use of the resulting alignment object? A system call might  
> make more sense given what you're doing. The beauty of  
> Bio::Tools::Run::Alignment::TCoffee is that you don't have to parse the  
> result file (temp_align.out) yourself.

The rest of the script, IMHO, is OK, because without this sub it works fine.
Maybe the problem lies in TCoffee itself?

One of the features of the script is to estimate the number of nt changes
at each position in a set of similar sequences compared with the consensus
sequence. To do this it is necessary to obtain a multiple alignment: the
result of the TCoffee alignment goes to another subroutine, which estimates
the level of changes. Of course, I don't think that this is the best
approach; most probably there are a lot of better ways to do it. But for my
present purposes it is OK.
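
The per-position estimate described above could be sketched roughly as
follows. This is only an illustration in plain Perl, not the code from the
script under discussion and not a BioPerl API: the function name
changes_per_column is hypothetical, the consensus is assumed to be the
most frequent character in each column, and gaps are counted like any
other symbol.

```perl
use strict;
use warnings;

# For every alignment column, count how many sequences differ from the
# majority (consensus) character in that column.
# Input: a list of aligned strings, all of the same length.
sub changes_per_column {
    my @rows = @_;
    return () unless @rows;
    my $width = length $rows[0];
    my @changes;
    for my $col (0 .. $width - 1) {
        my %count;
        $count{ substr($_, $col, 1) }++ for @rows;
        # Most frequent character; ties are broken alphabetically so the
        # result is deterministic.
        my ($consensus) =
            sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
        push @changes, scalar(@rows) - $count{$consensus};
    }
    return @changes;
}

my @aligned = ('ATGCA', 'ATGCA', 'ATCCA', 'TTGC-');
print join(' ', changes_per_column(@aligned)), "\n";   # prints 1 0 1 0 1
```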

-- 

From avilella at gmail.com  Wed Apr 30 10:16:56 2008
From: avilella at gmail.com (Albert Vilella)
Date: Wed, 30 Apr 2008 15:16:56 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uaferwytavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru> <48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
Message-ID: <358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>

Hi Sergei,

Can you try to isolate this call with a simpler example to see if it still
fails? When you say that the problems are in the compilation, do you mean
that the interpreter won't even compile or that it fails during execution?
Have you checked that you have all the dependencies right?

Cheers,

    Albert.


From bix at sendu.me.uk  Wed Apr 30 10:22:01 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 30 Apr 2008 15:22:01 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uaferwytavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>	<op.uae8m9tzavnppr@hogart.img.ras.ru>	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
Message-ID: <48188089.8000300@sendu.me.uk>

sergei ryazansky wrote:
> On Wed, 30 Apr 2008 16:47:17 +0400, Sendu Bala <bix at sendu.me.uk> wrote:
> 
>> The problem lies somewhere within the rest of your script, so we have 
>> to see it if you want help.
> 
> The rest of the script, IMHO, is OK, because without this sub it works
> fine. Maybe the problem lies in TCoffee itself?

I've run your subroutine in a simple script of my own and it doesn't 
cause script termination. Again, the problem lies elsewhere in your 
script. Supply it or it is impossible for anyone to help you.

From Sebastien.Moretti at unil.ch  Wed Apr 30 10:06:28 2008
From: Sebastien.Moretti at unil.ch (Sebastien MORETTI)
Date: Wed, 30 Apr 2008 16:06:28 +0200
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uaferwytavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>	<op.uae8m9tzavnppr@hogart.img.ras.ru>	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
Message-ID: <48187CE4.8030606@unil.ch>

>> The problem lies somewhere within the rest of your script, so we have 
>> to see it if you want help.
>>
>> Why are you using Bio::Tools::Run::Alignment::TCoffee at all if you 
>> don't make use of the resulting alignment object? A system call might 
>> make more sense given what you're doing. The beauty of 
>> Bio::Tools::Run::Alignment::TCoffee is that you don't have to parse 
>> the result file (temp_align.out) yourself.
> 
> The rest of the script, IMHO, is OK, because without this sub it works
> fine. Maybe the problem lies in TCoffee itself?
>
> One of the features of the script is to estimate the number of nt changes
> at each position in a set of similar sequences compared with the
> consensus sequence. To do this it is necessary to obtain a multiple
> alignment: the result of the TCoffee alignment goes to another
> subroutine, which estimates the level of changes. Of course, I don't think
> that this is the best approach; most probably there are a lot of
> better ways to do it. But for my present purposes it is OK.

Have you tried running the tcoffee command that bioperl calls directly on
the command line?
That would show whether the problem is with tcoffee itself or with the
tcoffee release that bioperl has to use.
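
If it helps, the same check can also be done from inside Perl by running
the external program with system() and decoding $?. The sketch below is
only an illustration: the helper name run_and_report is hypothetical, and
the demonstration runs the Perl interpreter itself ($^X) rather than
t_coffee, whose exact command line depends on the installation.

```perl
use strict;
use warnings;

# Run an external command (list form, so no shell is involved) and
# describe how it ended, based on system()'s return value and $?.
sub run_and_report {
    my @cmd = @_;
    my $rc = system(@cmd);
    return "failed to start: $!" if $rc == -1;
    return sprintf("killed by signal %d", $? & 127) if $? & 127;
    return sprintf("exit status %d", $? >> 8);
}

# Stand-in command: a child perl process that exits with status 3.
print run_and_report($^X, '-e', 'exit 3'), "\n";   # prints exit status 3
```

A nonzero exit status or a signal reported here would point at the
external program rather than at the calling script.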

-- 
Sébastien Moretti


From dr.hogart at gmail.com  Wed Apr 30 10:54:59 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 18:54:59 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
Message-ID: <op.uafidxitavnppr@hogart.img.ras.ru>

Hi Albert,

The isolated call executes without any problem, so the code itself is
correct. The problem arises when this sub is executed within the whole
script: after the TCoffee alignment completes successfully, execution of
the rest of the script is terminated. The whole code is very big
(~500 lines), so for simplicity let's picture the scheme of the script as
follows:
sub1;
sub2;
sub3;
sub align;  # TCoffee alignment;
sub4;
sub5;

Each sub (subroutine) is independent of the other subs. The order of
script execution is 1, 2, 3, align, 4, 5. But after align executes, the
rest of the subs (4 and 5) are never executed. The script without
sub align {} successfully executes sub 4 and sub 5. So I mean that the
interpreter won't compile sub 4 and sub 5 if sub align is placed before them.



-- 


From dr.hogart at gmail.com  Wed Apr 30 11:14:09 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 19:14:09 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru> <48187CE4.8030606@unil.ch>
Message-ID: <op.uafi7ytravnppr@hogart.img.ras.ru>

No, I haven't tried that.
To tell the truth, I ran into a problem like this earlier. I simply wanted
to align several sets of sequences with the TCoffee Bioperl package. The
script was supposed to pass the sets one after another to the TCoffee
wrapper. But after the first set of sequences was aligned, the alignment of
the remaining sets was terminated. So it was necessary to use another
"super_script" that called the first script with different arguments, one
for each set.


> Have you tried running the tcoffee command that bioperl calls directly on
> the command line?


-- 


From bix at sendu.me.uk  Wed Apr 30 11:28:50 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 30 Apr 2008 16:28:50 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uafidxitavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>	<op.uae8m9tzavnppr@hogart.img.ras.ru>	<48186A55.4030406@sendu.me.uk>	<op.uaferwytavnppr@hogart.img.ras.ru>	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru>
Message-ID: <48189032.20102@sendu.me.uk>

sergei ryazansky wrote:
> Hi Albert,
> 
> The isolated call executes without any problem, so the code itself is
> correct. The problem arises when this sub is executed within the whole
> script: after the TCoffee alignment completes successfully, execution of
> the rest of the script is terminated. The whole code is very big
> (~500 lines), so for simplicity let's picture the scheme of the script as
> follows:
> sub1;
> sub2;
> sub3;
> sub align;  # TCoffee alignment;
> sub4;
> sub5;
>
> Each sub (subroutine) is independent of the other subs. The order of
> script execution is 1, 2, 3, align, 4, 5. But after align executes, the
> rest of the subs (4 and 5) are never executed. The script without
> sub align {} successfully executes sub 4 and sub 5. So I mean that the
> interpreter won't compile sub 4 and sub 5 if sub align is placed before
> them.

This has nothing to do with interpreter compilation, which is successful 
if the script runs at all.

What do you do with the output of &align? The thing you are doing with 
that output is most likely the cause of your script terminating, which 
is why &sub4 and &sub5 run when you don't run &align (have no output 
that causes the problem).

If you're not willing to show us your script, here are some simple 
debugging steps you can do yourself:

# don't do anything with the output of align() - does &sub4 still run?

# add some print statements after you call align(), and then after every 
further block of code in your script to see exactly where the script 
terminates

# reduce your script down to a minimal script that shows the problem 
(with the help of the previous step) and show us that
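
As a rough sketch of the print-statement approach, something along these
lines may help to narrow down where a script stops. The names checkpoint,
step_align and step_four are hypothetical stand-ins, not parts of the
script being discussed, and the END block assumes the process ends through
exit or die rather than being killed by a signal.

```perl
use strict;
use warnings;

$| = 1;    # unbuffer STDOUT so it interleaves correctly with STDERR

my @reached;    # labels of the checkpoints passed so far

# Record a checkpoint and report it on STDERR.
sub checkpoint {
    my ($label) = @_;
    push @reached, $label;
    print STDERR "[checkpoint] $label\n";
    return;
}

# Runs when the interpreter shuts down, including after exit() or die,
# and reports the last checkpoint that was reached.
END {
    my $last = @reached ? $reached[-1] : '(none)';
    print STDERR "[exit] status $?, last checkpoint: $last\n";
}

# Hypothetical stand-ins for the real steps of the script.
sub step_align { return ('>seq1', 'ATG-') }
sub step_four  { return scalar @_ }

checkpoint('before align');
my @align_fa = step_align();
checkpoint('after align');
my $count = step_four(@align_fa);
checkpoint('after sub4');
```

If the "[exit]" line names a checkpoint earlier than the last one in the
script, the code between that checkpoint and the next is where to look.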

From dr.hogart at gmail.com  Wed Apr 30 11:42:41 2008
From: dr.hogart at gmail.com (Sergei Ryazansky)
Date: Wed, 30 Apr 2008 19:42:41 +0400
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafkhojw9ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
Message-ID: <op.uafklfmd9ju7si@hogart.img.ras.ru>


------- Forwarded message -------
From: "Sergei Ryazansky" <dr.hogart at gmail.com>
To: "Sendu Bala" <bix at sendu.me.uk>
Cc:
Subject: Re: [Bioperl-l] alignment by TCoffee as a subroutine
Date: Wed, 30 Apr 2008 19:40:26 +0400

> What do you do with the output of &align? The thing you are doing with  
> that output is most likely the cause of your script terminating, which  
> is why &sub4 and &sub5 run when you don't run &align (have no output  
> that causes the problem).

Please see my answer to Sebastien Moretti, which describes another,
similar problem. The only thing that I did with the output there was
print it to a file. Nevertheless, the problem was the same.

> # don't do anything with the output of align() - does &sub4 still run?

Please see above.

> # add some print statements after you call align(), and then after every  
> further block of code in your script to see exactly where the script  
> terminates
> # reduce your script down to a minimal script that shows the problem  
> (with the help of the previous step) and show us that

All tests with the individual blocks were performed earlier. The results are OK.


From cjfields at uiuc.edu  Wed Apr 30 12:25:06 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 30 Apr 2008 11:25:06 -0500
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafklfmd9ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
Message-ID: <5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>

Sergei,

I agree with Sendu; we can't diagnose this unless we have either the
entire script or a minimal version of it demonstrating the bug.

The best way to handle this is to file a bug report, attaching  
relevant data using the 'Create a new attachment' link (including  
either the full script or a shortened one which demonstrates the bug).  
Otherwise we're just shooting in the dark trying to diagnose the  
problem.

http://bugzilla.open-bio.org/

chris

On Apr 30, 2008, at 10:42 AM, Sergei Ryazansky wrote:

>
>
> ------- Forwarded message -------
> From: "Sergei Ryazansky" <dr.hogart at gmail.com>
> To: "Sendu Bala" <bix at sendu.me.uk>
> Cc:
> Subject: Re: [Bioperl-l] alignment by TCoffee as a subroutine
> Date: Wed, 30 Apr 2008 19:40:26 +0400
>
>> What do you do with the output of &align? The thing you are doing  
>> with that output is most likely the cause of your script  
>> terminating, which is why &sub4 and &sub5 run when you don't run  
>> &align (have no output that causes the problem).
>
> Please see my answer to Sebastien Moretti - it describes
> another similar problem. The only thing I did there with the output
> was print it to a file. Nevertheless, the problem was the same.
>
>> # don't do anything with the output of align() - does &sub4 still  
>> run?
>
> Please see above.
>
>> # add some print statements after you call align(), and then after  
>> every further block of code in your script to see exactly where the  
>> script terminates
>> # reduce your script down to a minimal script that shows the  
>> problem (with the help of the previous step) and show us that
>
> All tests with individual blocks were performed earlier; the results
> were OK.
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From dr.hogart at gmail.com  Wed Apr 30 12:40:19 2008
From: dr.hogart at gmail.com (Sergei Ryazansky)
Date: Wed, 30 Apr 2008 20:40:19 +0400
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
Message-ID: <op.uafm9hl79ju7si@hogart.img.ras.ru>

On Wed, 30 Apr 2008 20:25:06 +0400, Chris Fields <cjfields at uiuc.edu> wrote:

Chris, I have already sent the file to Sendu, and I am also attaching it here.
I have removed the truly unnecessary parts from it.

> Sergei,
>
> I agree with Sendu; we can't diagnose this unless we have either the
> entire script or a minimal version of it demonstrating the bug.
>
> The best way to handle this is to file a bug report, attaching relevant  
> data using the 'Create a new attachment' link (including either the full  
> script or a shortened one which demonstrates the bug). Otherwise we're  
> just shooting in the dark trying to diagnose the problem.
>
> http://bugzilla.open-bio.org/
>
> chris
-------------- next part --------------
A non-text attachment was scrubbed...
Name: script.pl
Type: application/octet-stream
Size: 6870 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20080430/6aef0fde/attachment.obj>

From cjfields at uiuc.edu  Wed Apr 30 13:02:19 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 30 Apr 2008 12:02:19 -0500
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafm9hl79ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
	<op.uafm9hl79ju7si@hogart.img.ras.ru>
Message-ID: <EBC881E4-8F1A-4396-8EC9-1FB17681F5D2@uiuc.edu>

Hmm, maybe you were confused?  From my last email:

"The best way to handle this is to file a bug report, attaching  
relevant data using the 'Create a new attachment' link (including  
either the full script or a shortened one which demonstrates the bug).  
Otherwise we're just shooting in the dark trying to diagnose the  
problem."

http://bugzilla.open-bio.org/

Anyone can work on fixing the issue there (so it'll probably get fixed  
faster).  The devs can also track progress on the problem via the dev  
mail list (bioperl-guts).  Diagnosing the bug may also reveal issues  
not just with Bio::Tools::Run::Alignment::TCoffee but also with other  
related modules.

If needed I can post it to bugzilla, but it helps to submit the bug
yourself (so you can receive posts on its progress).

chris

On Apr 30, 2008, at 11:40 AM, Sergei Ryazansky wrote:

> On Wed, 30 Apr 2008 20:25:06 +0400, Chris Fields <cjfields at uiuc.edu>  
> wrote:
>
> Chris, I have already sent the file to Sendu, and I am also attaching it
> here. I have removed the truly unnecessary parts from it.
>
>> Sergei,
>>
>> I agree with Sendu; we can't diagnose this unless we have either
>> the entire script or a minimal version of it demonstrating the bug.
>>
>> The best way to handle this is to file a bug report, attaching  
>> relevant data using the 'Create a new attachment' link (including  
>> either the full script or a shortened one which demonstrates the  
>> bug). Otherwise we're just shooting in the dark trying to diagnose  
>> the problem.
>>
>> http://bugzilla.open-bio.org/
>>
>> chris

From dr.hogart at gmail.com  Wed Apr 30 13:39:35 2008
From: dr.hogart at gmail.com (Sergei Ryazansky)
Date: Wed, 30 Apr 2008 21:39:35 +0400
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafop6079ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
	<op.uafm9hl79ju7si@hogart.img.ras.ru>
	<EBC881E4-8F1A-4396-8EC9-1FB17681F5D2@uiuc.edu>
	<op.uafop6079ju7si@hogart.img.ras.ru>
Message-ID: <op.uafpz9n79ju7si@hogart.img.ras.ru>

On Wed, 30 Apr 2008 21:11:56 +0400, Sergei Ryazansky <dr.hogart at gmail.com>  
wrote:

> Oh, sorry, you are right - I read your message too fast. I will do it a
> little later.


From cjfields at uiuc.edu  Wed Apr 30 14:29:28 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 30 Apr 2008 13:29:28 -0500
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafpz9n79ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
	<op.uafm9hl79ju7si@hogart.img.ras.ru>
	<EBC881E4-8F1A-4396-8EC9-1FB17681F5D2@uiuc.edu>
	<op.uafop6079ju7si@hogart.img.ras.ru>
	<op.uafpz9n79ju7si@hogart.img.ras.ru>
Message-ID: <39A139E4-6783-41E6-8EE9-1FE60CB57577@uiuc.edu>

Sorry, didn't catch that...

chris

On Apr 30, 2008, at 12:39 PM, Sergei Ryazansky wrote:

> On Wed, 30 Apr 2008 21:11:56 +0400, Sergei Ryazansky
> <dr.hogart at gmail.com> wrote:
>
>> Oh, sorry, you are right - I read your message too fast. I will do it a
>> little later.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Bank.Beszteri at awi.de  Tue Apr  1 08:31:49 2008
From: Bank.Beszteri at awi.de (Bánk Beszteri)
Date: Tue, 01 Apr 2008 14:31:49 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
Message-ID: <47F22B35.1030502@awi.de>

Dear list,

we have recently started to try to find a solution for indexing large 
sequence databases / flat files for a java project, and because we ran 
into problems using biojava, and because both the OBDA and BioSQL ways 
seem to be compatible across bio~ projects, we also started to 
experiment with bioperl. It looks like this should work fine, but we had 
a couple of problems here, too. Perhaps some of you can give me a hint
about what we are doing wrong!

The first thing we tried was to use Bio::DB::Flat for indexing a TrEMBL
flat file (~ 12 GB); but it seems we haven't got a machine with enough
memory to be able to handle this. (Perhaps you would be using the "bdb"
style index in such a case in bioperl, but this apparently doesn't work
with biojava, so we had to stick with "flat"). So next we started to
test BioSQL, by trying to load just Swissprot in a MySQL DB first, like:

load_seqdatabase.pl --host mysql.awi.de --dbname biosql2 --dbuser xyz 
--dbpass abc --driver mysql --namespace uniprot_sprot --format swiss 
uniprot_sprot.dat

Here we get an error message

###########################################

Loading /biodb/spinkern/uniprot_sprot.dat ...
Could not store Q6DAH5:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: The supplied lineage does not start near 'Erwinia carotovora subsp. 
atroseptica' (I was supplied 'Erwinia carotovora subsp. | Pectobacterium 
| Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | 
Proteobacteria | Bacteria')
STACK: Error::throw
STACK: Bio::Root::Root::throw /biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Root/Root.pm:359
STACK: Bio::Species::classification /biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Species.pm:174
STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:552
STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/SpeciesAdaptor.pm:281
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1305
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:973
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:852
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182
STACK: Bio::DB::Persistent::PersistentObject::create /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:244
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
STACK: Bio::DB::Persistent::PersistentObject::store /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:271
STACK: load_seqdatabase.pl:622
-----------------------------------------------------------

at load_seqdatabase.pl line 635

############################################

or similar, depending on whether we use a pre-loaded NCBI taxonomy or
not, and which Swissprot release we are trying to load. It often seems
to be triggered by something like, as here, a "subsp." or other special
addition to the species line; but alternative genus names and other
oddities also appear. It looks like Species.pm tries to validate the
species name against the lineage info already in the BioSQL DB, and in
several cases it finds inconsistencies. If we start with the NCBI
taxonomy already loaded in the database, the first error comes much
earlier.

I found a thread on the same problem from ~ two years ago 
(http://thread.gmane.org/gmane.comp.lang.perl.bio.general/13766/focus=13788), 
where the solution recommended was to update bioperl, so I was quite 
surprised to find the problem with the version you can see above 
(1.5.2_102 bioperl core, 1.5.2_100 bioperl_db). Can someone give me any 
hints as to what is going wrong here?

The only workaround we have found so far was to comment out line 174 in 
Species.pm:

$self->throw("The supplied lineage does not start near '$name' (I was 
supplied '".join(" | ", @vals)."')");

After doing so, load_seqdatabase.pl runs for several hours (until it
eventually crashes; I haven't found out why yet), but proceeds really
slowly. I also found some info on this for Pg and Oracle in the mailing
list, but does anyone have approximate numbers for MySQL: how long should
a first Swissprot load take?

Would be grateful to hear about your ideas / experiences on these issues!

Bank Beszteri


Bioinformatics / Scientific Computing
Alfred Wegener Institute for Polar and Marine Research
Am Handelshafen 12.
27570 Bremerhaven
Germany


From cjfields at uiuc.edu  Tue Apr  1 20:45:28 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 1 Apr 2008 19:45:28 -0500
Subject: [Bioperl-l] quick update on bioperl nightly builds
Message-ID: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>

I'm simplifying the nightly build archive names (removing svn revision  
# and date) in case anyone needs to update bioperl-live/run/db/network  
on a regular basis (read: GBrowse installations).  When I have time  
I'll start working on automated builds, which will require some extra  
work with Module::Build and Build.PL.

chris


From hiekeen at gmail.com  Tue Apr  1 22:14:07 2008
From: hiekeen at gmail.com (Jinyan Huang)
Date: Wed, 2 Apr 2008 10:14:07 +0800
Subject: [Bioperl-l] How to make a network graphic using my genes in
	pathways?
Message-ID: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>

I have 20 pathways that contain my genes of interest. Some genes are
shared between pathways. How can I draw a network graph from these
genes, that is, connect the pathways through the genes they share?
What kind of software can I use?

Thank you very much in advance.

-- 
Best regards,
Jinyan Huang (ekeen)
School of Life Sciences and Technology, 1302 Room
Tongji University
Siping Road 1239, Shanghai 200092
P.R. China
Tel :0086-21-65981041
Msn: hiekeen at hotmail.com
eMail: hiekeen at gmail.com


From hlapp at gmx.net  Tue Apr  1 22:30:06 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 1 Apr 2008 22:30:06 -0400
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47F22B35.1030502@awi.de>
References: <47F22B35.1030502@awi.de>
Message-ID: <CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>


On Apr 1, 2008, at 8:31 AM, Bánk Beszteri wrote:
> [...] So next we started to test BioSQL, by trying to load just  
> Swissprot in a MySQL DB first, like:
>
> load_seqdatabase.pl --host mysql.awi.de --dbname biosql2 --dbuser  
> xyz --dbpass abc --driver mysql --namespace uniprot_sprot --format  
> swiss uniprot_sprot.dat
>
> Here we get an error message
>
> ###########################################
>
> Loading /biodb/spinkern/uniprot_sprot.dat ...
> Could not store Q6DAH5:
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: The supplied lineage does not start near 'Erwinia carotovora  
> subsp. atroseptica' (I was supplied 'Erwinia carotovora subsp. |  
> Pectobacterium | Enterobacteriaceae | Enterobacteriales |  
> Gammaproteobacteria | Proteobacteria | Bacteria')
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /biodb/spinkern/bioperl-1.5/ 
> bioperl-1.5.2_102/Bio/Root/Root.pm:359
> STACK: Bio::Species::classification /biodb/spinkern/bioperl-1.5/ 
> bioperl-1.5.2_102/Bio/Species.pm:174
> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm: 
> 552
> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/SpeciesAdaptor.pm:281
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object / 
> biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:1305
> STACK:  
> Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:973
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key / 
> biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:852
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:182
> STACK: Bio::DB::Persistent::PersistentObject::create /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm: 
> 244
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:169
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:251
> STACK: Bio::DB::Persistent::PersistentObject::store /biodb/spinkern/ 
> bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:271
> STACK: load_seqdatabase.pl:622
> -----------------------------------------------------------
>
> at load_seqdatabase.pl line 635
>
> ############################################
>
> or similar, depending on whether we use a pre-loaded ncbi taxonomy  
> or not

I recommend always using a pre-loaded NCBI taxonomy unless you know
there are only a few organisms that are straightforward (for the
parser, that is).

> , and which Swissprot release we are trying to load. It often seems
> to be triggered by something like, as here, a "subsp." or other special
> addition to the species line; but alternative genus names and other
> oddities also appear. It looks like Species.pm tries to validate the
> species name against the lineage info already in the BioSQL DB, and
> in several cases it finds inconsistencies.

It actually happens upon a successful lookup when the species object  
is populated from the database.

> [...]
> The only workaround we have found so far was to comment out line  
> 174 in Species.pm:
>
> $self->throw("The supplied lineage does not start near '$name' (I  
> was supplied '".join(" | ", @vals)."')");

That should be OK if you work with a pre-loaded taxonomy. It's sort  
of a sanity check that should catch a parser having messed up a  
species. If you use a pre-loaded NCBI taxonomy the results of the  
species parsing don't matter in all details so long as the NCBI  
taxonID is parsed out correctly, and then found in the database.

Note that this is actually a warn() in the main trunk version of
BioPerl, so you might want to upgrade to that (or change throw() to
warn() in your version). The affected records still get flagged, but
the message is no longer a fatal exception.
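
For anyone who would rather patch an installed copy than upgrade, the
throw()-to-warn() change could be applied along these lines. This is only a
sketch: the function name patch_lineage_check is made up for illustration,
and it assumes the statement sits on a single line that contains the message
text. It writes a patched copy so the result can be diffed before installing.

```shell
# Hypothetical helper (not part of BioPerl): copy Species.pm and turn the
# lineage sanity check from throw() into warn().
patch_lineage_check() {
    src=$1   # path to the original Species.pm
    dst=$2   # path for the patched copy
    # Only the line carrying the lineage message is touched; every other
    # throw() call in the file is left alone.
    sed '/The supplied lineage does not start near/s/\$self->throw(/$self->warn(/' "$src" > "$dst"
}

# Usage (paths are placeholders):
#   patch_lineage_check Bio/Species.pm /tmp/Species.pm.patched
#   diff -u Bio/Species.pm /tmp/Species.pm.patched
```

If the diff shows no change, the statement is probably formatted differently
in your checkout and needs to be edited by hand.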

>
> After doing so, load_seqdatabase.pl runs for several hours (until
> it eventually crashes; I haven't found out why yet), but proceeds
> really slowly.

It should certainly *not* crash. Note also that you can supply --safe  
on the command line, in which case the script will continue with the  
next record if one fails to load for whatever reason.

You will want to adjust the width constraint of dbxref.accession, for  
example to 128 chars. This will also be fixed for BioSQL 1.0.1.
See http://bugzilla.open-bio.org/show_bug.cgi?id=2474
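
On MySQL the width change might look like the statement below. This is a
sketch, not the official migration: the helper name is invented, and the
column attributes (VARCHAR ... BINARY NOT NULL) are an assumption about the
BioSQL MySQL schema, so compare against the output of SHOW CREATE TABLE dbxref
before running it. The function only prints the SQL.

```shell
# Hypothetical helper: emit the ALTER TABLE statement for a given width.
widen_accession_sql() {
    width=${1:-128}   # new column width; 128 is the value suggested above
    printf 'ALTER TABLE dbxref MODIFY accession VARCHAR(%s) BINARY NOT NULL;\n' "$width"
}

# Usage (connection details are the placeholders from the original post):
#   widen_accession_sql 128 | mysql --host mysql.awi.de --user xyz -p biosql2
```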


> I also found some info on this for Pg and Oracle in the mailing
> list, but does anyone have approximate numbers for MySQL: how long
> should a first Swissprot load take?

Possibly around 20 hours according to Erik Rijkers:
See http://lists.open-bio.org/pipermail/bioperl-l/2008-March/027427.html

You can use the --logchunks N option to have it print out performance  
statistics every N records.
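
Put together with the --safe option mentioned earlier in this message, the
invocation might look like the dry-run sketch below. It only prints the
command instead of running it; the connection options are the placeholders
from the original post, the function name is made up, and the chunk size of
1000 is an arbitrary choice.

```shell
# Dry-run sketch: print the loader invocation rather than executing it.
print_load_cmd() {
    chunk=${1:-1000}   # report statistics every $chunk records (arbitrary default)
    printf '%s ' load_seqdatabase.pl \
        --host mysql.awi.de --dbname biosql2 --dbuser xyz --dbpass abc \
        --driver mysql --namespace uniprot_sprot --format swiss \
        --safe --logchunks "$chunk" uniprot_sprot.dat
    printf '\n'
}

# Usage:
#   print_load_cmd 1000        # inspect the command
#   eval "$(print_load_cmd)"   # run it once it looks right
```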

Hope this helps,

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Tue Apr  1 22:38:12 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 1 Apr 2008 22:38:12 -0400
Subject: [Bioperl-l] Very basic implementation of GenBank XML SeqIO
	module
In-Reply-To: <47F13C2C.4070909@umdnj.edu>
References: <47F13C2C.4070909@umdnj.edu>
Message-ID: <DBDEDED2-656B-4CFD-B603-C0868ED5DAD9@gmx.net>

Ryan - do you not have a committer account?

I do agree with Chris on the test. Modules w/o tests tend to become  
'pseudogenized.'

	-hilmar

On Mar 31, 2008, at 3:31 PM, Ryan Golhar wrote:
> I have a (very) basic SAX implementation of a SeqIO module to parse  
> GenBank XML records.  Right now, it only reads in basic information  
> regarding the sequence and the sequence itself.
>
> It does not yet parse the features table.  Should I submit it to be  
> included in bioperl or wait until I implement more for the features  
> table?  I'm not sure when I'll get around to it though
>
> Ryan

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cain.cshl at gmail.com  Tue Apr  1 23:12:04 2008
From: cain.cshl at gmail.com (Scott Cain)
Date: Tue, 01 Apr 2008 23:12:04 -0400
Subject: [Bioperl-l] quick update on bioperl nightly builds
In-Reply-To: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
Message-ID: <1207105924.6184.4.camel@frissell>

Hi Chris,

The tarball is currently (Apr 1) being built in a tmp directory, so that
the extracted tarball is ./tmp/bioperl-live/.  Is that intended?

Thanks,
Scott

On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
> I'm simplifying the nightly build archive names (removing svn revision  
> # and date) in case anyone needs to update bioperl-live/run/db/network  
> on a regular basis (read: GBrowse installations).  When I have time  
> I'll start working on automated builds, which will require some extra  
> work with Module::Build and Build.PL.
> 
> chris
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cjfields at uiuc.edu  Tue Apr  1 23:59:30 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 1 Apr 2008 22:59:30 -0500
Subject: [Bioperl-l] quick update on bioperl nightly builds
In-Reply-To: <1207105924.6184.4.camel@frissell>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
Message-ID: <D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>

Nope, that isn't intended.  I fixed it and reran it manually, so it  
should be fine now (note I didn't update the log file; the next cron  
run will catch that).

I may toy around with your recent passthrough flag addition to try
getting automated PPMs up and running.

chris

On Apr 1, 2008, at 10:12 PM, Scott Cain wrote:

> Hi Chris,
>
> The tarball is currently (Apr 1) being built in a tmp directory, so  
> that
> the extracted tarball is ./tmp/bioperl-live/.  Is that intended?
>
> Thanks,
> Scott
>
> On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
>> I'm simplifying the nightly build archive names (removing svn  
>> revision
>> # and date) in case anyone needs to update bioperl-live/run/db/ 
>> network
>> on a regular basis (read: GBrowse installations).  When I have time
>> I'll start working on automated builds, which will require some extra
>> work with Module::Build and Build.PL.
>>
>> chris
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                         cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                      
> 216-392-3087
> Cold Spring Harbor Laboratory


From sdavis2 at mail.nih.gov  Wed Apr  2 07:33:38 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 2 Apr 2008 07:33:38 -0400
Subject: [Bioperl-l] How to make a network graphic using my genes in
	pathways?
In-Reply-To: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
Message-ID: <264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>

On Tue, Apr 1, 2008 at 10:14 PM, Jinyan Huang <hiekeen at gmail.com> wrote:
> I have 20 pathways that contain my genes of interest. Some genes are
>  shared between pathways. How can I draw a network graph from these
>  genes, that is, connect the pathways through the genes they share?
>  What kind of software can I use?

R/Bioconductor has tools for working with graphs and pathways.
Cytoscape is another open-source graphical solution.  Ingenuity is, of
course, not free.  If you are looking at a perl solution, you can look
at the various graph modules and their integration with the Graphviz
libraries.
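
If you go the Graphviz route, one low-tech option is to write the
pathway/gene memberships as a bipartite graph in DOT and let Graphviz lay it
out; genes that sit in several pathways then become the nodes linking them.
A rough sketch, assuming a two-column tab-separated input (pathway, gene);
the function name and the file names in the usage line are made up:

```shell
# Hypothetical helper: turn "pathway<TAB>gene" lines into an undirected
# DOT graph with one edge per membership. Reads files given as arguments,
# or standard input if there are none.
pathways_to_dot() {
    awk -F '\t' '
        BEGIN   { print "graph pathways {" }
        NF == 2 { printf "  \"%s\" -- \"%s\";\n", $1, $2 }   # skip malformed lines
        END     { print "}" }
    ' "$@"
}

# Usage (requires Graphviz for the rendering step):
#   pathways_to_dot memberships.tsv | dot -Tpng -o pathways.png
```

A gene listed under two pathways gets two edges, so the shared genes show up
as the bridges between pathway nodes.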

Sean


From cain.cshl at gmail.com  Wed Apr  2 08:28:22 2008
From: cain.cshl at gmail.com (Scott Cain)
Date: Wed, 02 Apr 2008 08:28:22 -0400
Subject: [Bioperl-l] [Gmod-gbrowse] quick update on bioperl
	nightly	builds
In-Reply-To: <D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
	<D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
Message-ID: <1207139302.6507.7.camel@frissell>

Hi Chris,

(trimmed out gbrowse mailing list since this is just bioperl business)

Speaking of the pass through stuff, Sendu mentioned that I stomped on
some changes to Build.PL that you and he did when I committed that
change, so it should be rolled back.  Is there a good (svn) way to do
that?  Or should I just copy the contents of the old (good) Build.PL
into a fresh file in my checkout and commit it?

Thanks,
Scott

On Tue, 2008-04-01 at 22:59 -0500, Chris Fields wrote:
> Nope, that isn't intended.  I fixed it and reran it manually, so it  
> should be fine now (note I didn't update the log file; the next cron  
> run will catch that).
> 
> I may toy around with your recent passthrough flag addition to try
> getting automated PPMs up and running.
> 
> chris
> 
> On Apr 1, 2008, at 10:12 PM, Scott Cain wrote:
> 
> > Hi Chris,
> >
> > The tarball is currently (Apr 1) being built in a tmp directory, so  
> > that
> > the extracted tarball is ./tmp/bioperl-live/.  Is that intended?
> >
> > Thanks,
> > Scott
> >
> > On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
> >> I'm simplifying the nightly build archive names (removing svn  
> >> revision
> >> # and date) in case anyone needs to update bioperl-live/run/db/ 
> >> network
> >> on a regular basis (read: GBrowse installations).  When I have time
> >> I'll start working on automated builds, which will require some extra
> >> work with Module::Build and Build.PL.
> >>
> >> chris
> > -- 
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                         cain at cshl.edu
> > GMOD Coordinator (http://www.gmod.org/)                      
> > 216-392-3087
> > Cold Spring Harbor Laboratory
> _______________________________________________
> Gmod-gbrowse mailing list
> Gmod-gbrowse at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From robert.citek at gmail.com  Wed Apr  2 08:24:06 2008
From: robert.citek at gmail.com (Robert Citek)
Date: Wed, 2 Apr 2008 07:24:06 -0500
Subject: [Bioperl-l] module for pubchem queries
Message-ID: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>

Hello all,

I have a list of chemical compounds that have some kind of interaction
with proteins or genes.  The current list contains names or SMILES and
I would like to get the CID number for those compounds.  Currently,
I'm using perl to query the NCBI's eutils[1], which works great.  But
I was just curious to know if there was a bioperl module to do
something similar.  A quick google didn't turn up anything, so I
thought I'd ask.

[1] http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html

Regards,
- Robert


From David.Messina at sbc.su.se  Wed Apr  2 08:41:45 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 2 Apr 2008 14:41:45 +0200
Subject: [Bioperl-l] How to make a network graphic using my genes in
	pathways?
In-Reply-To: <264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
Message-ID: <628aabb70804020541v6cee4584ibd9935290ae7cc0a@mail.gmail.com>

I have no personal experience with it, but a colleague of mine suggested
VisANT <http://visant.bu.edu/>.


Dave


From cjfields at uiuc.edu  Wed Apr  2 11:03:32 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 2 Apr 2008 10:03:32 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] quick update on bioperl nightly
	builds
In-Reply-To: <1207139302.6507.7.camel@frissell>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
	<D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
	<1207139302.6507.7.camel@frissell>
Message-ID: <3B490712-3413-4662-99D7-7B115CECB6E1@uiuc.edu>

The changes I made were related to problems checking MySQL for  
Bio::DB::SeqFeature::Store tests when connectivity requires username/ 
password.  For some reason it tests DB connectivity up front, while  
Bio::DB::GFF performs no direct DB check and simply runs its tests  
assuming the setup is correct.

You can view the diffs for your commits here:

http://code.open-bio.org/svnweb/index.cgi/bioperl/diff/bioperl-live/trunk/ModuleBuildBioperl.pm?revs=14604&revs=14548

http://code.open-bio.org/svnweb/index.cgi/bioperl/diff/bioperl-live/trunk/Build.PL?revs=14604&revs=14565

I'll try working on merging them together today; it shouldn't be too  
hard (the changes were fairly minor in both Build.PL and  
Module::Build).  I'll test to make sure your changes stay in as well.   
Down the road I believe we need to rethink how we want the Build  
process to run using Module::Build as it's a bit convoluted, but it  
works for now.

chris

On Apr 2, 2008, at 7:28 AM, Scott Cain wrote:
> Hi Chris,
>
> (trimmed out gbrowse mailing list since this is just bioperl business)
>
> Speaking of the pass through stuff, Sendu mentioned that I stomped on
> some changes to Build.PL that you and he did when I committed that
> change, so it should be rolled back.  Is there a good (svn) way to do
> that?  Or should I just copy the contents of the old (good) Build.PL
> into a fresh file in my checkout and commit it?
>
> Thanks,
> Scott
>
> On Tue, 2008-04-01 at 22:59 -0500, Chris Fields wrote:
>> Nope, that isn't intended.  I fixed it and reran it manually, so it
>> should be fine now (note I didn't update the log file; the next cron
>> run will catch that).
>>
>> I may toy around with your recent passthrough flag addition to try
>> getting automated PPM's up and running.
>>
>> chris
>>
>> On Apr 1, 2008, at 10:12 PM, Scott Cain wrote:
>>
>>> Hi Chris,
>>>
>>> The tarball is currently (Apr 1) being built in a tmp directory, so
>>> that
>>> the extracted tarball is ./tmp/bioperl-live/.  Is that intended?
>>>
>>> Thanks,
>>> Scott
>>>
>>> On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
>>>> I'm simplifying the nightly build archive names (removing svn
>>>> revision
>>>> # and date) in case anyone needs to update bioperl-live/run/db/
>>>> network
>>>> on a regular basis (read: GBrowse installations).  When I have time
>>>> I'll start working on automated builds, which will require some  
>>>> extra
>>>> work with Module::Build and Build.PL.
>>>>
>>>> chris
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>> -- 
>>> ------------------------------------------------------------------------
>>> Scott Cain, Ph. D.                                         cain at cshl.edu
>>> GMOD Coordinator (http://www.gmod.org/)
>>> 216-392-3087
>>> Cold Spring Harbor Laboratory
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>> _______________________________________________
>> Gmod-gbrowse mailing list
>> Gmod-gbrowse at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/)                      
> 216-392-3087
> Cold Spring Harbor Laboratory
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Wed Apr  2 11:54:05 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 2 Apr 2008 10:54:05 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] quick update on bioperl nightly
	builds
In-Reply-To: <3B490712-3413-4662-99D7-7B115CECB6E1@uiuc.edu>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
	<D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
	<1207139302.6507.7.camel@frissell>
	<3B490712-3413-4662-99D7-7B115CECB6E1@uiuc.edu>
Message-ID: <71375DA3-A751-4908-8000-D9ACAE39B19C@uiuc.edu>

Okay, committed them.  The accept passthrough still appears to work;  
let me know if anything pops up.

chris

On Apr 2, 2008, at 10:03 AM, Chris Fields wrote:

> ...
> I'll try working on merging them together today; it shouldn't be too  
> hard (the changes were fairly minor in both Build.PL and  
> Module::Build).  I'll test to make sure your changes stay in as  
> well.  Down the road I believe we need to rethink how we want the  
> Build process to run using Module::Build as it's a bit convoluted,  
> but it works for now.
>
> chris
>
> On Apr 2, 2008, at 7:28 AM, Scott Cain wrote:
>> Hi Chris,
>>
>> (trimmed out gbrowse mailing list since this is just bioperl  
>> business)
>>
>> Speaking of the pass through stuff, Sendu mentioned that I stomped on
>> some changes to Build.PL that you and he did when I committed that
>> change, so it should be rolled back.  Is there a good (svn) way to do
>> that?  Or should I just copy the contents of the old (good) Build.PL
>> into a fresh file in my checkout and commit it?
>>
>> Thanks,
>> Scott
>>
>> On Tue, 2008-04-01 at 22:59 -0500, Chris Fields wrote:
>>> Nope, that isn't intended.  I fixed it and reran it manually, so it
>>> should be fine now (note I didn't update the log file; the next cron
>>> run will catch that).
>>>
>>> I may toy around with your recent passthrough flag addition to try
>>> getting automated PPM's up and running.
>>>
>>> chris
>>>
>>> On Apr 1, 2008, at 10:12 PM, Scott Cain wrote:
>>>
>>>> Hi Chris,
>>>>
>>>> The tarball is currently (Apr 1) being built in a tmp directory, so
>>>> that
>>>> the extracted tarball is ./tmp/bioperl-live/.  Is that intended?
>>>>
>>>> Thanks,
>>>> Scott
>>>>
>>>> On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
>>>>> I'm simplifying the nightly build archive names (removing svn
>>>>> revision
>>>>> # and date) in case anyone needs to update bioperl-live/run/db/
>>>>> network
>>>>> on a regular basis (read: GBrowse installations).  When I have  
>>>>> time
>>>>> I'll start working on automated builds, which will require some  
>>>>> extra
>>>>> work with Module::Build and Build.PL.
>>>>>
>>>>> chris
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>> -- 
>>>> ------------------------------------------------------------------------
>>>> Scott Cain, Ph. D.                                         cain at cshl.edu
>>>> GMOD Coordinator (http://www.gmod.org/)
>>>> 216-392-3087
>>>> Cold Spring Harbor Laboratory
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>>> _______________________________________________
>>> Gmod-gbrowse mailing list
>>> Gmod-gbrowse at lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
>> -- 
>> ------------------------------------------------------------------------
>> Scott Cain, Ph. D.                                   cain.cshl at gmail.com
>> GMOD Coordinator (http://www.gmod.org/)                      
>> 216-392-3087
>> Cold Spring Harbor Laboratory
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From zhpan99 at yahoo.com  Wed Apr  2 13:52:46 2008
From: zhpan99 at yahoo.com (Pan Zheng)
Date: Wed, 2 Apr 2008 10:52:46 -0700 (PDT)
Subject: [Bioperl-l] installing bioperl-1.5.2 errors:DB_File
Message-ID: <726978.82400.qm@web53105.mail.re2.yahoo.com>

Hi,
   
  I am installing bioperl-1.5.2_102 under cygwin on my Windows XP and having some errors during the process.
   
  When I was running "perl Build test", one major error is the error about DB_File. I tried to install DB_File from cpan and rpm without any luck.
   
  ++++++++++++++++++++++++
  CPAN: File::Temp loaded ok (v0.16)
CPAN: YAML loaded ok (v0.62)
    CPAN.pm: Going to build P/PM/PMQS/DB_File-1.817.tar.gz
  Parsing config.in...
Looks Good.
Checking if your kit is complete...
Looks good
Note (probably harmless): No library found for -ldb
Writing Makefile for DB_File
cp DB_File.pm blib/lib/DB_File.pm
AutoSplitting blib/lib/DB_File.pm (blib/lib/auto/DB_File)
gcc -c  -I/usr/local/BerkeleyDB/include -DPERL_USE_SAFE_PUTENV -fno-strict-aliasing
 -pipe -Wdeclaration-after-statement -DUSEIMPORTLIB -O3   -DVERSION=\"1.817\"
 -DXS_VERSION=\"1.817\"  "-I/usr/lib/perl5/5.8/cygwin/CORE"  -D_NOT_CORE
 -DmDB_Prefix_t=size_t -DmDB_Hash_t=u_int32_t   version.c
version.c:30:16: db.h: No such file or directory
make: *** [version.o] Error 1
  PMQS/DB_File-1.817.tar.gz
  /usr/bin/make -- NOT OK
Running make test
  Can't test without successful make
Running make install
  Make had returned bad status, install seems impossible
Failed during this command:
 PMQS/DB_File-1.817.tar.gz                    : make NO
  +++++++++++++++++++++++++++++++++++++++++++++++
   
   
  I don't remember having this kind of error while installing an earlier version.
   
  Could you please help me with the DB_File installation?
   
  Thanks.
   
  Pan

       


From dr.hogart at gmail.com  Thu Apr  3 09:01:03 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Thu, 03 Apr 2008 17:01:03 +0400
Subject: [Bioperl-l] support of clustalw2 in bio::run::tool::alignment
Message-ID: <op.t81c31ljavnppr@hogart.img.ras.ru>

As far as I understand, clustalw2 is not supported in bioperl v1.5.2.100.  
In which version will support be added?
Thank you in advance.


From slduncan at iastate.edu  Thu Apr  3 14:13:16 2008
From: slduncan at iastate.edu (slduncan at iastate.edu)
Date: Thu, 3 Apr 2008 13:13:16 -0500 (CDT)
Subject: [Bioperl-l] help installing bioperl with cygwin
Message-ID: <161313331084931@webmail.iastate.edu>

I am trying to use cpan to install bioperl and I got an error message saying:
c:\Documents not recognized as an external or internal....
Any ideas?  Also, I am new to the computer world so please be kind. :)

Stacy Duncan
Iowa State University
Bioinformatics and Computational Biology
1802 University Blvd.
VMRI Building 6
Ames, IA 50011-1240
office phone: (515) 294-8385
office fax: (515) 294-1401
home phone: (336) 965-5622
e-mail: slduncan at iastate.edu


From cjfields at uiuc.edu  Fri Apr  4 16:13:23 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 4 Apr 2008 15:13:23 -0500
Subject: [Bioperl-l] help installing bioperl with cygwin
In-Reply-To: <161313331084931@webmail.iastate.edu>
References: <161313331084931@webmail.iastate.edu>
Message-ID: <B7F7923E-4226-4B83-BDC1-8548F0FDB6CC@uiuc.edu>

It's best if you use ActiveState's Perl installation (it's the only  
one we really support at this moment, unless someone wants to give  
StrawberryPerl a run).  See:

http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows

chris

On Apr 3, 2008, at 1:13 PM, slduncan at iastate.edu wrote:

> I am trying to use cpan to install bioperl and I got an error  
> message saying:
> c:\Documents not recognized as an external or internal....
> Any ideas?  Also, I am new to the computer world so please be  
> kind. :)
>
> Stacy Duncan
> Iowa State University
> Bioinformatics and Computational Biology
> 1802 University Blvd.
> VMRI Building 6
> Ames, IA 50011-1240
> office phone: (515) 294-8385
> office fax: (515) 294-1401
> home phone: (336) 965-5622
> e-mail: slduncan at iastate.edu
>
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Apr  4 16:07:12 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 4 Apr 2008 15:07:12 -0500
Subject: [Bioperl-l] installing bioperl-1.5.2 errors:DB_File
In-Reply-To: <726978.82400.qm@web53105.mail.re2.yahoo.com>
References: <726978.82400.qm@web53105.mail.re2.yahoo.com>
Message-ID: <F786C444-6A18-4AA5-8AE8-6C0ECEEACC5E@uiuc.edu>

I think you have to use the cygwin installer to install DB_File (it  
also installs dependencies, such as BDB).  According to 'perldoc  
perlcygwin':

....
Optional Libraries for Perl on Cygwin

Several Perl functions and modules depend on the existence of some  
optional libraries. Configure will find them if they are installed in  
one of the directories listed as being used for library searches. Pre- 
built packages for most of these are available from the Cygwin  
installer.
....
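
A minimal sketch of a check for the missing header, assuming a POSIX
shell under Cygwin.  The helper name find_db_h and the directory list
are illustrative only and not part of DB_File; the failing gcc line
shows the build looking in /usr/local/BerkeleyDB/include.

```shell
# Report the first directory that contains the Berkeley DB header db.h.
# find_db_h is a hypothetical helper; pass the include directories to search.
find_db_h() {
    for dir in "$@"; do
        if [ -f "$dir/db.h" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    echo "db.h not found; install the Berkeley DB development package first" >&2
    return 1
}

# Example (paths are assumptions; adjust to your installation):
#   find_db_h /usr/include /usr/local/include /usr/local/BerkeleyDB/include
```

If the header turns up somewhere other than the directory named in the
gcc line, pointing the INCLUDE and LIB settings in DB_File's config.in
at that location before rebuilding should let the compile find it.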

chris
On Apr 2, 2008, at 12:52 PM, Pan Zheng wrote:

> Hi,
>
>  I am installing bioperl-1.5.2_102 under cygwin on my Windows XP and  
> having some errors during the process.
>
>  When I was running "perl Build test", one major error is the error  
> about DB_File. I tried to install DB_File from cpan and rpm without  
> any luck.
>
>  ++++++++++++++++++++++++
>  CPAN: File::Temp loaded ok (v0.16)
> CPAN: YAML loaded ok (v0.62)
>    CPAN.pm: Going to build P/PM/PMQS/DB_File-1.817.tar.gz
>  Parsing config.in...
> Looks Good.
> Checking if your kit is complete...
> Looks good
> Note (probably harmless): No library found for -ldb
> Writing Makefile for DB_File
> cp DB_File.pm blib/lib/DB_File.pm
> AutoSplitting blib/lib/DB_File.pm (blib/lib/auto/DB_File)
> gcc -c  -I/usr/local/BerkeleyDB/include -DPERL_USE_SAFE_PUTENV -fno- 
> strict-alias
> ing -pipe -Wdeclaration-after-statement -DUSEIMPORTLIB -O3   - 
> DVERSION=\"1.817\"
> -DXS_VERSION=\"1.817\"  "-I/usr/lib/perl5/5.8/cygwin/CORE"  - 
> D_NOT_CORE  -DmDB_
> Prefix_t=size_t -DmDB_Hash_t=u_int32_t   version.c
> version.c:30:16: db.h: No such file or directory
> make: *** [version.o] Error 1
>  PMQS/DB_File-1.817.tar.gz
>  /usr/bin/make -- NOT OK
> Running make test
>  Can't test without successful make
> Running make install
>  Make had returned bad status, install seems impossible
> Failed during this command:
> PMQS/DB_File-1.817.tar.gz                    : make NO
>  +++++++++++++++++++++++++++++++++++++++++++++++
>
>
>  I don't remember having this kind of error while installing an  
> earlier version.
>
>  Could you please help me with the DB_File installation?
>
>  Thanks.
>
>  Pan
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Apr  4 17:25:41 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 4 Apr 2008 16:25:41 -0500
Subject: [Bioperl-l] module for pubchem queries
In-Reply-To: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
References: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
Message-ID: <15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>

Do you need something to access eutils via BioPerl, or are you looking  
for a specific set of classes?  I wrote an interface to eutils  
(Bio::DB::EUtilities); you could do something like this:

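#!/usr/bin/perl

use strict;
use warnings;
use Bio::DB::EUtilities;

# Run an Entrez esearch against PubChem Substance.  Use -db => 'pccompound'
# instead to get compound IDs (CIDs) rather than substance IDs (SIDs).
my $eutil = Bio::DB::EUtilities->new(-eutil  => 'esearch',
                                     -term   => 'dihydroorotate',
                                     -db     => 'pcsubstance',
                                     -retmax => 1000);

# Print the matching IDs as a comma-separated list.
print join(',', $eutil->get_ids), "\n";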

chris

On Apr 2, 2008, at 7:24 AM, Robert Citek wrote:

> Hello all,
>
> I have a list of chemical compounds that have some kind of interaction
> with proteins or genes.  The current list contains names or SMILES and
> I would like to get the CID number for those compounds.  Currently,
> I'm using perl to query the NCBI's eutils[1], which works great.  But
> I was just curious to know if there was a bioperl module to do
> something similar.  A quick google didn't turn up anything, so I
> thought I'd ask.
>
> [1] http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
>
> Regards,
> - Robert

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From ekeen at mail.tongji.edu.cn  Mon Apr  7 02:57:04 2008
From: ekeen at mail.tongji.edu.cn (Jinyan Huang)
Date: Mon, 7 Apr 2008 14:57:04 +0800
Subject: [Bioperl-l] How to analysis the relationship of my interesting KEGG
	pathways?
Message-ID: <fb5dae380804062357ka7de019kb3451a5e169c0bf4@mail.gmail.com>

In my research I have found 25 pathways of interest, and I would like
to know the regulatory relationships among them. Is there any software
that can connect these KEGG pathways?

Thank you very much in advance.


From miguel.pignatelli at uv.es  Mon Apr  7 06:12:58 2008
From: miguel.pignatelli at uv.es (Miguel Pignatelli)
Date: Mon, 07 Apr 2008 12:12:58 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
Message-ID: <47F9F3AA.2090003@uv.es>

Hi all,

Is there any way to obtain the date of creation of individual GenBank 
entries? I don't mean the "last revision" date that can be found in the 
first line of a GenBank file.

I can access this creation date by looking at the "revision history" of 
any GenBank entry (for example, see
http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105), 
but I need a systematic (and local=fast) way to access this information.

Any help would be very appreciated,
Thank you very much in advance,

M;


From Bank.Beszteri at awi.de  Mon Apr  7 07:46:43 2008
From: Bank.Beszteri at awi.de (=?ISO-8859-1?Q?B=E1nk_Beszteri?=)
Date: Mon, 07 Apr 2008 13:46:43 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
References: <47F22B35.1030502@awi.de>
	<CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
Message-ID: <47FA09A3.2070004@awi.de>

Hi Hilmar,

it was important to understand that the inconsistency in taxon names is 
apparently only between the Swissprot entries with "non-standard" names 
and the contents of the taxonomy tables and that it is best to use a 
pre-loaded taxonomy, thanks for that! We have now updated to 
bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to have 
loaded everything OK in ~26 hours (with many of the "The supplied 
lineage does not start near..." warnings, but no other problems). Our 
next test is to try to load trembl (will try to do this in parallel in 
multiple chunks), hope it will work just as nicely!
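
For the chunked TrEMBL load, a minimal sketch of splitting the flat file 
on record boundaries, assuming a POSIX shell with awk. The function name 
split_swiss, the chunk size and the output prefix are all hypothetical; 
the only format fact relied on is that each Swiss-Prot/TrEMBL record ends 
with a line containing just "//".

```shell
# Split a Swiss-Prot/TrEMBL flat file into chunks of at most N records.
# split_swiss is a hypothetical helper name, not part of BioPerl.
split_swiss() {
    infile=$1      # e.g. uniprot_trembl.dat
    per_chunk=$2   # records per chunk (illustrative; tune to taste)
    prefix=$3      # output prefix, e.g. trembl_chunk_
    awk -v n="$per_chunk" -v p="$prefix" '
        BEGIN { chunk = 1; count = 0; out = sprintf("%s%04d.dat", p, chunk) }
        { print > out }                  # copy every line to the current chunk
        $0 == "//" {                     # record terminator reached
            count++
            if (count == n) {            # chunk is full: move on to the next file
                close(out)
                chunk++
                count = 0
                out = sprintf("%s%04d.dat", p, chunk)
            }
        }
    ' "$infile"
}

# Example (file name and chunk size are assumptions):
#   split_swiss uniprot_trembl.dat 500000 trembl_chunk_
```

Each chunk could then be passed to load_seqdatabase.pl with the same 
options as before (adding --safe so that a failing record does not stop 
the run); how well several loaders running at once against the same 
database behave is untested here.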

Thanks for your tips & insights!

Bank

Hilmar Lapp wrote:

>
> On Apr 1, 2008, at 8:31 AM, Bánk Beszteri wrote:
>
>> [...] So next we started to test BioSQL, by trying to load just  
>> Swissprot in a MySQL DB first, like:
>>
>> load_seqdatabase.pl --host mysql.awi.de --dbname biosql2 --dbuser  
>> xyz --dbpass abc --driver mysql --namespace uniprot_sprot --format  
>> swiss uniprot_sprot.dat
>>
>> Here we get an error message
>>
>> ###########################################
>>
>> Loading /biodb/spinkern/uniprot_sprot.dat ...
>> Could not store Q6DAH5:
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: The supplied lineage does not start near 'Erwinia carotovora  
>> subsp. atroseptica' (I was supplied 'Erwinia carotovora subsp. |  
>> Pectobacterium | Enterobacteriaceae | Enterobacteriales |  
>> Gammaproteobacteria | Proteobacteria | Bacteria')
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /biodb/spinkern/bioperl-1.5/ 
>> bioperl-1.5.2_102/Bio/Root/Root.pm:359
>> STACK: Bio::Species::classification /biodb/spinkern/bioperl-1.5/ 
>> bioperl-1.5.2_102/Bio/Species.pm:174
>> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD /biodb/ 
>> spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm: 552
>> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row /biodb/ 
>> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/SpeciesAdaptor.pm:281
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object / 
>> biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
>> BasePersistenceAdaptor.pm:1305
>> STACK:  Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key 
>> /biodb/ spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
>> BasePersistenceAdaptor.pm:973
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key / 
>> biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
>> BasePersistenceAdaptor.pm:852
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/ 
>> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
>> BasePersistenceAdaptor.pm:182
>> STACK: Bio::DB::Persistent::PersistentObject::create /biodb/ 
>> spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm: 244
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/ 
>> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
>> BasePersistenceAdaptor.pm:169
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /biodb/ 
>> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
>> BasePersistenceAdaptor.pm:251
>> STACK: Bio::DB::Persistent::PersistentObject::store /biodb/spinkern/ 
>> bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:271
>> STACK: load_seqdatabase.pl:622
>> -----------------------------------------------------------
>>
>> at load_seqdatabase.pl line 635
>>
>> ############################################
>>
>> or similar, depending on whether we use a pre-loaded ncbi taxonomy  
>> or not
>
>
> I recommend to always use a pre-loaded NCBI taxonomy unless you know  
> there are only a few organisms that are straightforward (for the  
> parser, that is).
>
> >> , and which Swissprot release we are trying to load. It often seems  
> >> to come from something like the subsp. here, or another special  
> >> addition to the species line; but alternative genus names and other  
> >> curious things also appear. It looks like Species.pm tries to  
> >> validate the species name against the lineage info already there in  
> >> the BioSQL DB, and in several cases, it finds inconsistencies.
>
>
> It actually happens upon a successful lookup when the species object  
> is populated from the database.
>
>> [...]
>> The only workaround we have found so far was to comment out line  174 
>> in Species.pm:
>>
>> $self->throw("The supplied lineage does not start near '$name' (I  
>> was supplied '".join(" | ", @vals)."')");
>
>
> That should be OK if you work with a pre-loaded taxonomy. It's sort  
> of a sanity check that should catch a parser having messed up a  
> species. If you use a pre-loaded NCBI taxonomy the results of the  
> species parsing don't matter in all details so long as the NCBI  
> taxonID is parsed out correctly, and then found in the database.
>
> Note that this is actually a warn() in the main trunk version of  
> BioPerl, so you might want to upgrade to that (or change throw() to  
> warn() in your version). You still get the records flagged with that,  
> but it isn't an exception.
>
>>
>> After doing so, load_seqdatabase.pl runs for several hours (until it 
>> eventually crashes; I haven't found out yet why), but proceeds really 
>> slowly.
>
>
> It should certainly *not* crash. Note also that you can supply --safe  
> on the command line, in which case the script will continue with the  
> next record if one fails to load for whatever reason.
>
> You will want to adjust the width constraint of dbxref.accession, for  
> example to 128 chars. This will also be fixed for BioSQL 1.0.1.
> See http://bugzilla.open-bio.org/show_bug.cgi?id=2474
>
>
>> I also found some info on this for Pg and Oracle in the mailing  
>> list, but has anyone some approximate numbers for MySQL, how long  
>> should a first Swissprot load take?
>
>
> Possibly around 20 hours according to Erik Rijkers:
> See http://lists.open-bio.org/pipermail/bioperl-l/2008-March/027427.html
>
> You can use the --logchunks N option to have it print out performance  
> statistics every N records.
>
> Hope this helps,
>
>     -hilmar


From cjfields at uiuc.edu  Mon Apr  7 08:32:45 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 7 Apr 2008 07:32:45 -0500
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47FA09A3.2070004@awi.de>
References: <47F22B35.1030502@awi.de>
	<CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
	<47FA09A3.2070004@awi.de>
Message-ID: <E8A1ED59-830D-473F-8818-1BAC4E0A2FA2@uiuc.edu>

The warnings are something that we still need to resolve, but the only  
fix I can think of likely breaks backward compatibility with older  
bioperl-db installations (i.e. storing the given scientific name  
instead of the binomial name, which is used as a fallback when no  
taxid is found).  There is a full explanation here:

http://bugzilla.open-bio.org/show_bug.cgi?id=2092

Anyway, I think it needs further testing when someone, likely Hilmar  
or I, has time.

chris

On Apr 7, 2008, at 6:46 AM, Bánk Beszteri wrote:

> Hi Hilmar,
>
> it was important to understand that the inconsistency in taxon names  
> is apparently only between the Swissprot entries with "non-standard"  
> names and the contents of the taxonomy tables and that it is best to  
> use a pre-loaded taxonomy, thanks for that! We have now updated to  
> bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to  
> have loaded everything OK in ~26 hours (with many of the "The  
> supplied lineage does not start near..." warnings, but no other  
> problems). Our next test is to try to load trembl (will try to do  
> this in parallel in multiple chunks), hope it will work just as  
> nicely!
>
> Thanks for your tips & insights!
>
> Bank
>
> Hilmar Lapp wrote:
>
>>
>> On Apr 1, 2008, at 8:31 AM, Bánk Beszteri wrote:
>>
>>> [...] So next we started to test BioSQL, by trying to load just   
>>> Swissprot in a MySQL DB first, like:
>>>
>>> load_seqdatabase.pl --host mysql.awi.de --dbname biosql2 --dbuser   
>>> xyz --dbpass abc --driver mysql --namespace uniprot_sprot -- 
>>> format  swiss uniprot_sprot.dat
>>>
>>> Here we get an error message
>>>
>>> ###########################################
>>>
>>> Loading /biodb/spinkern/uniprot_sprot.dat ...
>>> Could not store Q6DAH5:
>>> ------------- EXCEPTION: Bio::Root::Exception -------------
>>> MSG: The supplied lineage does not start near 'Erwinia carotovora subsp. atroseptica' (I was supplied 'Erwinia carotovora subsp. | Pectobacterium | Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | Proteobacteria | Bacteria')
>>> STACK: Error::throw
>>> STACK: Bio::Root::Root::throw /biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Root/Root.pm:359
>>> STACK: Bio::Species::classification /biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Species.pm:174
>>> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:552
>>> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/SpeciesAdaptor.pm:281
>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1305
>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:973
>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:852
>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182
>>> STACK: Bio::DB::Persistent::PersistentObject::create /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:244
>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169
>>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>>> STACK: Bio::DB::Persistent::PersistentObject::store /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:271
>>> STACK: load_seqdatabase.pl:622
>>> -----------------------------------------------------------
>>>
>>> at load_seqdatabase.pl line 635
>>>
>>> ############################################
>>>
>>> or similar, depending on whether we use a pre-loaded ncbi  
>>> taxonomy  or not
>>
>>
>> I recommend always using a pre-loaded NCBI taxonomy unless you
>> know there are only a few organisms and they are straightforward
>> (for the parser, that is).
>>
>>> , and which Swissprot release we are trying to load. It often
>>> seems to come from something like, as here, a subsp. or other
>>> special addition to the species line; but alternative genus names
>>> and other curious things also appear. It looks like Species.pm
>>> tries to validate the species name against the lineage info
>>> already there in the BioSQL DB, and in several cases it finds
>>> inconsistencies.
>>
>>
>> It actually happens upon a successful lookup when the species  
>> object  is populated from the database.
>>
>>> [...]
>>> The only workaround we have found so far was to comment out line   
>>> 174 in Species.pm:
>>>
>>> $self->throw("The supplied lineage does not start near '$name' (I   
>>> was supplied '".join(" | ", @vals)."')");
>>
>>
>> That should be OK if you work with a pre-loaded taxonomy. It's  
>> sort  of a sanity check that should catch a parser having messed up  
>> a  species. If you use a pre-loaded NCBI taxonomy the results of  
>> the  species parsing don't matter in all details so long as the  
>> NCBI  taxonID is parsed out correctly, and then found in the  
>> database.
>>
>> Note that this is actually a warn() in the main trunk version of
>> BioPerl, so you might want to upgrade to that (or change throw()
>> to warn() in your version). You still get the records flagged with
>> that message, but it isn't an exception.
>>
>>>
>>> After doing so, load_seqdatabase.pl runs for several hours (until
>>> it eventually crashes; I haven't found out yet why), but proceeds
>>> really slowly.
>>
>>
>> It should certainly *not* crash. Note also that you can supply
>> --safe on the command line, in which case the script will continue
>> with the next record if one fails to load for whatever reason.
>>
>> You will want to adjust the width constraint of dbxref.accession,  
>> for  example to 128 chars. This will also be fixed for BioSQL 1.0.1.
>> See http://bugzilla.open-bio.org/show_bug.cgi?id=2474
>>
>>
>>> I also found some info on this for Pg and Oracle in the mailing   
>>> list, but has anyone some approximate numbers for MySQL, how long   
>>> should a first Swissprot load take?
>>
>>
>> Possibly around 20 hours according to Erik Rijkers:
>> See http://lists.open-bio.org/pipermail/bioperl-l/2008-March/027427.html
>>
>> You can use the --logchunks N option to have it print out  
>> performance  statistics every N records.
>>
>> Hope this helps,
>>
>>    -hilmar
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Mon Apr  7 08:34:00 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 07 Apr 2008 13:34:00 +0100
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47FA09A3.2070004@awi.de>
References: <47F22B35.1030502@awi.de>	<CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
	<47FA09A3.2070004@awi.de>
Message-ID: <47FA14B8.7000500@sendu.me.uk>

Bánk Beszteri wrote:
> Hi Hilmar,
> 
> it was important to understand that the inconsistency in taxon names is 
> apparently only between the Swissprot entries with "non-standard" names 
> and the contents of the taxonomy tables and that it is best to use a 
> pre-loaded taxonomy, thanks for that! We have now updated to 
> bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to have 
> loaded everything OK in ~26 hours (with many of the "The supplied 
> lineage does not start near..." warnings, but no other problems).

Can you provide some examples of these warnings (of the taxons that 
cause them)? If there's anything consistent about them perhaps 
Bio::Species can be improved to accommodate them properly (instead of 
just issuing the warning and getting the classification wrong).
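
One way to collect such examples from a load log is sketched below. It is
only a minimal sketch: the regular expression assumes the exact wording of
the warning quoted earlier in this thread, and the sample text is stand-in
data rather than a real log file.

```perl
use strict;
use warnings;

# Pull (species name, supplied lineage) pairs out of load_seqdatabase.pl
# output. The message wording is assumed to match the warning quoted in
# this thread; $sample below is hypothetical stand-in data for a log file.
sub lineage_warnings {
    my ($log) = @_;
    $log =~ s/\s+/ /g;    # undo line wrapping inside messages
    my @found;
    while ($log =~ /does not start near '(.*?)' \(I was supplied '(.*?)'\)/g) {
        my ($name, $lineage) = ($1, $2);
        push @found, { name => $name, lineage => [ split / \| /, $lineage ] };
    }
    return @found;
}

my $sample = <<'END';
MSG: The supplied lineage does not start near 'Erwinia carotovora
subsp. atroseptica' (I was supplied 'Erwinia carotovora subsp. |
Pectobacterium | Enterobacteriaceae | Enterobacteriales |
Gammaproteobacteria | Proteobacteria | Bacteria')
END

# Print each species name next to the first element of the lineage it got.
for my $w (lineage_warnings($sample)) {
    printf "%s => %s\n", $w->{name}, $w->{lineage}[0];
}
```

Listing the name next to the first lineage element makes it easy to see
whether the mismatches share a pattern (for instance a truncated "subsp.").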


From heikki at sanbi.ac.za  Mon Apr  7 08:48:34 2008
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Mon, 7 Apr 2008 14:48:34 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47F9F3AA.2090003@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
Message-ID: <200804071448.34769.heikki@sanbi.ac.za>

Miguel,

You probably know this but:

- Your entry example below is a GenPept entry, not a GenBank entry
- The NCBI sequence format "genbank" has only the last modified date.
   I do not know about other formats (ASN.1, ...)
- NCBI Entrez is a great tool but it obscures the source database.
- If you really are working on real GenBank entries, you can use the accession 
number to find the corresponding EMBL (and Swiss-Prot) flat file formats that 
have both creation and last modified dates.

Post to the list if you have trouble getting the dates from EMBL/Swiss-Prot 
formats using bioperl.

Yours,

	-Heikki

On Monday 07 April 2008 12:12:58 Miguel Pignatelli wrote:
> Hi all,
>
> Is there any way to obtain the date of creation of individual GenBank
> entries? I don't mean the "last revision" date that can be found in the
> first line of a GenBank file.
>
> I can access this creation date by looking at the "revision history" of
> any GenBank entry (for example, see
> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
> but I need a systematic (and local=fast) way to access this information.
>
> Any help would be very appreciated,
> Thank you very much in advance,
>
> M;


-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________


From granjeau at tagc.univ-mrs.fr  Mon Apr  7 09:30:10 2008
From: granjeau at tagc.univ-mrs.fr (Samuel GRANJEAUD - IR/ICIM)
Date: Mon, 07 Apr 2008 15:30:10 +0200
Subject: [Bioperl-l] help installing bioperl with cygwin
In-Reply-To: <B7F7923E-4226-4B83-BDC1-8548F0FDB6CC@uiuc.edu>
References: <161313331084931@webmail.iastate.edu>
	<B7F7923E-4226-4B83-BDC1-8548F0FDB6CC@uiuc.edu>
Message-ID: <47FA21E2.3010602@tagc.univ-mrs.fr>

Hi,

I'm using BioPerl under Cygwin, because Cygwin allows one to work in a 
Unix-like environment from a command-line point of view.

So, I use the CVS version which runs out of the box
http://www.bioperl.org/wiki/Using_CVS
which has been replaced by SVN at the beginning of the year
http://www.bioperl.org/wiki/Using_Subversion

So if you really want to work under Cygwin, you can try this quick and 
dirty way, but you still have to become experienced because BioPerl is 
not supported under Cygwin.

You may try Strawberry, but in my experience in installing wxPerl, 
wxPerl fails on both flavours of Perl. ActiveState's Perl is still the 
easiest way to install many packages.

Regards,
Samuel


Chris Fields wrote:
> It's best if you use ActiveState's Perl installation (it's the only 
> one we really support at this moment, unless someone wants to give 
> StrawberryPerl a run).  See:
>
> http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows
>
> chris
>
> On Apr 3, 2008, at 1:13 PM, slduncan at iastate.edu wrote:
>
>> I am trying to use cpan to install bioperl and I had an error message 
>> saying:
>> c:\Documents not recognized as and external or internal....
>> Any ideas here.  Also, I am new to the computer world so please be 
>> kind. :)
>>
>> Stacy Duncan
>> Iowa State University
>> Bioinformatics and Computational Biology
>> 1802 University Blvd.
>> VMRI Building 6
>> Ames, IA 50011-1240
>> office phone: (515) 294-8385
>> office fax: (515) 294-1401
>> home phone: (336) 965-5622
>> e-mail: slduncan at iastate.edu
>>
>>
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>

-- 

Samuel GRANJEAUD                   granjeau at tagc.univ-mrs.fr
INSERM - ICIM - TAGC               Tel: +33  (0)491 82 87 24
http://tagc.univ-mrs.fr            Fax: +33  (0)491 82 87 01
http://icim.marseille.inserm.fr/proteomique


From er at xs4all.nl  Mon Apr  7 10:36:57 2008
From: er at xs4all.nl (Erik)
Date: Mon, 7 Apr 2008 16:36:57 +0200 (CEST)
Subject: [Bioperl-l] Indexing large databases / BioSQL
Message-ID: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>

On Mon, April 7, 2008 14:34, Sendu Bala wrote:
> Bánk Beszteri wrote:
>> Hi Hilmar,
>>
>> it was important to understand that the inconsistency in taxon names is
>> apparently only between the Swissprot entries with "non-standard" names
>> and the contents of the taxonomy tables and that it is best to use a
>> pre-loaded taxonomy, thanks for that! We have now updated to
>> bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to have
>> loaded everything OK in ~26 hours (with many of the "The supplied
>> lineage does not start near..." warnings, but no other problems).
>
> Can you provide some examples of these warnings (of the taxons that
> cause them)? If there's anything consistent about them perhaps
> Bio::Species can be improved to accommodate them properly (instead of
> just issuing the warning and getting the classification wrong).
>

I did this a little while ago and saved the output
(UniProtKB/Swiss-Prot Release 55.1 of 18-Mar-2008, I think).

All warnings (and a few errors) for swissprot are here:

   http://bugzilla.open-bio.org/show_bug.cgi?id=2474

as an attached file

I suppose the OP will have encountered similar output - I don't think there is
much RDBMS-type-dependency involved.

   regards,

   Erik Rijkers


From cjfields at uiuc.edu  Mon Apr  7 11:46:01 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 7 Apr 2008 10:46:01 -0500
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <200804071448.34769.heikki@sanbi.ac.za>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es> <200804071448.34769.heikki@sanbi.ac.za>
Message-ID: <2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>

Strangely enough, if you use NCBI's esummary you can get both dates.   
Via Bio::DB::EUtilities in bioperl-live, if you dump out DocSum data  
(using a debugging method I added in a while back):

---------------------------------------

use strict;
use warnings;
use Bio::DB::EUtilities;

# for multiple IDs use an array ref; also only use GIs (not accessions)
my $factory = Bio::DB::EUtilities->new(
                         -eutil => 'esummary',
                         -db    => 'protein',
                         -id    => 1621261);

# debugging method: dumps every DocSum as tag/value pairs
$factory->print_DocSums;

---------------------------------------

One gets the following tag/value pairs:

UID: 1621261
Caption             :CAB02640
Title               :PROBABLE PYRIMIDINE OPERON REGULATORY PROTEIN PYRR [Mycobacterium tuberculosis H37Rv]
Extra               :gi|1621261|emb|CAB02640.1|[1621261]
Gi                  :1621261
CreateDate          :2003/11/21
UpdateDate          :2006/11/14
Flags               :
TaxId               :83332
Length              :193
Status              :live
ReplacedBy          :
Comment             :

I'll add in a method to grab the data element by tag (in this case,  
grab the creation date by asking for the 'CreateDate' key).  Might  
come in handy for scripts.
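
Until such a method exists, the printed dump itself can be post-processed.
The following is a minimal sketch under that assumption: it only parses
"Tag :value" text like the output above (the helper name parse_docsum_text
is hypothetical, and this is not the planned Bio::DB::EUtilities method).
Wrapped continuation lines, such as a long Title, are simply skipped.

```perl
use strict;
use warnings;

# Turn "Tag   :value" lines, as in the dump above, into a hash.
# parse_docsum_text is a hypothetical helper that works on the printed
# text only; it does not call Bio::DB::EUtilities.
sub parse_docsum_text {
    my ($text) = @_;
    my %field;
    for my $line (split /\n/, $text) {
        # a tag made of word characters, optional padding, a colon, a value
        next unless $line =~ /^(\w+)\s*:\s*(.*?)\s*$/;
        $field{$1} = $2;
    }
    return \%field;
}

# Part of the dump shown above, used here as sample input.
my $dump = <<'END';
UID: 1621261
Caption             :CAB02640
Gi                  :1621261
CreateDate          :2003/11/21
UpdateDate          :2006/11/14
TaxId               :83332
Length              :193
Status              :live
END

my $doc = parse_docsum_text($dump);
print "Created: $doc->{CreateDate}\n";
```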

chris



From miguel.pignatelli at uv.es  Mon Apr  7 12:24:50 2008
From: miguel.pignatelli at uv.es (Miguel Pignatelli)
Date: Mon, 07 Apr 2008 18:24:50 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es> <200804071448.34769.heikki@sanbi.ac.za>
	<2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>
Message-ID: <47FA4AD2.5030206@uv.es>


I've noticed that the ASN.1 version of those records has a 
"creation-date" tag.
But this is somewhat strange: the creation date you obtained and the one 
in the ASN.1 record are both 2003/11/21, whereas the revision history of 
the record:

http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=CAB02640

reports a creation date of "Oct 19 1996 12:28 AM"

I don't know how to get this, because the EMBL version of this gene:

http://www.ebi.ac.uk/cgi-bin/dbfetch?db=emblcds&id=CAB02640&style=raw

doesn't have any DT fields at all.

M;




From cjfields at uiuc.edu  Mon Apr  7 13:48:45 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 7 Apr 2008 12:48:45 -0500
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47FA4AD2.5030206@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es> <200804071448.34769.heikki@sanbi.ac.za>
	<2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>
	<47FA4AD2.5030206@uv.es>
Message-ID: <CA410982-12F9-4289-8B54-87BE33A38085@uiuc.edu>

Note in the example I gave that, during the revision history, the  
DBSOURCE changed at the point of the creation date (the original nuc.  
record was a M. tuberculosis contig sequence, which later changed to  
an updated full M. tuberculosis genome record at the time of the  
'create date').

Couldn't find anything specific in the GenBank docs on this, but it  
appears (at least for a protein record) that the creation date is the  
date on which the sequence was either originally deposited or first  
derived from the nucleotide source record currently listed in the  
record.  In other words, it may not reflect the original date of  
deposition (which could have come from a different record, as in this  
case).

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Bank.Beszteri at awi.de  Tue Apr  8 03:35:43 2008
From: Bank.Beszteri at awi.de (=?ISO-8859-1?Q?B=E1nk_Beszteri?=)
Date: Tue, 08 Apr 2008 09:35:43 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
Message-ID: <47FB204F.90405@awi.de>


>>Can you provide some examples of these warnings (of the taxons that
>>cause them)? If there's anything consistent about them perhaps
>>Bio::Species can be improved to accommodate them properly (instead of
>>just issuing the warning and getting the classification wrong).
>>    
>>
>
>All warnings (and a few errors) for swissprot are here:
>
>   http://bugzilla.open-bio.org/show_bug.cgi?id=2474
>
>as an attached file
>
>I suppose the OP will have encountered similar output - I don't think there is
>much RDBMS-type-dependency involved.
>  
>
Hi Erik & Sendu,

yes, the same kind of thing, probably no DBMS-type dependency; in case 
it could be useful, I uploaded my output as a second attachment to the 
bugzilla report cited above.

Bank


From heikki at sanbi.ac.za  Tue Apr  8 04:32:12 2008
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Tue, 8 Apr 2008 10:32:12 +0200
Subject: [Bioperl-l] Blast database sequence retrieval perl script
In-Reply-To: <6BEABCD5CA640A44A848448A42A03B73079E48C9@ilrikeadx1.ILRI.CGIARAD.ORG>
References: <6BEABCD5CA640A44A848448A42A03B73079E48C9@ilrikeadx1.ILRI.CGIARAD.ORG>
Message-ID: <200804081032.12312.heikki@sanbi.ac.za>


Dear Nelson,

I am cc:ing the bioperl mailing list, where all queries of this kind should 
go. More people can help you that way.


Since you have your own local data set, you need to create an index that 
catalogues your sequences for easy retrieval.

You need to install bioperl-live first. See for example:
	http://www.bioperl.org/wiki/Using_Subversion

Then you can follow this HOWTO:
	http://www.bioperl.org/wiki/HOWTO:Flat_databases

The other HOWTOs will help you deal with the BioPerl sequence objects that are 
retrieved: http://www.bioperl.org/wiki/HOWTOs. 


Yours,

	-Heikki


On Monday 07 April 2008 14:50:23 Ndegwa, Nelson (IITA-Nairobi) wrote:
> Dear Prof. Heikki,
>
> Hi. We met at the Pathogen Bioinformatics Conference held in Nairobi
> Kenya in May 2007 at ICIPE. I recall you are a developer of Bioperl and
> Perl. I have managed to install a local Blast, having just cowpea Contig
> sequences, about 50,000 in total. This runs fine, as I can perform
> various queries and get results. However, any good match/hit on the
> local Blast database is hard to retrieve and the only option seems to go
> back to that database and search manually for the top hit sequence - an
> exceedingly manual task. Might you perhaps be having a Perl script I
> could adopt to my database to help with this task Such that the hits
> have a hyperlink which can be used to retrieve that specific entry? I
> have limited knowledge of Perl. Thank you.
>
> With Kind Regards,
>
> Nelson.


-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________


From David.Messina at sbc.su.se  Tue Apr  8 07:29:12 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 8 Apr 2008 13:29:12 +0200
Subject: [Bioperl-l] How to analysis the relationship of my interesting KEGG
	pathways?
In-Reply-To: <628aabb70804080053g1fd9120ex9d5fd12f65f216f9@mail.gmail.com>
References: <fb5dae380804062357ka7de019kb3451a5e169c0bf4@mail.gmail.com>
	<628aabb70804080053g1fd9120ex9d5fd12f65f216f9@mail.gmail.com>
Message-ID: <628aabb70804080429k2aa17a6eu12197709d4cc1af0@mail.gmail.com>

Hi Jinyan,

You asked a similar question last week and received a couple of suggestions
-- did you take a look at those?

I'm not an expert on this topic, but I believe that since regulatory
information is much harder to obtain experimentally and therefore much less
well known, there isn't a lot of it in pathway databases like KEGG. You may
have to look through the literature and start trying to put together
possible regulatory links on your own.

Dave


From hrh at sanger.ac.uk  Tue Apr  8 08:48:32 2008
From: hrh at sanger.ac.uk (Hans Rudolf Hotz)
Date: Tue, 8 Apr 2008 13:48:32 +0100 (BST)
Subject: [Bioperl-l] Blast database sequence retrieval perl script
In-Reply-To: <200804081032.12312.heikki@sanbi.ac.za>
References: <6BEABCD5CA640A44A848448A42A03B73079E48C9@ilrikeadx1.ILRI.CGIARAD.ORG>
	<200804081032.12312.heikki@sanbi.ac.za>
Message-ID: <Pine.LNX.4.64.0804081340180.7147@deskpro50.dynamic.sanger.ac.uk>

Nelson

or simply use the BLAST indices for the sequence retrieval as well.

All you need to do is add the "-o" option to the 'formatdb' command when 
creating the BLAST index (this will create some extra files). Then you can 
use 'fastacmd' (which is also part of the NCBI BLAST package) to retrieve 
the sequences.
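
As a rough sketch of those two steps driven from Perl: the file name and
the contig identifier below are hypothetical, and the script only prints
the command lines unless the system() calls are uncommented. Depending on
how the FASTA headers are written, the identifier passed to fastacmd may
need a prefix such as "lcl|".

```perl
use strict;
use warnings;

# Hypothetical file and sequence names, for illustration only.
my $fasta  = 'cowpea_contigs.fasta';
my $hit_id = 'Contig123';

# formatdb: -i input FASTA, -p F for nucleotide sequences, -o T to parse
# the sequence IDs and build the extra index files that fastacmd needs.
my @format_cmd = ('formatdb', '-i', $fasta, '-p', 'F', '-o', 'T');

# fastacmd: -d database name, -s identifier of the sequence to fetch.
sub fetch_cmd {
    my ($db, $id) = @_;
    return ('fastacmd', '-d', $db, '-s', $id);
}

print join(' ', @format_cmd), "\n";
print join(' ', fetch_cmd($fasta, $hit_id)), "\n";

# To actually run them (requires the NCBI BLAST tools on the PATH):
# system(@format_cmd) == 0 or die "formatdb failed: $?";
# system(fetch_cmd($fasta, $hit_id)) == 0 or die "fastacmd failed: $?";
```

Passing the commands to system() as lists avoids shell quoting problems
with unusual file names.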


Hans



-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


From robert.citek at gmail.com  Tue Apr  8 10:09:27 2008
From: robert.citek at gmail.com (Robert Citek)
Date: Tue, 8 Apr 2008 09:09:27 -0500
Subject: [Bioperl-l] module for pubchem queries
In-Reply-To: <15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>
References: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
	<15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>
Message-ID: <4145b6790804080709l20f1e56erf4b7af04b0a52870@mail.gmail.com>

Wrapping bioperl around eutils will work just fine.  Thanks for the pointer.

http://search.cpan.org/~sendu/bioperl-1.5.2_102/Bio/DB/EUtilities.pm

Regards,
- Robert

On Fri, Apr 4, 2008 at 4:25 PM, Chris Fields <cjfields at uiuc.edu> wrote:
> Do you need something to access eutils via BioPerl, or are you looking for a
> specific set of classes?  I wrote an interface to eutils
> (Bio::DB::EUtilities), you could do something like this:
>
>  #!/usr/bin/perl -w
>
>  use strict;
>  use warnings;
>  use Bio::DB::EUtilities;
>
>  my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
>                                      -term => 'dihydroorotate',
>                                      -db => 'pcsubstance',
>                                      -retmax => 1000);
>
>  print join(',',$eutil->get_ids)."\n";
>
>  chris


From cjfields at uiuc.edu  Tue Apr  8 11:10:26 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 8 Apr 2008 10:10:26 -0500
Subject: [Bioperl-l] module for pubchem queries
In-Reply-To: <4145b6790804080709l20f1e56erf4b7af04b0a52870@mail.gmail.com>
References: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
	<15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>
	<4145b6790804080709l20f1e56erf4b7af04b0a52870@mail.gmail.com>
Message-ID: <32D210FC-575E-4D95-95DA-FC6F5BE1FC24@uiuc.edu>

Just to note, the API has changed significantly from the interface  
in the 1.5.2 release.  The up-to-date (supported) interface is in  
subversion; there are some example recipes here:

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

I'm working on a full HOWTO, just haven't had time to get it up on the  
wiki yet.

chris

On Apr 8, 2008, at 9:09 AM, Robert Citek wrote:

> Wrapping bioperl around eutils will work just fine.  Thanks for the  
> pointer.
>
> http://search.cpan.org/~sendu/bioperl-1.5.2_102/Bio/DB/EUtilities.pm
>
> Regards,
> - Robert
>
> On Fri, Apr 4, 2008 at 4:25 PM, Chris Fields <cjfields at uiuc.edu>  
> wrote:
>> Do you need something to access eutils via BioPerl, or are you  
>> looking for a
>> specific set of classes?  I wrote an interface to eutils
>> (Bio::DB::EUtilities), you could do something like this:
>>
>> #!/usr/bin/perl -w
>>
>> use strict;
>> use warnings;
>> use Bio::DB::EUtilities;
>>
>> my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
>>                                     -term => 'dihydroorotate',
>>                                     -db => 'pcsubstance',
>>                                     -retmax => 1000);
>>
>> print join(',',$eutil->get_ids)."\n";
>>
>> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cuiw at ncbi.nlm.nih.gov  Tue Apr  8 16:41:58 2008
From: cuiw at ncbi.nlm.nih.gov (Cui, Wenwu (NIH/NLM/NCBI) [C])
Date: Tue, 8 Apr 2008 16:41:58 -0400
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47F9F3AA.2090003@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com><264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
Message-ID: <6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>

Hi, Miguel:

id1_fetch can do it. Detailed instructions can be found at:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id1_fetch.html

Here is an example:

>id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
GI        Loaded      DB    Retrieval No.
--        ------      --    -------------
74311105  12/07/2007  NCBI  19766263
74311105  01/23/2007  NCBI  16325656
74311105  03/30/2006  NCBI  13131204
74311105  03/03/2006  NCBI  12915541
74311105  03/02/2006  NCBI  12885275
74311105  12/03/2005  NCBI  12259793
74311105  09/09/2005  NCBI  11257262
74311105  09/09/2005  NCBI  11242667
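
If the listing above is captured as text, something like the following Perl sketch could pull out the earliest "Loaded" date per GI. It assumes the dates are MM/DD/YYYY (as the rows above suggest) and that the earliest revision can stand in for the creation date; the function name is my own, not part of any NCBI tool.

```perl
use strict;
use warnings;

# Return the earliest "Loaded" date (as YYYY-MM-DD) for each GI found in
# the flat revision listing. Header and separator lines are skipped.
sub earliest_loaded {
    my ($text) = @_;
    my %first;
    for my $line (split /\n/, $text) {
        # Expect: GI  MM/DD/YYYY  DB  retrieval-number
        next unless $line =~ m{^\s*(\d+)\s+(\d\d)/(\d\d)/(\d{4})\s+\S+\s+\d+};
        my ($gi, $iso) = ($1, "$4-$2-$3");
        $first{$gi} = $iso if !exists $first{$gi} || $iso lt $first{$gi};
    }
    return \%first;
}

# A few rows copied from the listing above.
my $listing = <<'END';
GI        Loaded      DB    Retrieval No.
--        ------      --    -------------
74311105  12/07/2007  NCBI  19766263
74311105  03/02/2006  NCBI  12885275
74311105  09/09/2005  NCBI  11242667
END

my $first = earliest_loaded($listing);
print "$_\t$first->{$_}\n" for sort keys %$first;
```

Converting to YYYY-MM-DD first means a plain string comparison is enough to find the oldest entry.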

Wenwu Cui PhD
NCBI/NLM/NIH

> -----Original Message-----
> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
> Sent: Monday, April 07, 2008 6:13 AM
> Cc: bioperl-l at bioperl.org
> Subject: [Bioperl-l] GenBank entries creation dates
> 
> Hi all,
> 
> Is there any way to obtain the date of creation of individual GenBank
> entries? I don't mean the "last revision" date that can be found in
the
> first line of a GenBank file.
> 
> I can access this creation date by looking at the "revision history"
of
> any GenBank entry (for example, see
> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
> but I need a systematic (and local=fast) way to access this
> information.
> 
> Any help would be very appreciated,
> Thank you very much in advance,
> 
> M;
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From miguel.pignatelli at uv.es  Wed Apr  9 07:32:39 2008
From: miguel.pignatelli at uv.es (Miguel Pignatelli)
Date: Wed, 09 Apr 2008 13:32:39 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com><264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
	<6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>
Message-ID: <47FCA957.5040409@uv.es>

Wow, impressive. Thanks, Wenwu, for the information; I had never used 
this tool before. The problem is that I need the full revision 
history (or at least the creation date) for *all* the GIs present in nr 
(or at least a significant portion of them), and this tool queries 
over the web.

The existence of this tool confirms to me that this information is 
available somewhere. Is it possible to download the data that contains 
this information?

Thanks again,

M;


Cui, Wenwu (NIH/NLM/NCBI) [C] wrote:
> Hi, Miguel:
> 
> id1_fetch can do it. Detailed instruction can be found at:  
> 
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id
> 1_fetch.html
> 
> Here is an example:
> 
>> id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
> GI        Loaded      DB    Retrieval No.
> --        ------      --    -------------
> 74311105  12/07/2007  NCBI  19766263
> 74311105  01/23/2007  NCBI  16325656
> 74311105  03/30/2006  NCBI  13131204
> 74311105  03/03/2006  NCBI  12915541
> 74311105  03/02/2006  NCBI  12885275
> 74311105  12/03/2005  NCBI  12259793
> 74311105  09/09/2005  NCBI  11257262
> 74311105  09/09/2005  NCBI  11242667
> 
> Wenwu Cui PhD
> NCBI/NLM/NIH
> 
>> -----Original Message-----
>> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
>> Sent: Monday, April 07, 2008 6:13 AM
>> Cc: bioperl-l at bioperl.org
>> Subject: [Bioperl-l] GenBank entries creation dates
>>
>> Hi all,
>>
>> Is there any way to obtain the date of creation of individual GenBank
>> entries? I don't mean the "last revision" date that can be found in
> the
>> first line of a GenBank file.
>>
>> I can access this creation date by looking at the "revision history"
> of
>> any GenBank entry (for example, see
>> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
>> but I need a systematic (and local=fast) way to access this
>> information.
>>
>> Any help would be very appreciated,
>> Thank you very much in advance,
>>
>> M;
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 


From cuiw at ncbi.nlm.nih.gov  Wed Apr  9 09:25:16 2008
From: cuiw at ncbi.nlm.nih.gov (Cui, Wenwu (NIH/NLM/NCBI) [C])
Date: Wed, 9 Apr 2008 09:25:16 -0400
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47FCA957.5040409@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com><264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
	<6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>
	<47FCA957.5040409@uv.es>
Message-ID: <6F230E9769AA8D4EB4BC401DF133EDB7180BE1@NIHCESMLBX15.nih.gov>

Hi, Miguel,

I do not know whether the data file is publicly available. However,
you can perform a 'real time' query via id1_fetch:

####step 1: generate GI file #####
id1_fetch -query 'YOUR-GENBANK-QUERY-STRING' -lt none -db Nucleotide -out qfile

####step 2: retrieve revisions for GIs stored in qfile #####

id1_fetch -lt revisions -qf qfile  -fmt fasta -db Nucleotide

Good luck!

Wenwu Cui

> -----Original Message-----
> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
> Sent: Wednesday, April 09, 2008 7:33 AM
> To: Cui, Wenwu (NIH/NLM/NCBI) [C]
> Cc: bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] GenBank entries creation dates
> 
> Wow, impressive, thanks Wenwu for the information, I have never used
> this tool before. The problem is that I need to know all the revision
> history (or at least the creation date) for *all* the GIs present in
nr
> (well, or at least a significant portion of it) and this tool queries
> via web.
> 
> The existence of this tool confirms me that this information is
> available somewhere, is it possible to download the data that contains
> this information?
> 
> Thanks again,
> 
> M;
> 
> 
> Cui, Wenwu (NIH/NLM/NCBI) [C] wrote:
> > Hi, Miguel:
> >
> > id1_fetch can do it. Detailed instruction can be found at:
> >
> >
>
http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.i
> d
> > 1_fetch.html
> >
> > Here is an example:
> >
> >> id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
> > GI        Loaded      DB    Retrieval No.
> > --        ------      --    -------------
> > 74311105  12/07/2007  NCBI  19766263
> > 74311105  01/23/2007  NCBI  16325656
> > 74311105  03/30/2006  NCBI  13131204
> > 74311105  03/03/2006  NCBI  12915541
> > 74311105  03/02/2006  NCBI  12885275
> > 74311105  12/03/2005  NCBI  12259793
> > 74311105  09/09/2005  NCBI  11257262
> > 74311105  09/09/2005  NCBI  11242667
> >
> > Wenwu Cui PhD
> > NCBI/NLM/NIH
> >
> >> -----Original Message-----
> >> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
> >> Sent: Monday, April 07, 2008 6:13 AM
> >> Cc: bioperl-l at bioperl.org
> >> Subject: [Bioperl-l] GenBank entries creation dates
> >>
> >> Hi all,
> >>
> >> Is there any way to obtain the date of creation of individual
> GenBank
> >> entries? I don't mean the "last revision" date that can be found in
> > the
> >> first line of a GenBank file.
> >>
> >> I can access this creation date by looking at the "revision
history"
> > of
> >> any GenBank entry (for example, see
> >>
> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
> >> but I need a systematic (and local=fast) way to access this
> >> information.
> >>
> >> Any help would be very appreciated,
> >> Thank you very much in advance,
> >>
> >> M;
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> >


From CALLEY_JOHN_N at LILLY.COM  Wed Apr  9 09:45:23 2008
From: CALLEY_JOHN_N at LILLY.COM (John N Calley)
Date: Wed, 9 Apr 2008 09:45:23 -0400
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47FCA957.5040409@uv.es>
Message-ID: <OF73E5AA49.8E1EF918-ON85257426.004AF961-85257426.004B915C@EliLilly.lilly.com>

You might want to keep in mind that the creation date is not always 
reliable. I am aware of one example where the recorded creation date 
precedes the sequencing date by several months (as determined by the trace 
file date). NCBI was not able to explain exactly what happened but (as I 
recall) hypothesized that some dates had been scrambled in a database 
rebuild. If there was interest I could probably pull up more details.

John Calley


Miguel Pignatelli <miguel.pignatelli at uv.es> 
Sent by: bioperl-l-bounces at lists.open-bio.org
04/09/2008 07:32 AM
Please respond to
miguel.pignatelli at uv.es


To
"Cui, Wenwu (NIH/NLM/NCBI) [C]" <cuiw at ncbi.nlm.nih.gov>
cc
bioperl-l at bioperl.org
Subject
Re: [Bioperl-l] GenBank entries creation dates


Wow, impressive, thanks Wenwu for the information, I have never used 
this tool before. The problem is that I need to know all the revision 
history (or at least the creation date) for *all* the GIs present in nr 
(well, or at least a significant portion of it) and this tool queries 
via web.

The existence of this tool confirms me that this information is 
available somewhere, is it possible to download the data that contains 
this information?

Thanks again,

M;


Cui, Wenwu (NIH/NLM/NCBI) [C] wrote:
> Hi, Miguel:
> 
> id1_fetch can do it. Detailed instruction can be found at: 
> 
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id
> 1_fetch.html
> 
> Here is an example:
> 
>> id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
> GI        Loaded      DB    Retrieval No.
> --        ------      --    -------------
> 74311105  12/07/2007  NCBI  19766263
> 74311105  01/23/2007  NCBI  16325656
> 74311105  03/30/2006  NCBI  13131204
> 74311105  03/03/2006  NCBI  12915541
> 74311105  03/02/2006  NCBI  12885275
> 74311105  12/03/2005  NCBI  12259793
> 74311105  09/09/2005  NCBI  11257262
> 74311105  09/09/2005  NCBI  11242667
> 
> Wenwu Cui PhD
> NCBI/NLM/NIH
> 
>> -----Original Message-----
>> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
>> Sent: Monday, April 07, 2008 6:13 AM
>> Cc: bioperl-l at bioperl.org
>> Subject: [Bioperl-l] GenBank entries creation dates
>>
>> Hi all,
>>
>> Is there any way to obtain the date of creation of individual GenBank
>> entries? I don't mean the "last revision" date that can be found in
> the
>> first line of a GenBank file.
>>
>> I can access this creation date by looking at the "revision history"
> of
>> any GenBank entry (for example, see
>> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
>> but I need a systematic (and local=fast) way to access this
>> information.
>>
>> Any help would be very appreciated,
>> Thank you very much in advance,
>>
>> M;
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 


From frederic.romagne at gmail.com  Wed Apr  9 16:45:50 2008
From: frederic.romagne at gmail.com (=?ISO-8859-1?Q?Fr=E9d=E9ric_Romagn=E9?=)
Date: Wed, 09 Apr 2008 15:45:50 -0500
Subject: [Bioperl-l] question about clustalw module.
Message-ID: <1207773950.483.13.camel@kiss-laptop>

Hello,

I have a problem when using Bio::Tools::Run::Alignment::Clustalw:

I give it an array reference (the array contains some FASTA sequences)
and all the right parameters, and I write the result via Bio::SeqIO.

The problem is that my result file contains only the accession number in
the header. An example:

the initial stream is : 

>NM_052854 Homo sapiens cAMP responsive element binding protein 3-like 1 (CREB3L1), mRNA.
AGAAGACGTGCGGAGGGAGACGCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGG
GGGAGCACTTAGCTCCCCCGCCCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTC
AGCCCCAACCCCGGGCTCCCCATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGT
GGAGTCGGCTGAATGCCCACGGTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCG
CTGCCCTAAGGCCCCCGCGCGCCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCG
CCCCTCCCCCGGGGCTTCGCCCCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAG
GAGCTCTGGACTGGGCGCGCCGCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCC
CGGGAGCCGGCTGCGATGGACGCCGTCTTGGAACCCTTCCCGGCCGACAGGCTGTTCCCC
GGATCCAGCTTCCTGGACTTGGGGGATCTGAACGAGTCGGACTTCCTCAACAATGCGCAC

...

the result file is :

>NM_052854
---------------------------------------AGAAGACGTGCGGAGGGAGAC
GCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGGGGGAGCACTTAGCTCCCCCGC
CCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTCAGCCCCAACCCCGGGCTCCCC
ATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGTGGAGTCGGCTGAATGCCCACG
GTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCGCTGCCCTAAGGCCCCCGCGCG
CCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCGCCCCTCCCCCGGGGCTTCGCC
CCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAGGAGCTCTGGACTGGGCGCGCC
GCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCCCGGGAGCCGGCTGCGATGGAC

...

So I lost the other information provided by the header...

Is there any option to keep this information?

Here is a part of my code with my options:


 my $seq_ref=\@seq;
 my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM', 'quiet' => 1,
		'output' => 'FASTA');
 my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
 my $aln = $factory->align($seq_ref);


Thank you.


From jason at bioperl.org  Wed Apr  9 16:55:13 2008
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 9 Apr 2008 13:55:13 -0700
Subject: [Bioperl-l] question about clustalw module.
In-Reply-To: <1207773950.483.13.camel@kiss-laptop>
References: <1207773950.483.13.camel@kiss-laptop>
Message-ID: <C126E560-1A36-461E-ADAD-774446B9DB9E@bioperl.org>

The Clustal alignment format does not allow for the description. If  
you want to preserve it, you'll have to add it back: make a hash  
indexed by sequence ID and store the description, then when you get  
your alignment back you can update the description field before  
writing it out with AlignIO.
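
To illustrate the hash idea without depending on any module, here is a plain-Perl sketch that works on FASTA text directly. The helper names are mine and the example description is shortened; with BioPerl objects the same bookkeeping would presumably go through display_id and desc on each sequence.

```perl
use strict;
use warnings;

# Remember the description of every FASTA record, keyed by sequence ID.
sub collect_descriptions {
    my ($fasta) = @_;
    my %desc;
    for my $line (split /\n/, $fasta) {
        # Header: ">ID optional description"
        if ($line =~ /^>(\S+)\s*(.*)$/) {
            $desc{$1} = $2;
        }
    }
    return \%desc;
}

# Put the stored description back on headers that carry only the ID.
sub restore_descriptions {
    my ($aligned, $desc) = @_;
    my @out;
    for my $line (split /\n/, $aligned) {
        if ($line =~ /^>(\S+)\s*$/ && defined $desc->{$1} && length $desc->{$1}) {
            $line = ">$1 $desc->{$1}";
        }
        push @out, $line;
    }
    return join("\n", @out) . "\n";
}

# Shortened, illustrative input and aligned output.
my $input   = ">NM_052854 Homo sapiens CREB3L1, mRNA.\nAGAAGACG\n";
my $aligned = ">NM_052854\n---AGAAGACG\n";

print restore_descriptions($aligned, collect_descriptions($input));
```

Keying on the ID works as long as the IDs are unique and the aligner does not rename or truncate them.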

-jason
On Apr 9, 2008, at 1:45 PM, Frédéric Romagné wrote:

> Hello,
>
> i have a problem when using Bio::Tools::Run::Alignment::Clustalw :
>
> I give it an array_ref scalar (the array contains some fasta  
> sequences)
> and all the good parameters and i write the result via  Bio::SeqIO.
>
> The fact is that my result file only contains the Accession number in
> the header... An example :
>
> the initial stream is :
>
>> NM_052854 Homo sapiens cAMP responsive element binding protein 3- 
>> like 1
> (CREB3L1), mRNA.
> AGAAGACGTGCGGAGGGAGACGCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGG
> GGGAGCACTTAGCTCCCCCGCCCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTC
> AGCCCCAACCCCGGGCTCCCCATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGT
> GGAGTCGGCTGAATGCCCACGGTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCG
> CTGCCCTAAGGCCCCCGCGCGCCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCG
> CCCCTCCCCCGGGGCTTCGCCCCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAG
> GAGCTCTGGACTGGGCGCGCCGCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCC
> CGGGAGCCGGCTGCGATGGACGCCGTCTTGGAACCCTTCCCGGCCGACAGGCTGTTCCCC
> GGATCCAGCTTCCTGGACTTGGGGGATCTGAACGAGTCGGACTTCCTCAACAATGCGCAC
>
> ...
>
> the result file is :
>
>> NM_052854
> ---------------------------------------AGAAGACGTGCGGAGGGAGAC
> GCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGGGGGAGCACTTAGCTCCCCCGC
> CCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTCAGCCCCAACCCCGGGCTCCCC
> ATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGTGGAGTCGGCTGAATGCCCACG
> GTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCGCTGCCCTAAGGCCCCCGCGCG
> CCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCGCCCCTCCCCCGGGGCTTCGCC
> CCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAGGAGCTCTGGACTGGGCGCGCC
> GCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCCCGGGAGCCGGCTGCGATGGAC
>
> ...
>
> So i lost the other informations provided by the header...
>
> Is there any option to keep these informations?
>
> Here is a part of my code with my options :
>
>
>  my $seq_ref=\@seq;
>  my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM', 'quiet' => 1,
> 		'output' => 'FASTA');
>  my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
>  my $aln = $factory->align($seq_ref);
>
>
> Thank you.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From lamq at usal.es  Thu Apr 10 11:52:24 2008
From: lamq at usal.es (Luis A. M. Quintales)
Date: Thu, 10 Apr 2008 17:52:24 +0200
Subject: [Bioperl-l] xyplot glyph problem with previous aggregation
Message-ID: <47FE37B8.9090404@usal.es>

I am not able to add xyplot glyphs to one panel because I have some
problems with the aggregations.

Using this GFF file:

##sequence-region chr1 1 5578650
chr1  atfreq  atpc    1  50   58.8000   .  .  atpc 1
chr1  atfreq  atpc   51 100   58.4000   .  .  atpc 1
chr1  atfreq  atpc  101 150   57.6000   .  .  atpc 1
chr1  atfreq  atpc  151 200   57.8000   .  .  atpc 1
. . .


And this source code for preparing the aggregated features necessary for
the xyplot glyph:

my $filin  = $ARGV[0];
my $db = Bio::DB::GFF->new( -dsn => $filin,
                            -adaptor => 'memory',
                            -aggregator => 'at{atpc:atfreq}'
                           );
my $segment  = $db->segment('chr1');
my @features1 = $db->features('atpc');
print "$#features1 \n";
my @features2 = $segment->features('atpc');
print "$#features2 \n";
my @features3 = $db->features('at');
print "$#features3 \n";
my @features4 = $segment->features('at');
print "$#features4 \n";

I obtain:

111572
111572
0
0

What am I doing wrong with the aggregator?

Many thanks.


From lincoln.stein at gmail.com  Thu Apr 10 13:55:06 2008
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 10 Apr 2008 13:55:06 -0400
Subject: [Bioperl-l] xyplot glyph problem with previous aggregation
In-Reply-To: <47FE37B8.9090404@usal.es>
References: <47FE37B8.9090404@usal.es>
Message-ID: <6dce9a0b0804101055w65e22abfgaa4f155751fef40f@mail.gmail.com>

Hi Luis,

When you aggregate the atpc 1 features together, you end up with one
feature. Thus @features3 is an array of size 1. The $# operator returns the
index of the last element, which is 0. If @features3 were empty, $#features3
would return -1.
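
A tiny standalone illustration of the difference between the last index and the element count (the array contents are placeholders, not real feature objects):

```perl
use strict;
use warnings;

my @one   = ('aggregated_feature');   # one element, like @features3
my @empty = ();

# $#array is the index of the last element; scalar(@array) is the count.
print "last index: $#one, count: ", scalar(@one), "\n";
print "last index: $#empty, count: ", scalar(@empty), "\n";
```

Printing scalar(@features3) instead of $#features3 would show the number of features directly.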

Lincoln

On Thu, Apr 10, 2008 at 11:52 AM, Luis A. M. Quintales <lamq at usal.es> wrote:

> I am not able to add xyplot glyphs to one panel because I have some
> problems with the aggregations.
>
> Using that GFF file:
>
> ##sequence-region chr1 1 5578650
> chr1  atfreq  atpc    1  50   58.8000   .  .  atpc 1
> chr1  atfreq  atpc   51 100   58.4000   .  .  atpc 1
> chr1  atfreq  atpc  101 150   57.6000   .  .  atpc 1
> chr1  atfreq  atpc  151 200   57.8000   .  .  atpc 1
> . . .
>
>
> And this source code for preparing the aggregated features necessary for
> the xyplot glyph:
>
> my $filin  = $ARGV[0];
> my $db = Bio::DB::GFF->new( -dsn => $filin,
>                           -adaptor => 'memory',
>                           -aggregator => 'at{atpc:atfreq}'
>                          );
> my $segment  = $db->segment('chr1');
> my @features1 = $db->features('atpc');
> print "$#features1 \n";
> my @features2 = $segment->features('atpc');
> print "$#features2 \n";
> my @features3 = $db->features('at');
> print "$#features3 \n";
> my @features4 = $segment->features('at');
> print "$#features4 \n";
>
> I obtain:
>
> 111572
> 111572
> 0
> 0
>
> What I am doing wrong with the aggregator?
>
> Many thanks.
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From adsj at novozymes.com  Fri Apr 11 04:53:23 2008
From: adsj at novozymes.com (Adam =?iso-8859-1?Q?Sj=F8gren?=)
Date: Fri, 11 Apr 2008 10:53:23 +0200
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
Message-ID: <87d4owixh8.fsf@topper.koldfront.dk>

  Hi.

I am trying to make Bio::SeqIO return objects of my own type (a small
extension of Bio::Seq::RichSeq), by setting -seqfactory. I am having a
little trouble creating the correct object to pass with -seqfactory:

Following the example given in SYNOPSIS of Bio::Factory::SequenceFactoryI,
I get this error:

 $ perl -e '
 >            use Bio::Seq::SeqFactory;
 >            my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq');
 > 
 >            my $seq = $seqbuilder->create(-seq => 'ACTGAT',
 >                                          -display_id => 'exampleseq');
 > 
 >            print "seq is a ", ref($seq), "\n";
 > '

 ------------- EXCEPTION: Bio::Root::Exception -------------
 MSG: Can't locate type.pm in @INC (@INC contains: /z/bio/biotools/bioinfperlmodules/ /z/bio/adm/modules /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl .) at (eval 13) line 3.
 : Unrecognized Sequence type for SeqFactory 'type'
 STACK: Error::throw
 STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:357
 STACK: Bio::Seq::SeqFactory::type /usr/share/perl/5.8/Bio/Seq/SeqFactory.pm:134
 STACK: Bio::Seq::SeqFactory::new /usr/share/perl/5.8/Bio/Seq/SeqFactory.pm:93
 STACK: -e:3
 -----------------------------------------------------------
 $ 

If I use "Bio::Seq::SeqFactory->new('Bio::PrimarySeq'=>1)" instead, for
instance, it seems to work:

 $ perl -e '
 >            use Bio::Seq::SeqFactory;
 >            my $seqbuilder = Bio::Seq::SeqFactory->new('Bio::PrimarySeq'=>1);
 > 
 >            my $seq = $seqbuilder->create(-seq => 'ACTGAT',
 >                                          -display_id => 'exampleseq');
 > 
 >            print "seq is a ", ref($seq), "\n";
 > '
 seq is a Bio::PrimarySeq
 $ 

I was about to write a patch for the pod, when I realized that I'd
better start by asking: Is this a buglet in the pod or the code?

  Best regards,

    Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From hlapp at gmx.net  Fri Apr 11 11:35:54 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 11 Apr 2008 11:35:54 -0400
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
In-Reply-To: <87d4owixh8.fsf@topper.koldfront.dk>
References: <87d4owixh8.fsf@topper.koldfront.dk>
Message-ID: <0037240B-F469-4388-972A-324101B11621@gmx.net>


On Apr 11, 2008, at 4:53 AM, Adam Sjøgren wrote:
>  $ perl -e '
>>            use Bio::Seq::SeqFactory;
>>            my $seqbuilder = Bio::Seq::SeqFactory->new('type' =>  
>> 'Bio::PrimarySeq');


You need to prefix the argument with a dash: '-type', not 'type'.  
Otherwise, it assumes that the class you want instantiated is 'type.pm'.
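
To make the effect of the dash concrete, here is a simplified stand-in for this kind of argument handling. It is NOT BioPerl's code; get_type is a hypothetical function, and the positional fallback is an assumption that merely matches the "Can't locate type.pm" error and the ('Bio::PrimarySeq'=>1) workaround reported above.

```perl
use strict;
use warnings;

# Simplified, hypothetical stand-in for dash-style named-argument handling;
# this is NOT BioPerl's code. If the first argument starts with a dash, the
# list is read as -name => value pairs; otherwise it is read positionally.
sub get_type {
    my @args = @_;
    return undef unless @args;
    if ($args[0] =~ /^-/) {
        my %named = @args;
        return $named{-type};
    }
    return $args[0];    # positional: the first argument is taken as the type
}

print get_type(-type => 'Bio::PrimarySeq'), "\n";     # named form
print get_type('type' => 'Bio::PrimarySeq'), "\n";    # dash missing
print get_type('Bio::PrimarySeq' => 1), "\n";         # accidental workaround
```

Under that assumption the second call yields the literal string "type" as the class name, which is why the dashed form is the one to use.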

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From 1zoujing at 163.com  Thu Apr 10 01:08:52 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 9 Apr 2008 22:08:52 -0700 (PDT)
Subject: [Bioperl-l]  Bio::ASN1::EntrezGene parse so slowly?
Message-ID: <16602210.post@talk.nabble.com>


  I want to parse the file "gene_info" from NCBI. The format of Gene in NCBI is
ASN1, right? So I used Bio::ASN1::EntrezGene, but it either doesn't work
properly or is too slow. The file is about 500 MB.
  The code is as follows:
  use Bio::ASN1::EntrezGene;
  my $parser = Bio::ASN1::EntrezGene->new('file' => $ARGV[0]);
  my $i = 0;
  while (my $result = $parser->next_seq) {
      last;    # real processing would go here; "last" is only for testing
  }

  When it reaches the "while" part, it keeps processing on and on and never
exits, even though I used "last" inside the loop.
  So I wonder whether it is too slow, whether the module is not suited to this
job, or whether I did something wrong.

  Thank you!
-- 
View this message in context: http://www.nabble.com/Bio%3A%3AASN1%3A%3AEntrezGene-parse-so-slowly--tp16602210p16602210.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From 1zoujing at 163.com  Thu Apr 10 02:17:41 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 9 Apr 2008 23:17:41 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl Sus_scrofa.ags"
Message-ID: <16602770.post@talk.nabble.com>


   I am new to BioPerl. When I ran
"parse_entrez_gene_example.pl Sus_scrofa.ags", it produced this error
message:
     Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
  
   But Sus_scrofa.ags was downloaded from NCBI in ASN1 format, so it
should be the same as Homo_sapiens in the example. There should be no error,
since the code is the example from Mingyi.
   I wonder why this happens. Should I change something about the file?
    


From 1zoujing at 163.com  Thu Apr 10 02:56:52 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 9 Apr 2008 23:56:52 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <16602770.post@talk.nabble.com>
References: <16602770.post@talk.nabble.com>
Message-ID: <16603225.post@talk.nabble.com>


I searched the web and found the answer; quoting it here:
   The error was thrown by my Bio::ASN1::EntrezGene module because it 
expects a text file, while you fed it with a binary file.  To use a 
gzipped binary ASN file from NCBI, download the NCBI gene2xml 
(ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml), 
then use this syntax to run my parser on the binary files: 

# Homo_sapiens.ags.gz is the gzipped binary file downloaded directly from NCBI
my $parser = Bio::ASN1::EntrezGene->new('file' => "gene2xml -i Homo_sapiens.ags.gz -c -x -b | ");

Same syntax should be used when you're using SeqIO (thus SeqIO::entrezgene). 
Mingyi
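
The trailing "|" in the 'file' argument above relies on Perl's ability to
open a command instead of a file and read what the command prints. A minimal
sketch of that idiom follows. It uses the list form of a pipe open, and the
running Perl interpreter ($^X) stands in for gene2xml so that the sketch runs
without NCBI tools installed; the stand-in command and its two output lines
are assumptions made purely for illustration.

```perl
use strict;
use warnings;

# Read a command's standard output line by line through a pipe.
# $^X (the running perl binary) is a stand-in for gene2xml here.
my @cmd = ($^X, '-e', 'print "first line\n", "second line\n"');

open(my $fh, '-|', @cmd) or die "Cannot start command: $!";
my @lines = <$fh>;
close($fh) or die "Command failed: $!";

chomp @lines;
print scalar(@lines), " lines read; first one: $lines[0]\n";
```

A parser that is handed such a handle (or a "command |" file name) never
sees the binary file itself, only the text the converter writes.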
 
      
zoujing wrote:
> 
>    I am a green hand with Bioperl. When I ran
> "parse_entrez_gene_example.pl Sus_scrofa.ags", it produced this error
> message:
>      Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>   
>    But Sus_scrofa.ags was downloaded from NCBI in ASN.1 format, so it
> should be the same as Homo_sapiens in the example. There should be no
> error, since the code is the example from Mingyi.
>    I wonder why this happens. Should I change something about the file? 
>     
> 

-- 
View this message in context: http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16603225.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From 1zoujing at 163.com  Thu Apr 10 03:10:26 2008
From: 1zoujing at 163.com (zoujing)
Date: Thu, 10 Apr 2008 00:10:26 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
Message-ID: <16603225.post@talk.nabble.com>


I searched the web and found the answer; quoting it here:
   The error was thrown by my Bio::ASN1::EntrezGene module because it 
expects a text file, while you fed it with a binary file.  To use a 
gzipped binary ASN file from NCBI, download the NCBI gene2xml 
(ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml), 
then use this syntax to run my parser on the binary files: 

# Homo_sapiens.ags.gz is the gzipped binary file downloaded directly from NCBI
my $parser = Bio::ASN1::EntrezGene->new('file' => "gene2xml -i Homo_sapiens.ags.gz -c -x -b | ");

Same syntax should be used when you're using SeqIO (thus SeqIO::entrezgene). 
Mingyi

   But there is still one thing: I want to parse "gene_info.gz" from NCBI
Gene (ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz).
   It doesn't work. Does that mean "gene_info.gz" (tab-delimited, one line
per GeneID, with the column header as the first line of the file) is not the
right format for Bio::ASN1::EntrezGene?
 
      
zoujing wrote:
> 
>    I am a green hand with Bioperl. When I ran
> "parse_entrez_gene_example.pl Sus_scrofa.ags", it produced this error
> message:
>      Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>   
>    But Sus_scrofa.ags was downloaded from NCBI in ASN.1 format, so it
> should be the same as Homo_sapiens in the example. There should be no
> error, since the code is the example from Mingyi.
>    I wonder why this happens. Should I change something about the file? 
>     
> 

-- 
View this message in context: http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16603225.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From stefan.kirov at bms.com  Fri Apr 11 15:59:29 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Fri, 11 Apr 2008 15:59:29 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <16602770.post@talk.nabble.com>
References: <16602770.post@talk.nabble.com>
Message-ID: <Pine.WNT.4.64.0804111557210.2384@A161887.one.ads.bms.com>

AGS is a binary ASN.1 format and WILL NOT be parsed! You have to use 
gene2xml (weird, but this is NCBI) with these flags: -c -x -b -i. This 
will emit text ASN.1, which can be parsed.
Stefan
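
Since the parser cannot tell a binary or tab-delimited file from text ASN.1
by itself, a quick look at the first bytes before parsing can save time. The
sketch below is only a heuristic under stated assumptions: the
"Entrezgene ::= {" signature for text ASN.1, the function name guess_format
and all three samples are illustrative and not taken from NCBI documentation
or from the module.

```perl
use strict;
use warnings;

# Guess what kind of data a file starts with, from its first few bytes.
# The 'Entrezgene ::= {' signature for text ASN.1 is an assumption.
sub guess_format {
    my ($head) = @_;
    # control bytes other than tab/newline/CR suggest binary data
    return 'binary'        if $head =~ /[\x00-\x08\x0E-\x1F]/;
    return 'asn1-text'     if $head =~ /^\s*Entrezgene(?:-Set)?\s*::=\s*\{/;
    return 'tab-delimited' if $head =~ /\t/;
    return 'unknown';
}

# Made-up samples, one per case.
my %sample = (
    asn1   => "Entrezgene ::= {\n  track-info {\n",
    table  => "#tax_id\tGeneID\tSymbol\n9823\t100001\tGENE_A\n",
    binary => "\x30\x80\x00\x01\x02",
);
print "$_: ", guess_format($sample{$_}), "\n" for sort keys %sample;
```

For a real file the first chunk could be obtained with read($fh, my $head, 512)
on a handle opened in binary mode.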

On Wed, 9 Apr 2008, zoujing wrote:

>
>   I am a green hand with Bioperl. When I ran
> "parse_entrez_gene_example.pl Sus_scrofa.ags", it produced this error
> message:
>     Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>
>   But Sus_scrofa.ags was downloaded from NCBI in ASN.1 format, so it
> should be the same as Homo_sapiens in the example. There should be no error,
> since the code is the example from Mingyi.
>   I wonder why this happens. Should I change something about the file?


From stefan.kirov at bms.com  Fri Apr 11 16:01:30 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Fri, 11 Apr 2008 16:01:30 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <16603225.post@talk.nabble.com>
References: <16603225.post@talk.nabble.com>
Message-ID: <Pine.WNT.4.64.0804111600310.2384@A161887.one.ads.bms.com>

It is not. If you use this file, why would you need a parser for it 
anyway? Just split on \t or read it with OpenOffice or an equivalent.
Stefan
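
A minimal sketch of the "split on \t" approach follows. It reads from an
in-memory sample so that it is self-contained; the three column names and the
two records are made up for illustration and are not the real gene_info
layout.

```perl
use strict;
use warnings;

# Illustrative sample in the spirit of gene_info: a '#' header line followed
# by one tab-separated record per line. Names and values are made up.
my $sample = join "\n",
    "#tax_id\tGeneID\tSymbol",
    "9823\t100001\tGENE_A",
    "9823\t100002\tGENE_B",
    "";

open(my $fh, '<', \$sample) or die "Cannot open in-memory sample: $!";
my %symbol_for;
while (my $line = <$fh>) {
    chomp $line;
    next if $line =~ /^#/ or $line eq '';   # skip header/comment and blank lines
    my ($tax_id, $gene_id, $symbol) = split /\t/, $line;
    $symbol_for{$gene_id} = $symbol;
}
close $fh;

print "$_ => $symbol_for{$_}\n" for sort keys %symbol_for;
```

For the real file, the in-memory handle could be replaced by a pipe from a
decompressor, for example open(my $fh, '-|', 'gzip', '-dc', 'gene_info.gz'),
assuming gzip is on the PATH.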

On Thu, 10 Apr 2008, zoujing wrote:

>
> I searched the web and found the answer; quoting it here:
>   The error was thrown by my Bio::ASN1::EntrezGene module because it
> expects a text file, while you fed it with a binary file.  To use a
> gzipped binary ASN file from NCBI, download the NCBI gene2xml
> (ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml),
> then use this syntax to run my parser on the binary files:
>
> # Homo_sapiens.ags.gz is the gzipped binary file downloaded directly from NCBI
> my $parser = Bio::ASN1::EntrezGene->new('file' => "gene2xml -i Homo_sapiens.ags.gz -c -x -b | ");
>
> Same syntax should be used when you're using SeqIO (thus SeqIO::entrezgene).
> Mingyi
>
>   But there is still one thing: I want to parse "gene_info.gz" from NCBI
> Gene. It doesn't work. Does that mean "gene_info.gz" (tab-delimited, one
> line per GeneID, with the column header as the first line of the file) is
> not the right format for Bio::ASN1::EntrezGene?
>
>
>
> zoujing wrote:
>>
>>    I am a green hand with Bioperl. When I ran
>> "parse_entrez_gene_example.pl Sus_scrofa.ags", it produced this error
>> message:
>>      Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>>
>>    But Sus_scrofa.ags was downloaded from NCBI in ASN.1 format, so it
>> should be the same as Homo_sapiens in the example. There should be no
>> error, since the code is the example from Mingyi.
>>    I wonder why this happens. Should I change something about the file?
>>
>>


From asjo at koldfront.dk  Fri Apr 11 15:39:59 2008
From: asjo at koldfront.dk (Adam Sjøgren)
Date: Fri, 11 Apr 2008 21:39:59 +0200
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
In-Reply-To: <0037240B-F469-4388-972A-324101B11621@gmx.net> (Hilmar Lapp's
	message of "Fri, 11 Apr 2008 11:35:54 -0400")
References: <87d4owixh8.fsf@topper.koldfront.dk>
	<0037240B-F469-4388-972A-324101B11621@gmx.net>
Message-ID: <877if4i3jk.fsf@topper.koldfront.dk>

On Fri, 11 Apr 2008 11:35:54 -0400, Hilmar wrote:

> On Apr 11, 2008, at 4:53 AM, Adam Sjøgren wrote:

>>> my $seqbuilder = Bio::Seq::SeqFactory->new('type' =>
>>> 'Bio::PrimarySeq');

> You need to prefix the argument with a dash: '-type', not 'type'. 
> Otherwise, it assumes that the class you want instantiated is
> 'type.pm'.
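
Why the dash matters can be shown with a toy argument parser. This is a
sketch under assumptions, not BioPerl's actual code: the function parse_args
is hypothetical and only imitates the general convention that arguments are
read as name/value pairs when the first one starts with a dash, and as
positional values otherwise.

```perl
use strict;
use warnings;

# Toy argument parser (NOT BioPerl's code): arguments are treated as
# named only if the first one starts with a dash; otherwise positionally.
sub parse_args {
    my ($order, @args) = @_;          # $order: arrayref of parameter names
    my %param;
    if (@args && defined $args[0] && $args[0] =~ /^-/) {
        my %named = @args;
        for my $key (keys %named) {
            (my $name = lc $key) =~ s/^-//;   # '-type' becomes 'type'
            $param{$name} = $named{$key};
        }
    }
    else {
        @param{@$order} = @args;       # positional: fill names in order
    }
    return \%param;
}

my $good = parse_args(['type'], -type => 'Bio::PrimarySeq');
my $bad  = parse_args(['type'], type  => 'Bio::PrimarySeq');

print "with dash:    type = $good->{type}\n";
print "without dash: type = $bad->{type}\n";
```

Without the dash, the string 'type' itself lands in the first positional
slot, which is consistent with the module then looking for 'type.pm'.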

I guess that means I should submit a patch for the SYNOPSIS. Attached.


   Thanks,

    Adam


Index: Bio/Factory/SequenceFactoryI.pm
===================================================================
--- Bio/Factory/SequenceFactoryI.pm	(revision 14654)
+++ Bio/Factory/SequenceFactoryI.pm	(working copy)
@@ -20,7 +20,7 @@
 # get a Bio::Factory::SequenceFactoryI object like
 
     use Bio::Seq::SeqFactory;
-    my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq');
+    my $seqbuilder = Bio::Seq::SeqFactory->new('-type' => 'Bio::PrimarySeq');
 
     my $seq = $seqbuilder->create(-seq => 'ACTGAT',
 				  -display_id => 'exampleseq');

-- 
 "Well, I'm a moon around you"                                Adam Sjøgren
                                                         asjo at koldfront.dk


From bamboowarrior at gmail.com  Fri Apr 11 19:10:35 2008
From: bamboowarrior at gmail.com (Arkady)
Date: Fri, 11 Apr 2008 18:10:35 -0500
Subject: [Bioperl-l] Nucleotide Links in Gene DB (GenBank)
Message-ID: <91656c3f0804111610r24c8fa5es5bcb56b7a59e0208@mail.gmail.com>

Hi everyone, I'm a bioperl n00b. Actually, kind of a genbank n00b,
too, as I'm from a CS background and just started bio things last
June.

I'm trying to set up an analysis pipeline of primate protein CDSs (the
nucleotide seqs). I've written a script which does a pretty decent job
of downloading these from GenBank--but it's inconsistent, because a
lot of sequences in nucleotide are 'predicted' and named LOCthisorthat
instead of by gene name.

So what I was thinking was this (assume ANKRD43 is the gene for this example):

1. Search 'gene' database for ANKRD43 AND (PRI*[ORGN])
On NCBI, there's an option to show all nucleotide links. How do I get
a list of those in bioperl? Can bioperl even search 'gene', or just
'nucleotide'?

2. Search 'nucleotide' for the referenced items from #1, and also for
ANKRD43[TITL] AND (PRI*[ORGN]), save CDSes.

3. BLAST mRNA for one of those CDSes, see if we pick up any other matches.

4. BLAT other primates for CDSes, see if we find anything not in GenBank.


On the other hand, I always get the feeling I'm doing things the hard
way--especially here, with #1 and #2. Is there a much more obvious,
simple way to do this?
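
One possible route for steps 1 and 2, not confirmed anywhere in this thread,
is NCBI's E-utilities: an esearch request against db=gene, followed by an
elink request from gene to the nucleotide database. The sketch below only
builds the two request URLs and does not contact NCBI. The helper names
uri_escape_simple and eutils_url are hypothetical, and the Gene ID 12345 is a
placeholder for an ID that the search would return.

```perl
use strict;
use warnings;

my $BASE = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils';

# Percent-encode everything except unreserved ASCII characters.
sub uri_escape_simple {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Build an E-utilities URL from a tool name and its query parameters.
sub eutils_url {
    my ($tool, %param) = @_;
    my $query = join '&',
        map { "$_=" . uri_escape_simple($param{$_}) } sort keys %param;
    return "$BASE/$tool.fcgi?$query";
}

# Step 1: search the gene database (query string taken from the post).
my $search = eutils_url('esearch',
    db => 'gene', term => 'ANKRD43 AND (PRI*[ORGN])');

# Step 2: ask for nucleotide records linked to a Gene ID (placeholder ID).
my $links = eutils_url('elink',
    dbfrom => 'gene', db => 'nuccore', id => '12345');

print "$search\n$links\n";
```

The XML that these URLs return would still have to be fetched and parsed,
for example with a BioPerl or LWP-based client.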

Thanks, folks.


Cheers,
John Woods

Institute for Cellular and Molecular Biology
The University of Texas at Austin


From hlapp at gmx.net  Fri Apr 11 19:19:44 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 11 Apr 2008 19:19:44 -0400
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
In-Reply-To: <877if4i3jk.fsf@topper.koldfront.dk>
References: <87d4owixh8.fsf@topper.koldfront.dk>
	<0037240B-F469-4388-972A-324101B11621@gmx.net>
	<877if4i3jk.fsf@topper.koldfront.dk>
Message-ID: <B4B3CAD0-C346-470C-98D7-D6CBFE116109@gmx.net>

Thanks, applied. -hilmar

On Apr 11, 2008, at 3:39 PM, Adam Sjøgren wrote:
> On Fri, 11 Apr 2008 11:35:54 -0400, Hilmar wrote:
>
>> On Apr 11, 2008, at 4:53 AM, Adam Sjøgren wrote:
>
>>>> my $seqbuilder = Bio::Seq::SeqFactory->new('type' =>
>>>> 'Bio::PrimarySeq');
>
>> You need to prefix the argument with a dash: '-type', not 'type'.
>> Otherwise, it assumes that the class you want instantiated is
>> 'type.pm'.
>
> I guess that means I should submit a patch for the SYNOPSIS. Attached.
>
>
>    Thanks,
>
>     Adam
>
>
> Index: Bio/Factory/SequenceFactoryI.pm
> ===================================================================
> --- Bio/Factory/SequenceFactoryI.pm	(revision 14654)
> +++ Bio/Factory/SequenceFactoryI.pm	(working copy)
> @@ -20,7 +20,7 @@
>  # get a Bio::Factory::SequenceFactoryI object like
>
>      use Bio::Seq::SeqFactory;
> -    my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq');
> +    my $seqbuilder = Bio::Seq::SeqFactory->new('-type' => 'Bio::PrimarySeq');
>
>      my $seq = $seqbuilder->create(-seq => 'ACTGAT',
>  				  -display_id => 'exampleseq');
>
> -- 
>  "Well, I'm a moon around you"                                Adam Sjøgren
>                                                          asjo at koldfront.dk

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From mmokrejs at ribosome.natur.cuni.cz  Fri Apr 11 21:32:14 2008
From: mmokrejs at ribosome.natur.cuni.cz (Martin MOKREJŠ)
Date: Sat, 12 Apr 2008 03:32:14 +0200
Subject: [Bioperl-l] [BioSQL-l] Loading sequences with novel NCBI
	taxon_id
In-Reply-To: <CE3675B2-2AFD-46AA-A348-16C9FEA51E0E@uiuc.edu>
References: <320fb6e00803130806w46148bacm54c3ead9a50b038f@mail.gmail.com>	<32EB5B0C-4CC8-4C33-9F41-5D4465B6AC48@gmx.net>	<320fb6e00803131613o20eae2b7y325814ef26d2738f@mail.gmail.com>	<CEA4F4E7-A66B-4C62-AE32-511E177BC485@gmx.net>	<93b45ca50803140648s5098a7d0sec621f448ef03040@mail.gmail.com>
	<CE3675B2-2AFD-46AA-A348-16C9FEA51E0E@uiuc.edu>
Message-ID: <4800111E.3030802@ribosome.natur.cuni.cz>

Chris Fields wrote:
> The counter to that perspective (using new sequences with old tax info) 
> would be to regularly update NCBI taxonomy, particularly in 
> circumstances prior to adding new sequences.  Hilmar mentioned that once 
> tax is loaded it doesn't take as long to update, so you could set up a 
> cron job to update regularly.
> 
> I remember someone mentioning weekly or monthly updates on the list 
> quite a while ago, but I'm unsure how often NCBI updates tax information 
> (i.e. with every release, monthly, weekly, etc).  I can see instances 
> popping up where you used an up-to-date taxonomy but a new sequence 
> contains a tax ID not present.  I think bioperl-db handles these but I'm 
> not sure what other Bio* do.
> 

I spent some time benchmarking this and inspecting the mysql log files.
The current load_ncbi_taxonomy.pl script, with a minor modification to
show timestamps, behaves as follows: first on the initial import into
mysql, then on an update of the database using exactly the same dataset
(it has to walk through all the data anyway):

$ ./load_ncbi_taxonomy.pl --dbname=biosqldb --driver=mysql --host=127.0.01 \
  --port=3306 --directory=/home/mmokrejs/bioinformatics/databases/ncbitax/dump \
  --chunksize=0 --verbose=2 --mycnf=~/.my.cnf
Sat Apr 12 01:58:43 MEST 2008
Loading NCBI taxon database in /home/mmokrejs/bioinformatics/databases/ncbitax/dump:
       ... retrieving all taxon nodes in the database
Sat Apr 12 01:58:43 MEST 2008
       ... reading in taxon nodes from nodes.dmp
Sat Apr 12 01:58:58 MEST 2008
       ... insert / update / delete taxon nodes
                10000/421098 done (in 5 secs, 2000.0 rows/s)
                20000/421098 done (in 4 secs, 2500.0 rows/s)
...
                420000/421098 done (in 4 secs, 2500.0 rows/s)
Sat Apr 12 02:02:21 MEST 2008
       ... (committing nodes)
Sat Apr 12 02:02:21 MEST 2008
       ... rebuilding nested set left/right values
                10000 done (in 24 secs, 416.7 rows/s)
                20000 done (in 26 secs, 384.6 rows/s)
                30000 done (in 24 secs, 416.7 rows/s)
...
                420004 done (in 23 secs, 434.8 rows/s)
Sat Apr 12 02:19:25 MEST 2008
       ... reading in taxon names from names.dmp
Sat Apr 12 02:19:25 MEST 2008
       ... deleting old taxon names
Sat Apr 12 02:19:25 MEST 2008
       ... inserting new taxon names
                10000 done (in 8 secs, 1250.0 rows/s)
                20000 done (in 8 secs, 1250.0 rows/s)
...
                580000 done (in 5 secs, 2000.0 rows/s)
Sat Apr 12 02:24:48 MEST 2008
       ... cleaning up
Sat Apr 12 02:24:49 MEST 2008
Done.
$


I decided to re-import the same data to mimic, at least roughly,
future updates. No record should actually be UPDATEd in this run,
except that the left and right values get zapped with NULL. :((

$ ./load_ncbi_taxonomy.pl --dbname=biosqldb --driver=mysql --host=127.0.01 \
  --port=3306 --directory=/home/mmokrejs/bioinformatics/databases/ncbitax/dump \
  --chunksize=0 --verbose=2 --mycnf=~/.my.cnf
Sat Apr 12 02:35:20 MEST 2008
Loading NCBI taxon database in /home/mmokrejs/bioinformatics/databases/ncbitax/dump:
        ... retrieving all taxon nodes in the database
Sat Apr 12 02:35:26 MEST 2008
       ... reading in taxon nodes from nodes.dmp
Sat Apr 12 02:35:46 MEST 2008
       ... insert / update / delete taxon nodes
                10000/421098 done (in 0 secs, 10000.0 rows/s)
                20000/421098 done (in 0 secs, 10000.0 rows/s)
...
                410000/421098 done (in 0 secs, 10000.0 rows/s)
                420000/421098 done (in 0 secs, 10000.0 rows/s)
Sat Apr 12 02:35:55 MEST 2008
       ... (committing nodes)
Sat Apr 12 02:35:55 MEST 2008
       ... rebuilding nested set left/right values
                10000 done (in 9 secs, 1111.1 rows/s)
                20000 done (in 9 secs, 1111.1 rows/s)
...
                410004 done (in 8 secs, 1250.0 rows/s)
                420004 done (in 9 secs, 1111.1 rows/s)
Sat Apr 12 02:41:54 MEST 2008
       ... reading in taxon names from names.dmp
Sat Apr 12 02:41:54 MEST 2008
       ... deleting old taxon names
Sat Apr 12 02:41:55 MEST 2008
       ... inserting new taxon names
                10000 done (in 5 secs, 2000.0 rows/s)
                20000 done (in 5 secs, 2000.0 rows/s)
...
                570000 done (in 6 secs, 1666.7 rows/s)
                580000 done (in 5 secs, 2000.0 rows/s)
Sat Apr 12 02:47:27 MEST 2008
       ... cleaning up
Sat Apr 12 02:47:27 MEST 2008
Done.
$ ls -la /var/log/mysql/mysql.log 
-rw-rw---- 1 mysql mysql 483443314 Apr 12 03:15 /var/log/mysql/mysql.log
$

Pentium4 M laptop, 1.8GHz, 1 GB RAM, mysql-5.0.56 with SQL text
logging enabled (logging all SQL commands as text, which is slow
compared to binary logging). The log was cleared before the tests.
I could provide some bits from the log or upload it somewhere
if anybody else would like to dig into the details.


I believe the recalculation step could be made faster. See what
happens:

                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '1' ORDER BY ncbi_taxon_id
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '10239' ORDER BY ncbi_taxon_id
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '12333' ORDER BY ncbi_taxon_id
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '12335' ORDER BY ncbi_taxon_id
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE left_value = '4'
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE right_value = '5'
                     31 Query       UPDATE taxon SET left_value = '4', right_value = '5' WHERE taxon_id = '12335'
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '12340' ORDER BY ncbi_taxon_id
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE left_value = '6'
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE right_value = '7'
                     31 Query       UPDATE taxon SET left_value = '6', right_value = '7' WHERE taxon_id = '12340'
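
The left/right numbers themselves can be computed entirely in memory with a
depth-first walk, so that each node would need at most one write afterwards.
The sketch below is not the script's actual code: the %children map is an
illustrative fragment whose ids are borrowed from the log above, the
parent/child relations are assumed from the order of the queries, and
number_subtree is a hypothetical helper.

```perl
use strict;
use warnings;

# parent id => list of child ids; an illustrative four-level fragment
my %children = (
    1     => [10239],
    10239 => [12333],
    12333 => [12340, 12335],
);

my (%left, %right);
my $counter = 0;

# Assign nested-set numbers: left on the way down, right on the way up.
sub number_subtree {
    my ($node) = @_;
    $left{$node} = ++$counter;
    # visit children in ascending id order, like the ORDER BY in the log
    for my $child (sort { $a <=> $b } @{ $children{$node} || [] }) {
        number_subtree($child);
    }
    $right{$node} = ++$counter;
    return;
}

number_subtree(1);
printf "%-6s left=%-2d right=%-2d\n", $_, $left{$_}, $right{$_}
    for sort { $left{$a} <=> $left{$b} } keys %left;
```

On this fragment, node 12335 receives (4, 5) and node 12340 receives (6, 7),
the same values that the UPDATE statements in the log write.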


The left_value and right_value columns are already NULL when
the table is created, so there is no need to write NULL into
them again. Avoiding that would mean writing a wrapper function that
mimics update() but first does a 'SELECT * FROM', compares
the stored values with those to be written, and includes in the
final UPDATE statement only those columns whose values have
changed. We use such a smart wrapper for our code in python.
;-)
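
A sketch of such a wrapper's core, in Perl rather than Python: given the row
as currently stored and the row as wanted, it builds an UPDATE with
placeholders for the changed columns only. The function names same_value and
build_update are hypothetical, undef stands for SQL NULL, and the statement
is only constructed here, not executed; passing it to a DBI handle is left
out.

```perl
use strict;
use warnings;

# Compare two values where undef stands for SQL NULL.
sub same_value {
    my ($x, $y) = @_;
    return 1 if !defined $x && !defined $y;
    return 0 if !defined $x || !defined $y;
    return $x eq $y;
}

# Build an UPDATE touching only the columns whose value differs from the
# row currently stored. Returns an empty list when nothing has to be written.
sub build_update {
    my ($table, $key_col, $key_val, $current, $wanted) = @_;
    my @changed = grep { !same_value($current->{$_}, $wanted->{$_}) }
                  sort keys %$wanted;
    return unless @changed;
    my $sql = "UPDATE $table SET "
            . join(', ', map { "$_ = ?" } @changed)
            . " WHERE $key_col = ?";
    return ($sql, [ @{$wanted}{@changed}, $key_val ]);
}

my %stored = (left_value => undef, right_value => undef, parent_taxon_id => 12333);
my %wanted = (left_value => 4,     right_value => 5,     parent_taxon_id => 12333);

my ($sql, $bind) = build_update('taxon', 'taxon_id', 12335, \%stored, \%wanted);
print "$sql\n";
print "bind values: @$bind\n";

# Nothing changed, so no statement is produced.
my @nothing = build_update('taxon', 'taxon_id', 12335, \%stored, \%stored);
print "statements needed for unchanged row: ", scalar(@nothing), "\n";
```

Whether the extra SELECT pays off depends on how many rows actually change
between taxonomy releases.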

When the left and right columns have to be set to NULL during an
update of an existing database, I think it would be much faster
to drop the columns and re-create them with NULL values.


I think it would be worth investigating whether the taxon and
taxon_name tables could be created empty as MyISAM tables and
converted to InnoDB only after all the imports and updates. One
would probably have to think a bit more about the foreign keys,
but they might not even be lost during the conversion back and
forth.

Actually, this is easy to check. Dump your current taxon and taxon_name
tables (maybe even without the row data, using --no-data), run
'ALTER TABLE taxon ... ENGINE=MyISAM'
followed by
'ALTER TABLE taxon ... ENGINE=InnoDB'
then dump the database structure again and compare it by diff with
the original.

But, time for sleep here.
Martin


From sdavis2 at mail.nih.gov  Fri Apr 11 23:50:44 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 11 Apr 2008 23:50:44 -0400
Subject: [Bioperl-l] Bio::ASN1::EntrezGene parse so slowly?
In-Reply-To: <16602210.post@talk.nabble.com>
References: <16602210.post@talk.nabble.com>
Message-ID: <264855a00804112050gf785c2ei66d9c7463597eccd@mail.gmail.com>

gene_info is a tab-delimited text file, if I recall correctly.  Have
you looked at it?  If it is, you should be able to parse it in a few
seconds with just a couple lines of code.

Sean


On Thu, Apr 10, 2008 at 1:08 AM, zoujing <1zoujing at 163.com> wrote:
>
>   I want to parse a file "gene_info" from NCBI. The format of Gene in NCBI is
>  ASN1, right? So I used Bio::ASN1::EntrezGene. But it didn't work
>  properly/too slow. The file is about 500M.
>   The code is following:
>   use Bio::ASN1::EntrezGene;
>   my $parser = Bio::ASN1::EntrezGene->new('file' => $ARGV[0]);
>   my $i = 0;
>   while(my $result = $parser->next_seq)
>   { last; }  # something to do here; "last" is used only for this test
>
>   When it reaches the "while" part, it keeps processing on and on and never
>  exits, even though I used "last" inside the loop.
>    So I wonder whether it is too slow, whether the module is not suited to
>  this job, or whether I did something wrong.
>
>   Thank you!


From david at burt7259.freeserve.co.uk  Sat Apr 12 13:01:57 2008
From: david at burt7259.freeserve.co.uk (David Burt)
Date: Sat, 12 Apr 2008 18:01:57 +0100
Subject: [Bioperl-l] bioperl-db
Message-ID: <BFCB174E-B59E-4249-BDF8-4B0F2E2273C9@burt7259.freeserve.co.uk>

Hi Hilmar,

Hope you can help - I am using bioperl-db to create a BioSQL database.

I have used the scripts load_seqdatabase.pl and load_ontology.pl to  
install human swissprot entries, gene ontology and sequence ontology, and  
now want to load interpro.

Here's the command line I have tried:

perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser  
root --dbpass chicken --driver mysql \
--namespace "InterPro" --format InterPro interpro.xml

But I get this message

Can't call method "identifier" on an undefined value at /cygdrive/c/Bioinformatics/Ensembl/src/bioperl-live/Bio/Ontology/SimpleOntologyEngine.pm line 395

Any ideas?

Dave

PS: here's the top of the interpro.xml file

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE interprodb SYSTEM "interpro.dtd">


<interprodb>
     <release>
       <dbinfo dbname="PANTHER" version="6.1" entry_count="30128"  
file_date="04-OCT-2006 00:00:00" />
       <dbinfo dbname="PFAM" version="21.0" entry_count="8957"  
file_date="22-NOV-2006 00:00:00" />
       <dbinfo dbname="PIRSF" version="2.70" entry_count="2877"  
file_date="12-JUN-2007 00:00:00" />
       <dbinfo dbname="PRINTS" version="38.0" entry_count="1900"  
file_date="22-SEP-2005 00:00:00" />
       <dbinfo dbname="PRODOM" version="2005.1" entry_count="1522"  
file_date="23-APR-2004 00:00:00" />
       <dbinfo dbname="PROSITE" version="20.0" entry_count="2006"  
file_date="14-NOV-2006 00:00:00" />
       <dbinfo dbname="SMART" version="5.1" entry_count="724"  
file_date="27-JUL-2007 00:00:00" />
       <dbinfo dbname="TIGRFAMs" version="7.0" entry_count="3423"  
file_date="28-SEP-2007 00:00:00" />
       <dbinfo dbname="GENE3D" version="3.0.0" entry_count="2147"  
file_date="11-SEP-2006 00:00:00" />
       <dbinfo dbname="SSF" version="1.69" entry_count="1538"  
file_date="30-NOV-2006 00:00:00" />
       <dbinfo dbname="SWISSPROT" version="55.1" entry_count="359942"  
file_date="18-MAR-2008 00:00:00" />
       <dbinfo dbname="TREMBL" version="38.1" entry_count="5443281"  
file_date="18-MAR-2008 00:00:00" />
       <dbinfo dbname="INTERPRO" version="17.0" entry_count="16175"  
file_date="19-MAR-2008 00:00:00" />
       <dbinfo dbname="GO" version="N/A" entry_count="23937"  
file_date="27-MAR-2007 00:00:00" />
       <dbinfo dbname="MEROPS" version="7.8" entry_count="2831"  
file_date="12-JUL-2007 16:56:17" />
     </release>
   <interpro id="IPR000001" type="Domain" short_name="Kringle"  
protein_count="352">
     <name>Kringle</name>
     <abstract>

  
From hlapp at gmx.net  Sat Apr 12 14:10:44 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 14:10:44 -0400
Subject: [Bioperl-l] personal vs list email
Message-ID: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>

I'm not sure why, but I have received several Bioperl- or BioSQL-related  
email inquiries directed to me *personally* over the past few weeks.

I have been responding as I get to them, but I feel that I am doing  
both the senders and this community a poor service, because sometimes  
someone else on the list could have responded much faster, and when I  
respond, others on the list who happen to be interested in the same  
question don't get to see the answer.

So from now on as a policy I will redirect *every* email sent to me  
personally and that asks a question related to one of the projects to  
the respective mailing list. If you don't want this, please  
conspicuously say so at the top of your email, and in that case if  
you do ask a project-related question be prepared to wait and  
possibly to follow up.

As an aside, it's a pretty safe assumption to make that all other  
core developers, and quite possibly *all* developers are following a  
similar policy, whether expressly or not.

Isn't this somewhere in the FAQ too?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Sat Apr 12 14:16:13 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 14:16:13 -0400
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
Message-ID: <C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>

Hi Burt,

can you try format interprosax instead of interpro? That variant is  
also much more graceful regarding required space.

	-hilmar

On Apr 12, 2008, at 1:01 PM, David Burt wrote:

> Hi Hilmar,
>
> Hope you can help - I am using bioperl-db to create a biosql database
>
> I have used scripts load_seqdatabase.pl and load_ontology.pl to  
> install human swissprot entries, gene ontology, sequence ontology  
> and now want to load interpro
>
> Here's the command line I have tried
>
> perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser  
> root --dbpass chicken --driver mysql \
> --namespace "InterPro" --format InterPro interpro.xml
>
> But I get this message
>
> Can't call method "identifier" on an undefined value at  /cygdrive/ 
> c/Bioinformatics/Ensembl/src/bioperl-live/Bio/Ontology/ 
> SimpleOntologyEngine.pm line 395
>
> Any ideas?
>
> Dave
>
> PS: here's the top of the interpro.xml file
>
> <?xml version="1.0" encoding="ISO-8859-1"?>
> <!DOCTYPE interprodb SYSTEM "interpro.dtd">
>
>
>
> <interprodb>
>     <release>
>       <dbinfo dbname="PANTHER" version="6.1" entry_count="30128"  
> file_date="04-OCT-2006 00:00:00" />
>       <dbinfo dbname="PFAM" version="21.0" entry_count="8957"  
> file_date="22-NOV-2006 00:00:00" />
>       <dbinfo dbname="PIRSF" version="2.70" entry_count="2877"  
> file_date="12-JUN-2007 00:00:00" />
>       <dbinfo dbname="PRINTS" version="38.0" entry_count="1900"  
> file_date="22-SEP-2005 00:00:00" />
>       <dbinfo dbname="PRODOM" version="2005.1" entry_count="1522"  
> file_date="23-APR-2004 00:00:00" />
>       <dbinfo dbname="PROSITE" version="20.0" entry_count="2006"  
> file_date="14-NOV-2006 00:00:00" />
>       <dbinfo dbname="SMART" version="5.1" entry_count="724"  
> file_date="27-JUL-2007 00:00:00" />
>       <dbinfo dbname="TIGRFAMs" version="7.0" entry_count="3423"  
> file_date="28-SEP-2007 00:00:00" />
>       <dbinfo dbname="GENE3D" version="3.0.0" entry_count="2147"  
> file_date="11-SEP-2006 00:00:00" />
>       <dbinfo dbname="SSF" version="1.69" entry_count="1538"  
> file_date="30-NOV-2006 00:00:00" />
>       <dbinfo dbname="SWISSPROT" version="55.1"  
> entry_count="359942" file_date="18-MAR-2008 00:00:00" />
>       <dbinfo dbname="TREMBL" version="38.1" entry_count="5443281"  
> file_date="18-MAR-2008 00:00:00" />
>       <dbinfo dbname="INTERPRO" version="17.0" entry_count="16175"  
> file_date="19-MAR-2008 00:00:00" />
>       <dbinfo dbname="GO" version="N/A" entry_count="23937"  
> file_date="27-MAR-2007 00:00:00" />
>       <dbinfo dbname="MEROPS" version="7.8" entry_count="2831"  
> file_date="12-JUL-2007 16:56:17" />
>     </release>
>   <interpro id="IPR000001" type="Domain" short_name="Kringle"  
> protein_count="352">
>     <name>Kringle</name>
>     <abstract>
>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Apr 12 16:17:43 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 12 Apr 2008 15:17:43 -0500
Subject: [Bioperl-l] [BioSQL-l] personal vs list email
In-Reply-To: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>
References: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>
Message-ID: <E7962E90-8309-4ADA-B002-950793B61D74@uiuc.edu>


On Apr 12, 2008, at 1:10 PM, Hilmar Lapp wrote:

> I'm not sure why but I have received several Bioperl or BioSQL- 
> related email inquiries directed to me *personally* over the past  
> few weeks.
>
> I have been responding as I get to them, but I feel that I am doing  
> both the senders and this community a poor service, because  
> sometimes someone else on the list could have responded much faster,  
> and when I respond, others on the list who happen to be interested  
> in the same question don't get to see the answer.
>
> So from now on as a policy I will redirect *every* email sent to me  
> personally and that asks a question related to one of the projects  
> to the respective mailing list. If you don't want this, please  
> conspicuously say so at the top of your email, and in that case if  
> you do ask a project-related question be prepared to wait and  
> possibly to follow up.
>
> As an aside, it's a pretty safe assumption to make that all other  
> core developers, and quite possibly *all* developers are following a  
> similar policy, whether expressly or not.

I agree; I'm sure several other core devs feel the same way.  I always  
try to forward these to the list if I feel it is more relevant there.

> Isn't this somewhere in the FAQ too?
>
> 	-hilmar

No, but I've added it to the bioperl FAQ; might be worth checking over  
and editing.

chris


From hlapp at gmx.net  Sat Apr 12 18:40:53 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 18:40:53 -0400
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <000001c89ce2$5400a710$0202a8c0@STUDYPC>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce2$5400a710$0202a8c0@STUDYPC>
Message-ID: <3F77F49A-9C9E-4450-AE28-46F00CADBC8B@gmx.net>

Burt - please keep your replies on the list. Others may have input  
too, or benefit from the answer too.

As there is no name() method call on line 914 in the current version  
let's check first that you run a current version of BioPerl. It will  
need to be at least 1.5.2.

However, I do suspect a problem in either the InterPro file itself  
(wouldn't be the first time), or the InterPro parser.

	-hilmar

On Apr 12, 2008, at 5:15 PM, David Burt wrote:

> Hilmar
>
> Many thanks, it seems to be working
>
> But I got this output - any comments/ideas on what it means?
>
> Dave
>
>
> perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser  
> root --dbpass chicken --driver mysql \
> > --namespace "InterPro" --format interprosax interpro.xml
>         ...deleting all relationships for InterPro
>         ...parsing and loading InterPro
> Can't call method "name" on an undefined value at load_ontology.pl  
> line 914.
>
> HERE'S the name and definition in the ontology table
>
> Name = InterPro
>
> Definition =
>
> PANTHER version 6.1, 30128 entries, 04-OCT-2006
> PFAM version 21.0, 8957 entries, 22-NOV-2006
> PIRSF version 2.70, 2877 entries, 12-JUN-2007
> PRINTS version 38.0, 1900 entries, 22-SEP-2005
> PRODOM version 2005.1, 1522 entries, 23-APR-2004
> PROSITE version 20.0, 2006 entries, 14-NOV-2006
> SMART version 5.1, 724 entries, 27-JUL-2007
> TIGRFAMs version 7.0, 3423 entries, 28-SEP-2007
> GENE3D version 3.0.0, 2147 entries, 11-SEP-2006
> SSF version 1.69, 1538 entries, 30-NOV-2006
> SWISSPROT version 55.1, 359942 entries, 18-MAR-2008
> TREMBL version 38.1, 5443281 entries, 18-MAR-2008
> INTERPRO version 17.0, 16175 entries, 19-MAR-2008
> GO version N/A, 23937 entries, 27-MAR-2007
> MEROPS version 7.8, 2831 entries, 12-JUL-2007 |

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Sat Apr 12 18:43:25 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 18:43:25 -0400
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
Message-ID: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>

I'm not sure what you mean by 'Check interpro.xml', but you can use  
the --safe command-line option to keep going if an individual term  
fails to load for whatever reason.

Can you post the data for the seemingly offending record? (and please  
cc the list)
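
One way to pull a single record out of a large interpro.xml for posting is
sketched below. It is a plain line-based scan, not a BioPerl call: the sub
name extract_interpro_entry is made up for illustration, and the sketch
assumes that each entry opens with <interpro id="..."> on one line (as in
the interpro.xml excerpt posted earlier in this thread) and ends at the
first line containing </interpro>. The sample XML in the script is invented.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text of one <interpro> element, read
# line by line from an open filehandle. This is not an XML parser; it
# assumes the opening tag carries id="..." on a single line and that the
# entry ends at the first line containing </interpro>.
sub extract_interpro_entry {
    my ($fh, $id) = @_;
    my $inside = 0;
    my @lines;
    while (my $line = <$fh>) {
        $inside = 1 if !$inside && $line =~ /<interpro\s+id="\Q$id\E"/;
        next unless $inside;
        push @lines, $line;
        last if $line =~ m{</interpro>};
    }
    return join '', @lines;
}

# Small in-memory stand-in for interpro.xml (made-up content).
my $xml = <<'XML';
<interprodb>
  <interpro id="IPR000034" type="Family" short_name="Example_A">
    <name>Example A</name>
  </interpro>
  <interpro id="IPR000035" type="Family" short_name="Example_B">
    <name>Example B</name>
  </interpro>
</interprodb>
XML

open my $fh, '<', \$xml or die "Cannot open in-memory file: $!";
print extract_interpro_entry($fh, 'IPR000035');
close $fh;
```

For the real file, open interpro.xml instead of the in-memory string and
pass the accession of the record that failed to load.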

	-hilmar

On Apr 12, 2008, at 5:39 PM, David Burt wrote:

> Hi Hilmar
>
> Just checked mysql database and only have 39 entries under interpro  
> and loaded up to IPR000035
>
> Check interpro.xml looks OK from IPR000036 and onwards
>
> So seems to have crashed at IPR000035 ?
>
> dave

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From Russell.Smithies at agresearch.co.nz  Sun Apr 13 22:51:41 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 14 Apr 2008 14:51:41 +1200
Subject: [Bioperl-l] Tandem Repeats Finder?
In-Reply-To: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC><C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net><000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
	<FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06BEA87E@imail.agresearch.co.nz>

Has anyone tried TRF? 
I notice UCSC is using it for all their simple repeat annotations and
thought it might be better than what we're currently using (Sputnik).

And is there a BioPerl parser for its output, or am I going to have to write my own?

Thanx,


Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E russell.smithies at agresearch.co.nz

Invermay Research Centre
Puddle Alley,
Mosgiel,
New Zealand
T +64 3 489 3809
F +64 3 489 9174
www.agresearch.co.nz 


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From Russell.Smithies at agresearch.co.nz  Sun Apr 13 22:53:46 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 14 Apr 2008 14:53:46 +1200
Subject: [Bioperl-l] Tandem Repeats Finder?
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C03B09DE9@imail.agresearch.co.nz>
References: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
	<D5DBA313349A4B458528BE63B387F36C03B09DE9@imail.agresearch.co.nz>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06BEA881@imail.agresearch.co.nz>

Scratch the need for a parser.
I turned off html output and it's all nice white-space separated text  :-)

Russell
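
For whitespace-separated TRF output, a few lines of Perl may be enough.
The sketch below assumes the 15-column layout described in the TRF
documentation for its .dat file (start, end, period size, copy number,
consensus size, percent matches, percent indels, score, A/C/G/T
percentages, entropy, consensus, repeat sequence) and a "Sequence:" header
line before each block. The field names and the sub name parse_trf_dat are
illustrative, not part of BioPerl, and the sample data is invented; check
the column order against your own output before relying on it.

```perl
use strict;
use warnings;

# Assumed column names for one TRF .dat record (illustrative names).
my @TRF_FIELDS = qw(start end period_size copy_number consensus_size
                    percent_matches percent_indels score
                    pct_a pct_c pct_g pct_t entropy
                    consensus repeat_sequence);

# Read TRF .dat text from a filehandle and return a list of hash refs,
# one per repeat. Lines that do not look like records are skipped.
sub parse_trf_dat {
    my ($fh) = @_;
    my ($seq_id, @repeats);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^Sequence:\s*(\S+)/) {
            $seq_id = $1;                 # remember the current sequence
            next;
        }
        next unless $line =~ /^\d+\s+\d+\s/;
        my @values = split ' ', $line;
        next unless @values == @TRF_FIELDS;
        my %rec;
        @rec{@TRF_FIELDS} = @values;
        $rec{seq_id} = $seq_id;
        push @repeats, \%rec;
    }
    return @repeats;
}

# Made-up sample in the assumed layout.
my $dat = <<'DAT';
Sequence: chr_test

Parameters: 2 7 7 80 10 50 500

1 30 3 10.0 3 100 0 60 33 33 33 0 1.58 ACG ACGACGACGACGACGACGACGACGACGACG
DAT

open my $fh, '<', \$dat or die "Cannot open in-memory file: $!";
my @repeats = parse_trf_dat($fh);
close $fh;

for my $r (@repeats) {
    print join("\t", @{$r}{qw(seq_id start end period_size consensus)}), "\n";
}
```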


From csaba.ortutay at gmail.com  Mon Apr 14 00:15:22 2008
From: csaba.ortutay at gmail.com (Ortutay Csaba =?iso-8859-1?q?P=E9ter?=)
Date: Mon, 14 Apr 2008 07:15:22 +0300
Subject: [Bioperl-l] Tandem Repeats Finder?
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06BEA87E@imail.agresearch.co.nz>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
	<D5DBA313349A4B458528BE63B387F36C06BEA87E@imail.agresearch.co.nz>
Message-ID: <200804140715.22702.csaba.ortutay@gmail.com>

Hello, I have used TRF in my earlier projects. It is a nice and quick tool.

There were no ready-made parsers at that time (5-6 years ago), so we wrote 
our own.

Csaba

> Has anyone tried TRF?
> I notice UCSC is using it for all their simple repeat annotations and
> thought it might be better than what we're currently using (Sputnik)
>
> And is there a BioPerl parser for its output or am I going to have to
> write my own ?
>
> Thanx,


-- 
Csaba Ortutay PhD
IMT Bioinformatics
University of Tampere
Finland


From avilella at gmail.com  Mon Apr 14 07:13:26 2008
From: avilella at gmail.com (Albert Vilella)
Date: Mon, 14 Apr 2008 12:13:26 +0100
Subject: [Bioperl-l] how can I print a Bio::Tree newick sortby given list?
Message-ID: <358f4d650804140413x4271f18bx40af1b9054306df8@mail.gmail.com>

Hi,

I have a newick file that I want to sort by a given order and print again as
newick.
For example, if I have

(((ENSPTRG00000013811:0.0011,ENSG00000142192:0.0021):0.0033,ENSPPYG00000003902:0.0326):0.0000,ENSMMUG00000014384:0.0366):0.3638;

I want to sort it by "ENSG:ENSPTRG:ENSPPYG:ENSMMUG".

Any suggestions on how to do this in bioperl?

Cheers,

    Albert.
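
A minimal sketch of one possible approach, in plain Perl rather than
through the Bio::Tree API: parse the Newick string into nested hashes, rank
each leaf by the position of its ID prefix in the wanted order, give every
internal node the best rank found below it, sort the children by rank, and
write the tree out again. All sub names here are made up for illustration,
and the parser only handles simple Newick (no quoted labels or comments).

```perl
use strict;
use warnings;

# Parse one node starting at offset $$pos of $$text (recursive descent).
sub _parse_node {
    my ($text, $pos) = @_;
    my $node = { children => [], label => '', length => undef };
    if (substr($$text, $$pos, 1) eq '(') {
        do {
            $$pos++;                          # skip '(' or ','
            push @{ $node->{children} }, _parse_node($text, $pos);
        } while (substr($$text, $$pos, 1) eq ',');
        die "expected ')' at offset $$pos\n"
            unless substr($$text, $$pos, 1) eq ')';
        $$pos++;                              # skip ')'
    }
    # Optional label followed by optional ":branch_length".
    if (substr($$text, $$pos) =~ /^([^():,;]*)(?::([^():,;]*))?/) {
        $node->{label}  = $1;
        $node->{length} = $2;
        $$pos += $+[0];
    }
    return $node;
}

# Rank leaves by prefix position; internal nodes get the lowest rank of
# their leaves. Children are sorted by rank on the way back up.
sub _rank_node {
    my ($node, $order) = @_;
    my $rank = scalar @$order;                # default: sort last
    if (!@{ $node->{children} }) {
        for my $i (0 .. $#$order) {
            if (index($node->{label}, $order->[$i]) == 0) {
                $rank = $i;
                last;
            }
        }
    }
    else {
        for my $child (@{ $node->{children} }) {
            my $r = _rank_node($child, $order);
            $rank = $r if $r < $rank;
        }
        @{ $node->{children} } =
            sort { $a->{rank} <=> $b->{rank} } @{ $node->{children} };
    }
    return $node->{rank} = $rank;
}

sub _to_newick {
    my ($node) = @_;
    my $s = '';
    $s .= '(' . join(',', map { _to_newick($_) } @{ $node->{children} }) . ')'
        if @{ $node->{children} };
    $s .= $node->{label};
    $s .= ':' . $node->{length} if defined $node->{length};
    return $s;
}

# Return the Newick string with children reordered by ID prefix.
sub sort_newick {
    my ($newick, @order) = @_;
    (my $text = $newick) =~ s/\s+//g;
    my $pos  = 0;
    my $tree = _parse_node(\$text, \$pos);
    _rank_node($tree, \@order);
    return _to_newick($tree) . ';';
}

my $in = '(((ENSPTRG00000013811:0.0011,ENSG00000142192:0.0021):0.0033,'
       . 'ENSPPYG00000003902:0.0326):0.0000,ENSMMUG00000014384:0.0366):0.3638;';
print sort_newick($in, split /:/, 'ENSG:ENSPTRG:ENSPPYG:ENSMMUG'), "\n";
```

With the example tree, only the innermost pair changes places, so that the
ENSG leaf comes before the ENSPTRG leaf. Prefixes are matched in the order
given, so a prefix that is itself the start of another prefix should be
listed after the longer one.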


From lamq at usal.es  Mon Apr 14 11:01:51 2008
From: lamq at usal.es (Luis A. M. Quintales)
Date: Mon, 14 Apr 2008 17:01:51 +0200
Subject: [Bioperl-l] xyplot glyph: scale problems
Message-ID: <480371DF.7040900@usal.es>

I have a problem with the scale numbers calculated by the xyplot glyph.

The shape of the graph looks fine, but the scale number 10 and its 
position in the output are not correct.

I am sending the source code, a simplified input file and the PNG output.

Thank you


Source code

ex1.pl  (also in http://avellano.usal.es/~luis/bioperl-l/ex1.pl)
============================
#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::GFF;
use Bio::Graphics::Panel;

# Load the GFF file named on the command line into an in-memory database.
my $filin = $ARGV[0] or die "usage: $0 in1.gff\n";
my $db = Bio::DB::GFF->new( -dsn => $filin, -adaptor => 'memory',
                            -aggregator => 'at{atpc:atfreq}' );
my $segment  = $db->segment('chr1');
my @features = $segment->features('at');
my $panel = Bio::Graphics::Panel->new(
       -offset    => 0,   -grid      => 100,
       -length    => 500, -width     => 800,
       -pad_left  => 50,  -pad_right => 50 );
$panel->add_track($segment, -glyph   => 'generic',
                            -bgcolor => 'blue', -label => 1);
$panel->add_track(\@features,
                    -glyph      => 'xyplot',
                    -graph_type => 'boxes',
                    -scale      => 'left',
                    -height     => 200,
 );
# Write the rendered panel to sal.png.
open(my $fi, '>', 'sal.png') or die "Cannot write sal.png: $!";
binmode $fi;
print {$fi} $panel->png;
close $fi;
============================

in1.gff file (also in http://avellano.usal.es/~luis/bioperl-l/in1.gff)
============================
##sequence-region chr1 1 5578650
chr1    atfreq    atpc    1    10       64.0000    .    .    atpc 1
chr1    atfreq    atpc    11    20       63.0000    .    .    atpc 1
chr1    atfreq    atpc    21    30       62.0000    .    .    atpc 1
chr1    atfreq    atpc    31    40       59.0000    .    .    atpc 1
chr1    atfreq    atpc    41    50       59.0000    .    .    atpc 1
chr1    atfreq    atpc    51    60       59.0000    .    .    atpc 1
chr1    atfreq    atpc    61    70       59.0000    .    .    atpc 1
chr1    atfreq    atpc    71    80       59.0000    .    .    atpc 1
chr1    atfreq    atpc    81    90       61.0000    .    .    atpc 1
chr1    atfreq    atpc    91    100       60.0000    .    .    atpc 1
chr1    atfreq    atpc    101    110       60.0000    .    .    atpc 1
chr1    atfreq    atpc    111    120       64.0000    .    .    atpc 1
chr1    atfreq    atpc    121    130       64.0000    .    .    atpc 1
chr1    atfreq    atpc    131    140       60.0000    .    .    atpc 1
chr1    atfreq    atpc    141    150       60.0000    .    .    atpc 1
chr1    atfreq    atpc    151    160       63.0000    .    .    atpc 1
chr1    atfreq    atpc    161    170       62.0000    .    .    atpc 1
chr1    atfreq    atpc    171    180       59.0000    .    .    atpc 1
chr1    atfreq    atpc    181    190       54.0000    .    .    atpc 1
chr1    atfreq    atpc    191    200       53.0000    .    .    atpc 1
chr1    atfreq    atpc    201    210       54.0000    .    .    atpc 1
chr1    atfreq    atpc    211    220       50.0000    .    .    atpc 1
chr1    atfreq    atpc    221    230       51.0000    .    .    atpc 1
chr1    atfreq    atpc    231    240       56.0000    .    .    atpc 1
chr1    atfreq    atpc    241    250       58.0000    .    .    atpc 1
chr1    atfreq    atpc    251    260       55.0000    .    .    atpc 1
chr1    atfreq    atpc    261    270       54.0000    .    .    atpc 1
chr1    atfreq    atpc    271    280       56.0000    .    .    atpc 1
chr1    atfreq    atpc    281    290       59.0000    .    .    atpc 1
chr1    atfreq    atpc    291    300       58.0000    .    .    atpc 1
chr1    atfreq    atpc    301    310       60.0000    .    .    atpc 1
chr1    atfreq    atpc    311    320       59.0000    .    .    atpc 1
chr1    atfreq    atpc    321    330       59.0000    .    .    atpc 1
chr1    atfreq    atpc    331    340       57.0000    .    .    atpc 1
chr1    atfreq    atpc    341    350       56.0000    .    .    atpc 1
chr1    atfreq    atpc    351    360       57.0000    .    .    atpc 1
chr1    atfreq    atpc    361    370       57.0000    .    .    atpc 1
chr1    atfreq    atpc    371    380       58.0000    .    .    atpc 1
chr1    atfreq    atpc    381    390       56.0000    .    .    atpc 1
chr1    atfreq    atpc    391    400       58.0000    .    .    atpc 1
chr1    atfreq    atpc    401    410       56.0000    .    .    atpc 1
chr1    atfreq    atpc    411    420       59.0000    .    .    atpc 1
chr1    atfreq    atpc    421    430       58.0000    .    .    atpc 1
chr1    atfreq    atpc    431    440       59.0000    .    .    atpc 1
chr1    atfreq    atpc    441    450       58.0000    .    .    atpc 1
chr1    atfreq    atpc    451    460       58.0000    .    .    atpc 1
chr1    atfreq    atpc    461    470       56.0000    .    .    atpc 1
chr1    atfreq    atpc    471    480       57.0000    .    .    atpc 1
chr1    atfreq    atpc    481    490       59.0000    .    .    atpc 1
============================


The sal.png :
http://avellano.usal.es/~luis/bioperl-l/sal.png

Thank you.
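
When the axis labels look wrong, one thing worth checking is the score
range the glyph has to work with. The sketch below reads the score column
(column 6) of a GFF file and reports its minimum and maximum. The sub name
score_range is made up for illustration. To the best of my knowledge the
xyplot glyph accepts -min_score and -max_score options, so the two values
could be passed to add_track() to pin the axis; whether that changes the
label placement reported here is untested.

```perl
use strict;
use warnings;

# Hypothetical helper: return (min, max) of the numeric score column
# (column 6) of GFF lines read from a filehandle. Comment lines and
# lines without a numeric score are skipped.
sub score_range {
    my ($fh) = @_;
    my ($min, $max);
    while (my $line = <$fh>) {
        next if $line =~ /^#/;
        my @col = split ' ', $line;
        next unless @col >= 6 && $col[5] =~ /^-?\d+(?:\.\d+)?$/;
        $min = $col[5] if !defined $min || $col[5] < $min;
        $max = $col[5] if !defined $max || $col[5] > $max;
    }
    return ($min, $max);
}

# Three lines taken from in1.gff as an in-memory sample.
my $gff = <<'GFF';
##sequence-region chr1 1 5578650
chr1    atfreq    atpc    1    10       64.0000    .    .    atpc 1
chr1    atfreq    atpc    211    220       50.0000    .    .    atpc 1
chr1    atfreq    atpc    481    490       59.0000    .    .    atpc 1
GFF

open my $fh, '<', \$gff or die "Cannot open in-memory file: $!";
my ($min, $max) = score_range($fh);
close $fh;
printf "score range: %s to %s\n", $min, $max;

# Possible use with the script above (assumed options, see lead-in):
#   $panel->add_track(\@features, -glyph => 'xyplot', -scale => 'left',
#                     -min_score => $min, -max_score => $max);
```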


-- 
==================================================
 Luis Antonio Miguel Quintales
 Departamento de Informática y Automática
 Facultad de Ciencias
 Universidad de Salamanca
 Plaza de la Merced s/n
 37008-SALAMANCA
 SPAIN
==================================================
 Tel.: +34-923-294400(ext.1513)
 Fax.: +34-923-294584
 E-mail: lamq at usal.es
==================================================


From aaron.j.mackey at gsk.com  Mon Apr 14 09:00:52 2008
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Mon, 14 Apr 2008 09:00:52 -0400
Subject: [Bioperl-l] personal vs list email
In-Reply-To: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>
Message-ID: <OF3ED0BD19.1CBA005A-ON8525742B.00473A95-8525742B.00477DEC@gsk.com>

I try to take it even one step further: I require the person to re-ask 
their question on the mailing list (and then try to answer it there). This 
has the added benefit of causing the person to pause a moment to reflect 
on their question, and (sometimes) to spend a bit more time preparing the 
question for broader public consumption.

-Aaron


From sutripa at vbi.vt.edu  Mon Apr 14 12:54:47 2008
From: sutripa at vbi.vt.edu (Sucheta Tripathy)
Date: Mon, 14 Apr 2008 12:54:47 -0400 (EDT)
Subject: [Bioperl-l] Error installing XML::Parser
Message-ID: <1285.99.152.150.87.1208192087.squirrel@webmail.vbi.vt.edu>


Hello List,

I have recently installed bioperl using the following command. The
installation was successful. Now I am trying to install XML::Parser but it
returns with  error messages. Any clue what I may be doing wrong?

Thanks

Sucheta

Following is the last part of the error message:

### Error Message #######

Expat.c: In function 'XS_XML__Parser__Expat_SkipUntil':
Expat.c:2664: error: 'XML_Parser' undeclared (first use in this function)
Expat.c:2664: error: expected ';' before 'parser'
Expat.c:2665: warning: ISO C90 forbids mixed declarations and code
Expat.xs:2179: error: 'parser' undeclared (first use in this function)
Expat.xs:2179: warning: cast to pointer from integer of different size
Expat.xs:2180: error: 'CallbackVector' has no member named 'st_serial'
Expat.xs:2182: error: 'CallbackVector' has no member named 'skip_until'
Expat.c: In function 'XS_XML__Parser__Expat_Do_External_Parse':
Expat.c:2687: error: 'XML_Parser' undeclared (first use in this function)
Expat.c:2687: error: expected ';' before 'parser'
Expat.c:2688: warning: ISO C90 forbids mixed declarations and code
Expat.xs:2194: error: 'parser' undeclared (first use in this function)
Expat.xs:2194: warning: cast to pointer from integer of different size
Expat.xs:2205: warning: unused variable 'pret'
Expat.xs:2194: warning: unused variable 'cbv'
Expat.xs:2192: warning: unused variable 'type'
make[1]: *** [Expat.o] Error 1
make[1]: Leaving directory `/root/.cpan/build/XML-Parser-2.36/Expat'
make: *** [subdirs] Error 2
  /usr/bin/make  -- NOT OK
Running make test
  Can't test without successful make
Running make install
  make had returned bad status, install seems impossible

#####

-- 
Sucheta Tripathy, Ph.D.
Virginia Bioinformatics Institute Phase-I
Washington street.
Virginia Tech.
Blacksburg,VA 24061-0447
phone:(540)231-8138
Fax:  (540) 231-2606


From mmokrejs at ribosome.natur.cuni.cz  Tue Apr 15 06:45:48 2008
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Tue, 15 Apr 2008 12:45:48 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <CA410982-12F9-4289-8B54-87BE33A38085@uiuc.edu>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>	<47F9F3AA.2090003@uv.es>
	<200804071448.34769.heikki@sanbi.ac.za>	<2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>	<47FA4AD2.5030206@uv.es>
	<CA410982-12F9-4289-8B54-87BE33A38085@uiuc.edu>
Message-ID: <4804875C.80506@ribosome.natur.cuni.cz>

Chris Fields wrote:
> Note in the example I gave that, during the revision history, the 
> DBSOURCE changed at the point of the creation date (the original nuc.
>  record was a M. tuberculosis contig sequence, which later changed to
> an updated full M. tuberculosis genome record at the time of the
> 'create date').
> 
> Couldn't find anything specific in the GenBank docs on this, but it 
> appears (at least for a protein record) the creation date reflects
> the date in which the sequence was either originally deposited or
> originally derived from the nucleotide source record present in the
> record.  In other words, it may not reflect the original date of
> deposition (which could have come from a different record, as in this
> case).
> 
> chris

Hi,
I have a few answers from NCBI staff to similar questions I asked in the
past about DATE issues and VERSION numbers not being increased upon
"changes" in a record.
Below I have tried to put my earlier correspondence into a more readable form.
I hope this helps everybody understand what happens in the black box. ;)
Martin


Date: Thu, 17 Jan 2002 15:40:07 -0500 (EST)
From: David Wheeler
Subject: Brucella_melitensis on ftp site

> Hi, I'd like to point you to the fact, that the descriptions of 
> Brucella_melitensis differ in 
> ftp.ncbi.nih.nlm.gov/genomes/Bacteria/Brucella_melitensis and 
> ftp.ncbi.nih.nlm.gov/genbank/genomes/Bacteria/Brucella_melitensis
> 
> Namely, the description of the strain is retained in *.gbk files
> under /genomes/Bacteria/Brucella_melitensis only under the strain
> description field, but not in the DEFINITION line, where it is
> present in *.gbk files under
> /genbank/genomes/Bacteria/Brucella_melitensis.
> 
> LOCUS       NC_003318 1177787 bp    DNA   circular  BCT 13-NOV-2001
> DEFINITION  Brucella melitensis chromosome II, complete sequence.
> ACCESSION   NC_003318
> VERSION     NC_003318.1  GI:17988344
> 
> compared to
> 
> LOCUS       AE008918  1177787 bp    DNA   circular  BCT 27-DEC-2001
> DEFINITION  Brucella melitensis strain 16M chromosome II, complete sequence.
> ACCESSION   AE008918
> VERSION     AE008918
> 
> This makes me worried about the data. Why is the release date of the
> NON-curated file (AE008918) newer than the release date of the CURATED
> data (NC_003318)? Is this expected? Could someone explain the
> difference between them (i.e. CURATED vs. NONCURATED)?

The curated record is initially a copy of the non-curated record with certain 
changes in documentation made in order to comply with the NCBI standard for 
reference genomes. One change which you have noticed is the difference in 
Definition line format.  Curated genomic records are created in order to 
standardize annotation for genomes in the Entrez Genomes database while leaving 
editorial control for the parent GenBank records in the hands of the original 
submitters.

Regardless of the date you see on the record, the curated version is derived from 
the non-curated one.  In this case, it appears that the processing of the 
non-curated version lagged a little bit relative to that of the curated version. 
Normally, however, the non-curated version will have the earlier date.


Date: Sun, 27 Jan 2002 00:16:55 -0500 (EST)
From: David Wheeler
Subject: Re: CONSULT: Brucella_melitensis on ftp site

> Are the raw sequence data always same in non-curated and curated 
> flatfiles?
> 
> Is the annotation of orf's/proteins different between them?
> 
> Are there any new or withdrawn orf's or proteins in the curated
> flatfiles compared to non-curated ones?
> 
> My feeling is that no-one except original submitters can modify
> submitted data, so you cannot modify non-curated files, i.e. cannot
> modify them and increase the version number.
> 
> Because of that, you've introduced curated versions, which are just
> copies of original but public data so you are free to modify it. So
> once again, are the differences between non-curated and curated
> flatfiles only in structure of the file? I don't think so. Examples
> would be Listeria genomes or the 2 Agrobacterium's, if I remember
> right.

Initially, there should be no or very few differences, however, as time
goes by, differences in the annotation will materialize.  There may also
be differences in the sequence, if errors in the original sequence come to
light, but these differences should be very rare.

So, practically speaking, you will probably find few differences but,
since the purpose of the Refseq is to curate, there may well be some
differences.


Date: Mon, 17 Dec 2001 11:57:06 -0500 (EST)
From: Dawn Lipshultz
Subject: Re: Buggy date in Staphylococcus aureus N315

>>>> Hi, I've found that Staphylococcus aureus N315 is shown as released
>>>> on 01-JAN-1900, which is nonsense. I guess you had a Y2K bug.
>>>> 
>>>> 
>>>> Please see
>>>> 
>> ftp://ncbi.nlm.nih.gov/genbank/genomes/Bacteria/Staphylococcus_aureus_N315/BA000018.gbk
>> 
>>>> 
>>>> Can you please tell me the real release date?
>>>> 
>>>> Also, which is newer: the NC_xxxx record for Staphylococcus aureus N315 under
>>>>  
>>>> ftp://ncbi.nlm.nih.gov/genomes/Bacteria/Staphylococcus_aureus_N315/
>>>>  or this non-curated BA000018 version?
>>>> 
>>>> 
>>>> LOCUS       BA000018  2814816 bp    DNA   circular  BCT 01-JAN-1900
>>>> DEFINITION  Staphylococcus aureus strain N315, complete genome.

>> I cannot find the record to which you refer in your message. When I
>>  did a search for accession number BA000018, I received results for
>>  accession numbers AP003129-AP003138. They are all dated June 2001.
>> 
>> 
>> The date for the record in the ftp file is April 2001. The record
>> in GenBank (NC_002745) is dated October 2001. This version is
>> apparently more updated than the one on the ftp site. Therefore,
>> you may want to download the sequence from GenBank rather than the
>> ftp site. Regards, Dawn S. Lipshultz

> 
> Hmm, but I do get: 
> http://www.ncbi.nlm.nih.gov:80/cgi-bin/Entrez/framik?db=genome&gi=179
> 
> 
> look at the "GenBank: NC_002745" text in left upper part of the
> window, it points to that OLD ftp file. The "RefSeq: NC_002745"
> points to the April 2001 version. So what is the right way to get the
> October 2001 release?
> 
> Where can I find the difference between NC_002745 from April compared
>  to NC_002745 from October?
> 
> What do you mean with "you may want to download the sequence from 
> GenBank rather than the ftp site."?
> 
> BOTH ftp directories at ftp://ncbi.nlm.nih.gov are outdated. I mean 
> the genomes/Bacteria/Staphylococcus_aureus_N315/NC_002745.* version 
> and also the 
> genbank/genomes/Bacteria/Staphylococcus_aureus_N315/BA000018.* 
> version.
> 
> The web links from www.ncbi.nlm.nih.gov:80/cgi-bin/Entrez/ point 
> anyway to the ftp site. Do you want to say that the ftp version
> aren't updated anymore?

The genome was originally released into the database on 4/20/2001
as 10 pieces with secondary accession number BA000018.  You can 
find these pieces in Entrez nucleotides by querying with BA000018.

The Genomes group here will fix the date on the record that is available
from Entrez genomes.

Regards,
Dawn


Date: Fri, 16 Nov 2001 16:09:59 -0500 (EST)
From: Susan Dombrowski
Subject: Re: Agrobacterium tumefaciens C58

> Dear colleague, I've noticed that the genomic flatfiles of
> Agrobacterium tumefaciens C58 at 
> ftp://ncbi.nlm.nih.gov/genbank/genomes/Bacteria/Agrobacterium_tumefaciens/
> were somehow updated on Oct 17. However, AE007869.gbs, for example,
> does NOT explain what has been changed, and the VERSION number has not
> been increased. Would you please explain what changed, and where I can
> find such information on the web next time?
> 
> I used the published sequence from your ftp site on 2001-08-29
> with the same ID and would like to know what differs.
> 
> LOCUS       AE007869  2841581 bp    DNA   circular  CON 17-OCT-2001
> DEFINITION  Agrobacterium tumefaciens strain C58 circular chromosome,
>             complete sequence.
> ACCESSION   AE007869
> VERSION     AE007869

Dear Colleague,
The version number of a sequence will *only* change if the content of the actual 
sequence has changed in any way since it was first made available. Although the 
date has changed, this date refers to the last time the actual record was 
manipulated by an NCBI staff member. Even if there is something simple, like 
adding a reference, changing a spelling mistake, etc., this will cause a change 
in the date field of the record. 

Thus, since the version has not changed, there are no differences to report.
Best Regards,
Susan


Date: Wed, 26 Jun 2002 11:04:48 -0400 (EDT)
From: Eric Sayers
Subject: Re: Mesorhizobium_loti flatfiles

>>>>> Hi,
>>>>>   I've found that some time ago you again silently changed flatfiles on
>>>>> your ftp site without changing the revision number. Please excuse me,
>>>>> but this really causes trouble for other people working in this so-called
>>>>> bioinformatics. :(
>>>>> 
>>>>> A week ago there was:
>>>>> 
>>>>> LOCUS       NC_002678            7036074 bp    DNA     circular BCT 10-SEP-2001
>>>>> DEFINITION  Mesorhizobium loti, complete genome.
>>>>> ACCESSION   NC_002678
>>>>> VERSION     NC_002678.1  GI:13470324
>>>>> 
>>>>> 
>>>>> and two other plasmid sequences. This yields 7275 proteins.
>>>>> 
>>>>> But, last autumn there was:
>>>>> 
>>>>> LOCUS       NC_002678 7036074 bp    DNA   circular  BCT       28-MAR-2001
>>>>> DEFINITION  Mesorhizobium loti, complete genome.
>>>>> ACCESSION   NC_002678
>>>>> VERSION     NC_002678.1  GI:13470324
>>>>> 
>>>>> 
>>>>> That version had 7281 proteins in total.
>>>>> I have a simple question: "Why was the VERSION number NOT changed?".
>>>>>
>>>>> Am I wrong in understanding that it should be updated whenever a single
>>>>> character in the file contents is changed?
>>> 
>>>> The version number of a sequence only changes if the sequence itself is
>>>> modified. If anything else in the flat file is changed (ie spelling, authors,
>>>> annotations, etc) the version will not change. However, the modification date in
>>> 
>>> Sorry, by annotation do you also mean the number of predicted genes, their
>>> coordinates (positions), etc.?
>>> 
>>>> the top line of the flat file will change for any of these modifications. (Note
>>>> that the dates are different in the file you display: Mar 28, 2001 vs Sept 10,
>>>> 2001.) I would track the modification date rather than or as well as the version
>>>> number to catch all changes in the files.
>>>> Regards,
>>>> Eric W. Sayers, Ph.D.
>>> 
>>> OK, but unless some of our programs were buggy then or are buggy now (and
>>> in either case failed to extract genes from the flatfiles), I do
>>> not have an explanation for the difference in the number of
>>> predicted/annotated genes.
>>> 
>>> I no longer have the old flatfiles from Mar 28 available, but it
>>> seems to me that these were newly introduced in the Sept. 10 version:
>>> gi_15600768, gi_15600770, gi_15600769, gi_15600766, gi_15600767
>> 
>> Dear Colleague,
>> Again, the only reason the version number will change is if the sequence itself 
>> changes. The number of annotated/predicted genes is merely an annotation on the 
>> sequence, and does not change the sequence itself. Therefore, the version will 
>> not change when the number of annotations changes. The modification date on the 
>> flat file will (and did) change, of course.
>> 
>> Regards,
>> Eric W. Sayers, Ph.D.
> 
> Finally I've heard that from someone, thanks!
> Now just tell me, how can I figure out what changed between those
> different "date" releases? Is there a changelog available?
> I consider annotation changes very important.

We do not provide the details of flat file changes on our public websites, 
except for changes in the version number (ie actual sequence changes). In that 
particular case, all of the previous versions are linked to the current one. My 
advice to you if you want to chronicle non-sequence changes would be to check 
the flat files of interest periodically (by a script, for example) and look for 
changes in the modification dates. You could then simply compare the before and 
after flat files.

Regards,
Eric W. Sayers, Ph.D.
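
The advice above (track the modification date as well as the version
number, then compare snapshots) could be scripted roughly as follows.
This is only a minimal sketch in plain Perl, not NCBI tooling: the
helper names locus_date_and_version and describe_change are made up
for illustration, the sample records are trimmed from the ones quoted
in this thread, and it assumes the date is the last field of the
LOCUS line.

```perl
use strict;
use warnings;

# Extract the modification date (last field of the LOCUS line) and the
# VERSION value from the text of a GenBank flat file.
sub locus_date_and_version {
    my ($text) = @_;
    my ($date)    = $text =~ /^LOCUS\s.*?(\d{2}-[A-Z]{3}-\d{4})\s*$/m;
    my ($version) = $text =~ /^VERSION\s+(\S+)/m;
    return ($date, $version);
}

# Classify the difference between two snapshots of the same record.
sub describe_change {
    my ($old_text, $new_text) = @_;
    my ($old_date, $old_ver) = locus_date_and_version($old_text);
    my ($new_date, $new_ver) = locus_date_and_version($new_text);
    return 'sequence changed (VERSION differs)' if $old_ver ne $new_ver;
    return 'non-sequence change (date differs)' if $old_date ne $new_date;
    return 'no change detected';
}

# Two snapshots of the same record, as quoted earlier in this thread.
my $march = <<'END';
LOCUS       NC_002678 7036074 bp    DNA   circular  BCT       28-MAR-2001
DEFINITION  Mesorhizobium loti, complete genome.
ACCESSION   NC_002678
VERSION     NC_002678.1  GI:13470324
END

my $september = <<'END';
LOCUS       NC_002678            7036074 bp    DNA     circular BCT 10-SEP-2001
DEFINITION  Mesorhizobium loti, complete genome.
ACCESSION   NC_002678
VERSION     NC_002678.1  GI:13470324
END

print describe_change($march, $september), "\n";
```

Once a date change is flagged, the two saved files can be compared
with an ordinary diff to see which annotations were touched.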


> Hi, Miguel:
> 
> id1_fetch can do it. Detailed instruction can be found at:  
> 
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id1_fetch.html
> 
> Here is an example:
> 
>> >id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
> GI        Loaded      DB    Retrieval No.
> --        ------      --    -------------
> 74311105  12/07/2007  NCBI  19766263
> 74311105  01/23/2007  NCBI  16325656
> 74311105  03/30/2006  NCBI  13131204
> 74311105  03/03/2006  NCBI  12915541
> 74311105  03/02/2006  NCBI  12885275
> 74311105  12/03/2005  NCBI  12259793
> 74311105  09/09/2005  NCBI  11257262
> 74311105  09/09/2005  NCBI  11242667
> 
> Wenwu Cui PhD


From david at burt7259.freeserve.co.uk  Sun Apr 13 10:32:31 2008
From: david at burt7259.freeserve.co.uk (David Burt)
Date: Sun, 13 Apr 2008 15:32:31 +0100
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <3F77F49A-9C9E-4450-AE28-46F00CADBC8B@gmx.net>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce2$5400a710$0202a8c0@STUDYPC>
	<3F77F49A-9C9E-4450-AE28-46F00CADBC8B@gmx.net>
Message-ID: <000001c89d73$3b49eec0$0202a8c0@STUDYPC>

Hi Hilmar

Many thanks for the info. I tried a few things:

1. First I tried the --safe flag:

perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser root \
  --dbpass chicken --driver mysql --safe \
  --namespace "InterPro" --format interprosax interpro.xml

I still got the same output as before:

        ...deleting all relationships for InterPro
        ...parsing and loading InterPro

Can't call method "name" on an undefined value at load_ontology.pl line 914

Only 35 InterPro entries were entered into the database.

2. I am using bioperl 1.5.2

3. I downloaded Release 17.0, 20 March 2008 of the interpro.xml file from
ftp://ftp.ebi.ac.uk/pub/databases/interpro/

I did not send this file, since it was ~10Mb gzipped

Dave

 
From david at burt7259.freeserve.co.uk  Sun Apr 13 10:53:43 2008
From: david at burt7259.freeserve.co.uk (David Burt)
Date: Sun, 13 Apr 2008 15:53:43 +0100
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
	<FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
Message-ID: <000001c89d76$319be060$0202a8c0@STUDYPC>

Hilmar

I also updated my copy of bioperl; see the output below.

root at STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src
$ perl -MBio::Perl -le 'print Bio::Perl->VERSION;'
1.005002101

root at STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src
$ cvs -d :pserver:cvs at cvs.bioperl.org:/home/repository/bioperl login
Logging in to :pserver:cvs at cvs.bioperl.org:2401/home/repository/bioperl
CVS password:

root at STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src
$ cd bioperl-live

root at STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src/bioperl-live
$ cvs -q update -d -P -r bioperl-release-1-5-2
P Build.PL
P ModuleBuildBioperl.pm
P Bio/Root/Version.pm
cvs update: warning: t/data/taxdump/names.dmp was lost
U t/data/taxdump/names.dmp
cvs update: warning: t/data/taxdump/nodes.dmp was lost
U t/data/taxdump/nodes.dmp

root at STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src/bioperl-live
$ perl -MBio::Perl -le 'print Bio::Perl->VERSION;'
1.0050021

Why is the VERSION 1.0050021 rather than 1.5.2?

 
Dave
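
One possible reading of these numbers, offered as an assumption and not
confirmed anywhere in this thread: they look like Perl's decimal version
convention, in which each group of three digits after the decimal point
stands for one component of a dotted version, so 1.005002 would
correspond to 1.5.2. The sketch below uses the core version module to
show that conversion; the helper name dotted_form is made up for
illustration.

```perl
use strict;
use warnings;
use version;

# Convert a decimal-style version string into its dotted ("normal") form.
sub dotted_form {
    my ($decimal) = @_;
    return version->parse($decimal)->normal;
}

# Strings are used so that no floating-point rounding is involved.
for my $v ('1.005002', '1.005002101', '1.0050021') {
    printf "%-12s => %s\n", $v, dotted_form($v);
}
```

Under this reading the trailing digits would be a fourth, finer-grained
component on top of 1.5.2 rather than a different release.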


From heikki at sanbi.ac.za  Wed Apr 16 07:36:16 2008
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed, 16 Apr 2008 13:36:16 +0200
Subject: [Bioperl-l] bioperl-microarray: status?
In-Reply-To: <AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
Message-ID: <200804161336.16879.heikki@sanbi.ac.za>

FYI,

Christopher Jones has just published 
[http://bioinformatics.oxfordjournals.org/cgi/content/short/24/8/1102 an 
article in Bioinformatics] about his 
[http://search.cpan.org/perldoc?Microarray Microarray perl module] in CPAN.

(This text has been added to the BioPerl wiki.)

	-Heikki


On Friday 26 January 2007 16:05:01 Chris Fields wrote:
> Don't know if it's worth it, but could the microarray package be
> modified so that it deals with data generated from or interacts
> directly with Bioconductor (i.e. maybe including some specialized
> bioperl-run set of classes to run Bioconductor tasks, return
> lightweight bioperl microarray classes)?  Allen pointed out in a
> previous post that Bioconductor is the best pick for certain tasks,
> while Perl excels at others:
>
> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>
> Might be nice if we could merge both strengths together in some way.
>
> chris
>
> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
> > On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
> >> Eh, there is some discussion activity on the list, but not much.  You
> >> are really better off moving to Bioconductor.
> >
> > Ok, thanks. I added that to the wiki page:
> >
> >     http://www.bioperl.org/wiki/Microarray_package
> >
> > j
> > seqlab.net
> > http://www.bioperl.org/wiki/User:Jhannah
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________


From pan.mueller at yahoo.de  Wed Apr 16 08:34:51 2008
From: pan.mueller at yahoo.de (=?iso-8859-1?Q?Peter_M=FCller?=)
Date: Wed, 16 Apr 2008 12:34:51 +0000 (GMT)
Subject: [Bioperl-l] load_seqdatabase.pl --pipeline
Message-ID: <297809.47580.qm@web28203.mail.ukl.yahoo.com>

Dear list,

I want to add gene symbols to UniGene clusters that are already stored in a BioSQL database but lack this information.

One way is to write a post-update script:

    my $adp  = $db->get_object_adaptor('Bio::ClusterI');
    my $pseq = $adp->find_by_primary_key(n);   # n = primary key of the cluster
    $adp->remove($pseq);
    $pseq->gene('symbol');
    $adp->store($pseq);
    $adp->commit();

OK, this works (though I wonder why the cluster has to be removed first: bug or feature?).

A second way, perhaps:
using the --pipeline option, but it looks like it is usable only for sequence objects (Bio::Factory::SequenceProcessorI), right?

regards
pan




From cjfields at uiuc.edu  Wed Apr 16 11:00:51 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 16 Apr 2008 10:00:51 -0500
Subject: [Bioperl-l] bioperl-microarray: status?
In-Reply-To: <200804161336.16879.heikki@sanbi.ac.za>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
Message-ID: <479BD5A4-9C9A-4733-889D-65942F24A7F3@uiuc.edu>

That would be worth looking into at some point, if anyone's interested  
(though it may be best to build a 'bridging' module).  Wonder if it  
uses BioConductor and, if not, how performance is vs BioConductor?

chris

On Apr 16, 2008, at 6:36 AM, Heikki Lehvaslaiho wrote:

> FYI,
>
> Christoper Jones has just published
> [http://bioinformatics.oxfordjournals.org/cgi/content/short/ 
> 24/8/1102 an
> article in Bioinformatics] about his
> [http://search.cpan.org/perldoc?Microarray Microarray perl module]  
> in CPAN.
>
> (The text added into BioPerl wiki.)
>
> 	-Heikki
>
>
> On Friday 26 January 2007 16:05:01 Chris Fields wrote:
>> Don't know if it's worth it, but could the microarray package be
>> modified so that it deals with data generated from or interacts
>> directly with Bioconductor (i.e. maybe including some specialized
>> bioperl-run set of classes to run Bioconductor tasks, return
>> lightweight bioperl microarray classes)?  Allen pointed out in a
>> previous post that Bioconductor is the best pick for certain tasks,
>> while Perl excels at others:
>>
>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>>
>> Might be nice if we could merge both strengths together in some way.
>>
>> chris
>>
>> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
>>> On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
>>>> Eh, there is some discussion activity on the list, but not much.   
>>>> You
>>>> are really better off moving to Bioconductor.
>>>
>>> Ok, thanks. I added that to the wiki page:
>>>
>>>    http://www.bioperl.org/wiki/Microarray_package
>>>
>>> j
>>> seqlab.net
>>> http://www.bioperl.org/wiki/User:Jhannah
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> -- 
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From j-keller2 at md.northwestern.edu  Wed Apr 16 12:12:27 2008
From: j-keller2 at md.northwestern.edu (Jacob Keller)
Date: Wed, 16 Apr 2008 11:12:27 -0500
Subject: [Bioperl-l] Finding seqs of given domain architecture
In-Reply-To: <200804161336.16879.heikki@sanbi.ac.za>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net><D6030075-C999-464B-A998-3C69346C7FB0@jays.net><AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
Message-ID: <B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>

Hello All,

I am new to this list and am not totally sure this is the right forum, so 
please forgive me if this is not the right place to ask the following question. 
I am seeking to get all sequences that have a given domain architecture, or 
at least that contain two given domains. I have thought of a few ways to do 
this.

1. Blast/Psi-blast for each domain, then compare the results for common 
sequences between the two lists, and fetch those. I would need to write a 
(simple) script to do this, but would prefer not to re-invent the wheel.

2. Search with a paradigm sequence of desired architecture/domain 
composition, somehow tweaking the psiblast parameters to find only matches 
over the whole search sequence, thereby finding both desired domains. I am 
not sure how to tweak blast to do this, though.

3. Pfam has this capability, i.e. to show all domains with a given 
architecture, but it is difficult to get at the actual sequences or even a 
list of accession numbers.
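
The comparison step in option 1 amounts to a set intersection over the
hit identifiers from the two searches. A minimal sketch in plain Perl,
assuming the hit accessions have already been extracted into two lists;
the helper name common_ids and the accessions shown are hypothetical
placeholders, not real search results:

```perl
use strict;
use warnings;

# Return the identifiers present in both lists, without duplicates,
# in the order they first appear in the first list.
sub common_ids {
    my ($first, $second) = @_;
    my %in_second = map { $_ => 1 } @$second;
    my %seen;
    return grep { $in_second{$_} && !$seen{$_}++ } @$first;
}

# Hypothetical hit accessions from two searches, one per domain.
my @domain_a_hits = qw(P11111 P22222 P33333 P22222);
my @domain_b_hits = qw(P33333 P44444 P22222);

print join("\n", common_ids(\@domain_a_hits, \@domain_b_hits)), "\n";
```

The resulting list of shared accessions could then be passed to
whatever fetching step is used to retrieve the sequences themselves.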

Does anybody have any suggestions as to how optimally to get these seq's?

Thanks for your consideration,

Jacob

*******************************************
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
Dallos Laboratory
F. Searle 1-240
2240 Campus Drive
Evanston IL 60208
lab: 847.491.2438
cel: 773.608.9185
email: j-keller2 at northwestern.edu
*******************************************

----- Original Message ----- 
From: "Heikki Lehvaslaiho" <heikki at sanbi.ac.za>
To: <bioperl-l at lists.open-bio.org>
Cc: <allenday at ucla.edu>; "Chris Fields" <cjfields at uiuc.edu>; "Jay Hannah" 
<jay at jays.net>; <bioperl-l at bioperl.org>
Sent: Wednesday, April 16, 2008 6:36 AM
Subject: Re: [Bioperl-l] bioperl-microarray: status?


> FYI,
>
> Christoper Jones has just published
> [http://bioinformatics.oxfordjournals.org/cgi/content/short/24/8/1102 an
> article in Bioinformatics] about his
> [http://search.cpan.org/perldoc?Microarray Microarray perl module] in 
> CPAN.
>
> (The text added into BioPerl wiki.)
>
> -Heikki
>
>
> On Friday 26 January 2007 16:05:01 Chris Fields wrote:
>> Don't know if it's worth it, but could the microarray package be
>> modified so that it deals with data generated from or interacts
>> directly with Bioconductor (i.e. maybe including some specialized
>> bioperl-run set of classes to run Bioconductor tasks, return
>> lightweight bioperl microarray classes)?  Allen pointed out in a
>> previous post that Bioconductor is the best pick for certain tasks,
>> while Perl excels at others:
>>
>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>>
>> Might be nice if we could merge both strengths together in some way.
>>
>> chris
>>
>> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
>> > On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
>> >> Eh, there is some discussion activity on the list, but not much.  You
>> >> are really better off moving to Bioconductor.
>> >
>> > Ok, thanks. I added that to the wiki page:
>> >
>> >     http://www.bioperl.org/wiki/Microarray_package
>> >
>> > j
>> > seqlab.net
>> > http://www.bioperl.org/wiki/User:Jhannah
>> >
>> > _______________________________________________
>> > Bioperl-l mailing list
>> > Bioperl-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> -- 
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From frederic.romagne at gmail.com  Wed Apr 16 13:25:18 2008
From: frederic.romagne at gmail.com (Frédéric Romagné)
Date: Wed, 16 Apr 2008 12:25:18 -0500
Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
Message-ID: <1208366718.19084.15.camel@kiss-laptop>

Hello,
I wrote a program that uses Bio::Index::GenBank and tested it under
Unix, where it worked well.

But I have to run it under Windows, and there it does not seem to work.

Here is the problem:

my $dbobj = Bio::Index::Abstract->new("Data/$db");
my $seq = $dbobj->get_Seq_by_acc($id);
print $seq->display_id."\n";

does not print the same identifier as $id, so I am not working on the
expected sequence.

I use the SVN sources on Unix and the Perl package manager on
Windows.

Thanks.


From cjfields at uiuc.edu  Wed Apr 16 13:52:59 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 16 Apr 2008 12:52:59 -0500
Subject: [Bioperl-l] Finding seqs of given domain architecture
In-Reply-To: <B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net><D6030075-C999-464B-A998-3C69346C7FB0@jays.net><AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
	<B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
Message-ID: <BAA878A0-94B4-481F-B01C-A12086FD41E3@uiuc.edu>

You can try CDART:

http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi?cmd=rps

There are probably other tools out there as well.

If you want to roll your own, you can use bioperl wrappers for all of  
these (Bio::Tools::Run::StandAloneBlast is in bioperl-live,  
Bio::Tools::Run::Hmmer in bioperl-run), tweaking the parameters as you  
see fit, and either parse while running them or store the file for  
parsing later using Bio::SearchIO.  Personally, I wouldn't go with (2)  
unless you are absolutely sure the domains are found only once per  
sequence, are spatially conserved, and don't overlap.  For instance,  
with many proteins you could have a domain structure like dom1-dom2,  
dom2-dom1, dom1-dom1-dom2, etc.
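
For option (1) in the question, the comparison step itself is small. Here is a
minimal sketch, assuming each search has already been parsed into a plain list
of accessions. The accessions and the sub name common_accessions are made up
for illustration and are not part of BioPerl:

```perl
use strict;
use warnings;

# Return the accessions present in both hit lists, in the order
# they appear in the first list, without duplicates.
sub common_accessions {
    my ($hits_a, $hits_b) = @_;
    my %in_b = map { $_ => 1 } @$hits_b;
    my %seen;
    return grep { $in_b{$_} && !$seen{$_}++ } @$hits_a;
}

# Hypothetical accessions, for illustration only.
my @dom1_hits = qw(P00001 P00002 P00003 P00002);
my @dom2_hits = qw(P00003 P00004 P00002);

print join(',', common_accessions(\@dom1_hits, \@dom2_hits)), "\n";
```

This only tells you which sequences hit both domains; it says nothing about
their order or copy number.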

If you just want accessions from Pfam's Stockholm format (which are  
UniProt, I believe) you can get at accessions using  
Bio::AlignIO::stockholm (using perl 5.10):

use Bio::AlignIO;
use feature 'say';

my $file = shift || die "Must pass file as argument\n";

my $in = Bio::AlignIO->new(-format => 'stockholm',
                            -file => $file);

while (my $aln = $in->next_aln) {
     my @accs;
     for my $seq ($aln->each_seq) {
         push @accs, $seq->accession_number;
     }
     say join(',', @accs);
}

chris

On Apr 16, 2008, at 11:12 AM, Jacob Keller wrote:

> Hello All,
>
> I am new to this list, so am not totally sure this is the right  
> forum, so please forgive if this is not the right place to ask the  
> following question: I am seeking to get all sequences that have a  
> given domain architecture, or at least that contain two given  
> domains. I have thought of a few ways to do this.
>
> 1. Blast/Psi-blast for each domain, then compare the results for  
> common sequences between the two lists, and fetch those. I would  
> need to write a (simple) script to do this, but would prefer not to  
> re-invent the wheel.
>
> 2. Search with a paradigm sequence of desired architecture/domain  
> composition, somehow tweaking the psiblast parameters to find only  
> matches over the whole search sequence, thereby finding both desired  
> domains. I am not sure how to tweak blast to do this, though.
>
> 3. Pfam has this capability, i.e. to show all domains with a given  
> architecture, but it is difficult to get at the actual sequences or  
> even a list of accession numbers.
>
> Does anybody have any suggestions as to how optimally to get these  
> seq's?
>
> Thanks for your consideration,
>
> Jacob
>
> *******************************************
> Jacob Pearson Keller
> Northwestern University
> Medical Scientist Training Program
> Dallos Laboratory
> F. Searle 1-240
> 2240 Campus Drive
> Evanston IL 60208
> lab: 847.491.2438
> cel: 773.608.9185
> email: j-keller2 at northwestern.edu
> *******************************************
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From David.Messina at sbc.su.se  Wed Apr 16 14:23:27 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 16 Apr 2008 20:23:27 +0200
Subject: [Bioperl-l] Finding seqs of given domain architecture
In-Reply-To: <B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
	<B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
Message-ID: <628aabb70804161123s453bd96bqd2213b938dfdb3a2@mail.gmail.com>

Hey Jacob,

This forum is mostly geared toward the BioPerl software package rather than
general bioinformatics assistance.

That being said, I would recommend using Pfam's Sequence Search to determine
the domain content of your sequences and then simply looking at those which
have the same two domains of interest.

If there are more sequences matching this criterion than can be examined
manually, you could write up something (potentially using BioPerl) to then
look at the relative order and number of those domains in your sequences.
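
As a rough sketch of that last step, assuming you already have the domain hits
for a sequence as [name, start] pairs: the domain names, coordinates and the
sub name architecture below are hypothetical, and this is plain Perl rather
than any particular BioPerl API:

```perl
use strict;
use warnings;

# Build an architecture string such as "dom1-dom1-dom2" from a list
# of [domain_name, start] pairs by sorting the hits on start position.
sub architecture {
    my @hits = @_;
    return join '-', map { $_->[0] } sort { $a->[1] <=> $b->[1] } @hits;
}

# Hypothetical hits on one sequence: [domain, start coordinate].
my @hits = (['dom2', 310], ['dom1', 12], ['dom1', 150]);
print architecture(@hits), "\n";

# Count how often each domain occurs on the sequence.
my %count;
$count{ $_->[0] }++ for @hits;
print "$_: $count{$_}\n" for sort keys %count;
```

Sequences can then be grouped by the architecture string. Overlapping hits
would need extra handling that this sketch leaves out.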

However, if these sequences have UniProt IDs, you can start with the domains
and Pfam will hand you a list of all the UniProt seqs having those domains.
On the Pfam website's main page, click on "Help" (right side of menu at the
top of the page) and then "Tools and Services" (left side menu).


Dave


From Russell.Smithies at agresearch.co.nz  Wed Apr 16 16:49:49 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 17 Apr 2008 08:49:49 +1200
Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
In-Reply-To: <1208366718.19084.15.camel@kiss-laptop>
References: <1208366718.19084.15.camel@kiss-laptop>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>

Did you check the format of your input file?
i.e. DOS or UNIX line endings?
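
A quick way to check, as a sketch only (the sub names clean_id and has_crlf are
made up here). Note that chomp removes only "\n" by default, so an ID read from
a DOS-format file can keep a trailing "\r" and then fail to match:

```perl
use strict;
use warnings;

# Strip a trailing LF or CRLF (and any other trailing whitespace).
sub clean_id {
    my ($line) = @_;
    $line =~ s/\s+\z//;
    return $line;
}

# Report whether a file contains any CRLF line endings.
sub has_crlf {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;    # keep "\r" visible on Windows as well
    while (my $line = <$fh>) {
        return 1 if $line =~ /\r\n\z/;
    }
    return 0;
}

print clean_id("AB000123\r\n"), "|\n";
```

If has_crlf reports true for the ID list or for the GenBank file, converting
the file or cleaning each ID before the lookup would be the first thing to try.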

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org]
> On Behalf Of Frédéric Romagné
> Sent: Thursday, 17 April 2008 5:25 a.m.
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
> 
> Hello,
> i made a program which use Bio::Index::GenBank and i tested it under
> unix, that worked well.
> 
> But i have to launch it under windows and it seems not to work on.
> 
> Here is the problem :
> 
> my $dbobj = Bio::Index::Abstract->new("Data/$db");
> my $seq = $dbobj->get_Seq_by_acc($id);
> print $seq->display_id."\n";
> 
> did not print the same number than $id !!! So i don't work on the
> sequence expected...
> 
> I use the SVN sources on unix and the Perl package manager for
> windows...
> 
> Thanks.
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From frederic.romagne at gmail.com  Wed Apr 16 17:39:07 2008
From: frederic.romagne at gmail.com (Frédéric Romagné)
Date: Wed, 16 Apr 2008 16:39:07 -0500
Subject: [Bioperl-l] index::abstract on win and unix
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
References: <1208366718.19084.15.camel@kiss-laptop>
	<D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
Message-ID: <1208381947.16620.6.camel@kiss-laptop>

Well, if by input file you mean the database used, it was created
with Bio::Index::GenBank from a GenBank file from the NCBI FTP site.

$id is an accession number read from a file, but I chomp the line.

I am trying to install the SVN version of BioPerl under Windows to see
if that improves things.

On Thursday 17 April 2008 at 08:49 +1200, Smithies, Russell wrote:
> Did you check the format of your input file?
> i.e. DOS or UNIX line endings?
> 


From hubert.gaynor at yahoo.com  Thu Apr 17 02:19:11 2008
From: hubert.gaynor at yahoo.com (Hubert Gaynor)
Date: Wed, 16 Apr 2008 23:19:11 -0700 (PDT)
Subject: [Bioperl-l] Can I use BLAST against a database like MySQL
Message-ID: <657734.41592.qm@web46008.mail.sp1.yahoo.com>

Hi,

As far as I know, before using BLAST to do an alignment, the first step is to run formatdb to construct a database. But I was wondering whether it is possible to construct the database with MySQL instead, which would probably make the BLAST search faster and the database management much easier?

Thanks!

Hubert.




From sdavis2 at mail.nih.gov  Thu Apr 17 06:36:32 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 17 Apr 2008 06:36:32 -0400
Subject: [Bioperl-l] Can I use BLAST against a database like MySQL
In-Reply-To: <657734.41592.qm@web46008.mail.sp1.yahoo.com>
References: <657734.41592.qm@web46008.mail.sp1.yahoo.com>
Message-ID: <264855a00804170336o2a2bcff9xfcb05a33bac4c8dc@mail.gmail.com>

On Thu, Apr 17, 2008 at 2:19 AM, Hubert Gaynor <hubert.gaynor at yahoo.com> wrote:
> Hi,
>
>  As far as I know, before using BLAST to do the alignment the first thing should be done is typing formatdb to construct a database. But I was wondering whether it is possible to construct a database with MySQL which probably will grant the BLAST search a higher speed and make the database management much easier?
>

formatdb is used to make a representation that can be used efficiently
by blast.  That representation already makes blast faster.  MySQL
can't be used for such things.  As for speeding up blast, if you have a
multiprocessor machine, you can take advantage of it by increasing the
number of processors blast uses (the -a option of blastall).  Also, while
blast is a very versatile program, it is not the only alignment program
available.  Depending on your needs, you could look at other programs
such as blat or gmap that can be 2-3 orders of magnitude faster than blast.
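
For reference, a sketch of the two legacy NCBI command lines involved, built
(but not run) from Perl. The file names, the processor count and the sub names
are placeholders, and a protein database is assumed:

```perl
use strict;
use warnings;

# Build the formatdb command: -i is the input FASTA file,
# -p is T for protein and F for nucleotide.
sub formatdb_cmd {
    my ($fasta, $is_protein) = @_;
    return ('formatdb', '-i', $fasta, '-p', $is_protein ? 'T' : 'F');
}

# Build the blastall command; -a sets the number of processors.
sub blastall_cmd {
    my (%arg) = @_;
    return ('blastall', '-p', $arg{program}, '-d', $arg{db},
            '-i', $arg{query}, '-o', $arg{out}, '-a', $arg{cpus});
}

print join(' ', formatdb_cmd('mydb.fasta', 1)), "\n";
print join(' ', blastall_cmd(program => 'blastp', db => 'mydb.fasta',
                             query => 'query.fasta', out => 'hits.txt',
                             cpus => 4)), "\n";
# To actually run a command: system(@cmd) == 0 or die "failed: $?";
```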

Sean


From stefan.kirov at bms.com  Thu Apr 17 09:40:29 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 09:40:29 -0400
Subject: [Bioperl-l] bioperl-db woes
Message-ID: <4807534D.80105@bms.com>

I'm having problems passing all the tests for bioperl-db. There are 2
distinct errors, first one:
Can't locate Bio/DB/BioSQL/RichSeqAdaptor.pm
   ***Which by the way is embedded deep in several layers of eval, so I
am getting the actual error from the test:
    ***t/04swiss.........ok 3/52Can't locate object method "get_dbxrefs"
via package "Bio::Ontology::Term" at    
       
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
line 552, <GEN0> line 78.
       or
       ------------- EXCEPTION: Bio::Root::Exception -------------

    MSG: Annotation of class Bio::Annotation::Collection not
    type-mapped. Internal error?
    STACK: Error::throw
    STACK: Bio::Root::Root::throw
    /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
    STACK:
    Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
    STACK: Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
    STACK: Bio::DB::Persistent::PersistentObject::store
    Bio/DB/Persistent/PersistentObject.pm:271
    STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
    Bio/DB/BioSQL/SeqAdaptor.pm:224
    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
    STACK: Bio::DB::Persistent::PersistentObject::create
    Bio/DB/Persistent/PersistentObject.pm:244
    STACK: t/04swiss.t:36
    -----------------------------------------------------------

It turns out the adaptor is really not there???
My bioperl-db is from
dev.open-bio.org/home/svn-repositories/bioperl/bioperl-db/trunk
bioperl-db (revision 14661)
Is this module being deprecated (I am sure it is not), or is my download
incomplete?
The other problem was:
DBD::Oracle::st execute failed: ORA-02292: integrity constraint
(BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
ParamValues: :p1=9606] at
/home/kirovs/bioperl-db/blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
line 320.
not ok 76
# Test 76 got: <UNDEF> (t/02species.t at line 71)
I have not tried to debug this one....
Thanks!
Stefan


From stefan.kirov at bms.com  Thu Apr 17 10:18:30 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 10:18:30 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
Message-ID: <Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>


On Thu, 17 Apr 2008, Chris Fields wrote:

> The 'get_dbxrefs' problem looks related to recent changes I made when rolling 
> back the significant feature/annotation changes introduced just prior to the 
> 1.5 release, none which were fully implemented.  I can check that one out. 
> Odd though; these passed for me, but I'm using MySQL not oracle.
get_dbxrefs is not the problem; I think the error message is misleading:
kirovs at horta:~/bioperl-db> grep get_dbxrefs 
/home/kirovs/bioperl-live/Bio/Ontology/Term.pm
            get_dbxrefs() instead, which handles both strings and DBLink
                       "Use get_dbxrefs() instead");
     $self->get_dbxrefs($context);
=head2 get_dbxrefs
  Title   : get_dbxrefs()
  Usage   : @ds = $term->get_dbxrefs();
sub get_dbxrefs {
} # get_dbxrefs
     my @old = $self->get_dbxrefs($context);
sub each_dblink {shift->throw("use of each_dblink() is deprecated; use 
get_dbxrefs() instead")}

So it is there.
In any case I debugged and tracked that down to the RichSeq adaptor module 
missing. It is not in the distro I downloaded, so I think this is my 
problem. It is a different question why...
I looked at different repos (SVN, CVS, trunk, different tags) and I did 
not see RichSeq.pm. I am not sure what is going on. Perhaps Hilmar will be 
able to help when he is around.
Thanks for the help Chris.... 
Stefan

>
> You may want to make sure you are using bioperl-live and that there isn't an 
> older bioperl installation getting into the mix.
>
> chris
>


From cjfields at uiuc.edu  Thu Apr 17 09:59:57 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 17 Apr 2008 08:59:57 -0500
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <4807534D.80105@bms.com>
References: <4807534D.80105@bms.com>
Message-ID: <82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>

The 'get_dbxrefs' problem looks related to recent changes I made when  
rolling back the significant feature/annotation changes introduced  
just prior to the 1.5 release, none which were fully implemented.  I  
can check that one out.  Odd though; these passed for me, but I'm  
using MySQL not oracle.

You may want to make sure you are using bioperl-live and that there  
isn't an older bioperl installation getting into the mix.

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From stefan.kirov at bms.com  Thu Apr 17 10:52:32 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 10:52:32 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <9ECDEB39-95F3-4A94-9AF7-FFEBBDEFF0FA@gmx.net>
References: <4807534D.80105@bms.com>
	<9ECDEB39-95F3-4A94-9AF7-FFEBBDEFF0FA@gmx.net>
Message-ID: <Pine.WNT.4.64.0804171052070.2732@A161887.one.ads.bms.com>

That is correct and I assumed I should not be concerned with this error.
Thanks
Stefan

On Thu, 17 Apr 2008, Hilmar Lapp wrote:

>
> On Apr 17, 2008, at 9:40 AM, Stefan Kirov wrote:
>> The other problem was:
>> DBD::Oracle::st execute failed: ORA-02292: integrity constraint
>> (BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
>> OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
>> ParamValues: :p1=9606] at
>
>
> This sounds like you are running the tests against a non-empty database?
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>


From hlapp at gmx.net  Thu Apr 17 10:47:58 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 17 Apr 2008 10:47:58 -0400
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
Message-ID: <2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>


On Apr 17, 2008, at 10:18 AM, Stefan Kirov wrote:
> In any case I debugged and tracked that down to the RichSeq adaptor  
> module missing.


That almost can't be the problem. Every Bio::Seq::RichSeq is-a  
Bio::Seq and a SeqAdaptor is present.

I'm afraid it gets stuck somewhere else and frankly I didn't see the  
RichSeqAdaptor failing to load in your stack trace:

>        ------------- EXCEPTION: Bio::Root::Exception -------------
>
>     MSG: Annotation of class Bio::Annotation::Collection not
>     type-mapped. Internal error?
>     STACK: Error::throw
>     STACK: Bio::Root::Root::throw
>     /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>     STACK:
>     Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>     Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>     STACK:  
> Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>     Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>     STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>     Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>     STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>     Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>     STACK: Bio::DB::Persistent::PersistentObject::store
>     Bio/DB/Persistent/PersistentObject.pm:271
>     STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>     Bio/DB/BioSQL/SeqAdaptor.pm:224
>     STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>     Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>     STACK: Bio::DB::Persistent::PersistentObject::create
>     Bio/DB/Persistent/PersistentObject.pm:244
>     STACK: t/04swiss.t:36
>     -----------------------------------------------------------

What that tells me is that when bioperl-db tries to store the  
annotation bundle of the (SwissProt) sequence, one of the annotations  
that it encounters is of type Bio::Annotation::Collection. At present  
bioperl-db doesn't know what to do with it; i.e., bioperl-db can't  
yet handle hierarchical annotation collections (collections within  
collections).

I believe this is due to recent changes in how the GN line is parsed  
in BioPerl. Chris, does this ring the right bell? I thought, though, that  
you had built in a method that would allow flattening it out?

It's worth noting that BioSQL itself can't really represent nested  
annotation collections other than by using ontology terms and their  
hierarchy, which at present I think isn't really appropriate, but I  
have to think through the issue more. In other words, in BioSQL you  
can't directly tie together a bunch of qualifier value pairs into a  
"bag" and then nest this bag within another. The way to make this  
work with the current schema is to flatten out the nesting.
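
To illustrate what "flattening" means here, the following is a minimal
sketch in plain Perl. It does not use bioperl-db or any Bio::Annotation
class; the nested hash, the gene names in it, and the flatten() helper
are all hypothetical, and joining nested keys with a dot is only one
possible convention for keeping the hierarchy visible in a flat list of
tag/value pairs.

```perl
use strict;
use warnings;

# Hypothetical nested structure, loosely modelled on a parsed GN line.
# This is plain Perl data, not a Bio::Annotation::Collection.
my %nested = (
    gene_name => {
        Name     => 'abcD',
        Synonyms => [ 'abcD1', 'abcD2' ],
    },
    comment => 'example only',
);

# Recursively walk hashes and arrays and emit one [tag, value] pair per
# leaf. Nested keys are joined with '.' so the hierarchy survives in the
# tag name even though the result is a flat list.
sub flatten {
    my ($node, $prefix) = @_;
    if (ref $node eq 'HASH') {
        return map {
            flatten($node->{$_}, defined $prefix ? "$prefix.$_" : $_)
        } sort keys %$node;
    }
    if (ref $node eq 'ARRAY') {
        return map { flatten($_, $prefix) } @$node;
    }
    return [ $prefix, $node ];
}

# Print each flattened pair as "tag = value".
printf "%s = %s\n", @$_ for flatten(\%nested);
```

Each resulting pair could then be stored as an ordinary qualifier/value
row, which is the kind of data the current schema can hold.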

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Thu Apr 17 10:48:52 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 17 Apr 2008 10:48:52 -0400
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <4807534D.80105@bms.com>
References: <4807534D.80105@bms.com>
Message-ID: <9ECDEB39-95F3-4A94-9AF7-FFEBBDEFF0FA@gmx.net>


On Apr 17, 2008, at 9:40 AM, Stefan Kirov wrote:
> The other problem was:
> DBD::Oracle::st execute failed: ORA-02292: integrity constraint
> (BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
> OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
> ParamValues: :p1=9606] at


This sounds like you are running the tests against a non-empty database?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From stefan.kirov at bms.com  Thu Apr 17 11:28:42 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 11:28:42 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
Message-ID: <Pine.WNT.4.64.0804171052430.2732@A161887.one.ads.bms.com>

Hilmar,
I think I see what happens with this adaptor.
Bio::DB::BioSQL::DBAdaptor::_load_object_adaptor (called from
create_persistent) tries to load this module:
Bio/DB/BioSQL/RichSeqAdaptor.pm
There is no such module, so the load always fails, but since it is
wrapped in an eval, no error is actually raised. Perhaps this is a leftover?
This is what fooled me...
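
For readers unfamiliar with the pattern, here is a minimal sketch of how
a failed module load inside an eval stays silent unless the caller looks
at $@. It is not the bioperl-db code: try_load() and the module name
No::Such::Adaptor are made up for illustration, and File::Spec is used
only because it ships with Perl.

```perl
use strict;
use warnings;

# Try to load a module by name. Returns undef on success and the error
# text on failure, instead of dying. Without the explicit check of the
# eval result, a missing module would go completely unnoticed.
sub try_load {
    my ($class) = @_;
    (my $file = "$class.pm") =~ s{::}{/}g;
    my $ok = eval { require $file; 1 };
    return $ok ? undef : ($@ || 'unknown error');
}

# 'No::Such::Adaptor' is a hypothetical name that should not exist.
for my $class ('File::Spec', 'No::Such::Adaptor') {
    my $err = try_load($class);
    print defined $err ? "$class: failed to load\n" : "$class: loaded\n";
}
```

A debugger stepping through such code sees the require fail, even though
the program itself carries on without raising an error.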

I guess Chris could be right:
Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key is
being passed a Bio::Annotation::Collection as the value of $obj->obj().
Or is it recursing too far?
Anyway, I am just guessing here; I do not know the architecture of
bioperl-db.
Thanks again for the help.
Stefan



From cjfields at uiuc.edu  Thu Apr 17 12:26:41 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 17 Apr 2008 11:26:41 -0500
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
Message-ID: <AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>


On Apr 17, 2008, at 9:47 AM, Hilmar Lapp wrote:

>
> On Apr 17, 2008, at 10:18 AM, Stefan Kirov wrote:
>> In any case I debugged and tracked that down to the RichSeq adaptor  
>> module missing.
>
>
> That almost can't be the problem. Every Bio::Seq::RichSeq is-a  
> Bio::Seq and a SeqAdaptor is present.
>
> I'm afraid it gets stuck somewhere else and frankly I didn't see the  
> RichSeqAdaptor failing to load in your stack trace:
>
>>       ------------- EXCEPTION: Bio::Root::Exception -------------
>>
>>    MSG: Annotation of class Bio::Annotation::Collection not
>>    type-mapped. Internal error?
>>    STACK: Error::throw
>>    STACK: Bio::Root::Root::throw
>>    /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>>    STACK:
>>    Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>>    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>>    STACK:  
>> Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>>    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>>    STACK: Bio::DB::Persistent::PersistentObject::store
>>    Bio/DB/Persistent/PersistentObject.pm:271
>>    STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>>    Bio/DB/BioSQL/SeqAdaptor.pm:224
>>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>    STACK: Bio::DB::Persistent::PersistentObject::create
>>    Bio/DB/Persistent/PersistentObject.pm:244
>>    STACK: t/04swiss.t:36
>>    -----------------------------------------------------------
>
> What that tells me is that when bioperl-db tries to store the  
> annotation bundle of the (SwissProt) sequence, one of the  
> annotations that it encounters is of type  
> Bio::Annotation::Collection. At present bioperl-db doesn't know what  
> to do with it; i.e., bioperl-db can't yet handle hierarchical  
> annotation collections (collections within collections).
>
> I believe this is due to recent changes in how the GN line is parsed  
> in BioPerl - Chris does this ring the right bell? I thought though  
> you had built in a method would allow flattening out

This appears to be using an older bioperl-live checkout, one where
Heikki changed GN parsing to use a nested Annotation::Collection.  I
reverted that in a later svn commit, specifically because of
bioperl-db problems.  bioperl-live's swiss.pm now uses a new subclass
of Bio::Annotation::SimpleValue (Bio::Annotation::TagTree) that
represents nested values via Data::Stag's itext output (we can switch
to an alternative format if needed).

Here are the last few relevant revisions in bioperl-live's main trunk  
(mine is the latest):

------------------------------------------------------------------------
r14562 | cjfields | 2008-02-28 08:30:05 -0600 (Thu, 28 Feb 2008) | 1 line

bug 1825: updating swiss.pm/tests to try out TagTree (passes all  
tests).  Need to update Handler.t and related modules still...
------------------------------------------------------------------------
r14541 | heikki | 2008-02-25 00:10:48 -0600 (Mon, 25 Feb 2008) | 1 line

documentation for the GN line parsing and management
------------------------------------------------------------------------
r14538 | heikki | 2008-02-23 08:48:23 -0600 (Sat, 23 Feb 2008) | 1 line

GN (Gene Name) line parsing rewrite. Breaks backward compatibility.  
Can now deal with >1 gene per entry and four categories of names per  
gene. Parses old style syntax (...OR ... OR ... ) into one gene name  
and synonyms for each gene. Docs to follow.

....

I just updated all code from dev and reran the bioperl-db tests without
problems.  Maybe someone else could do the same to see what happens?

> It's worth noting that BioSQL itself can't really represent nested  
> annotation collections other than by using ontology terms and their  
> hierarchy, which at present I think isn't really appropriate, but I  
> have to think through the issue more. In other words, in BioSQL you  
> can't directly tie together a bunch of qualifier value pairs into a  
> "bag" and then nest this bag within another. The way to make this  
> work with the current schema is to flatten out the nesting.
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Might be worth looking into for a future BioSQL release, but we have a  
decent workaround in place for now, as long as it works cross-platform  
and cross-RDB.

chris


From stefan.kirov at bms.com  Thu Apr 17 12:40:14 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 12:40:14 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
	<AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
Message-ID: <Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>

Hilmar,
Sorry, I missed the part after the stack trace. In any case, this is
still a problem for me after updating bioperl-live.
I see this in a number of other tests:
t/04swiss.........ok 3/52Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 78.
t/04swiss.........dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 6-52
         Failed 47/52 tests, 9.62% okay
t/05seqfeature....ok 4/48Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 72.
t/05seqfeature....FAILED tests 9-48
         Failed 40/48 tests, 16.67% okay
t/06comment.......ok
t/07dblink........ok
t/08genbank.......ok
t/09fuzzy2........ok
t/10ensembl.......ok 1/15Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 1420.
t/10ensembl.......dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 3-15
         Failed 13/15 tests, 13.33% okay
t/11locuslink.....ok 4/110Can't locate object method "get_dbxrefs" via 
package "Bio::Annotation::OntologyTerm" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 1.
t/11locuslink.....dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 5-110
         Failed 106/110 tests, 3.64% okay
t/12ontology......ok 1/738Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::GOterm" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 98.
t/12ontology......dubious
         Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 5-738
         Failed 734/738 tests, 0.54% okay
t/13remove........ok 2/59Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 145.
t/13remove........FAILED tests 11-59
         Failed 49/59 tests, 16.95% okay
t/14query.........ok
t/15cluster.......ok 3/160Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 1.
t/15cluster.......dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 6-160
         Failed 155/160 tests, 3.12% okay
t/16obda..........ok
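
All of the failures above share one message: a class is loaded but lacks
the get_dbxrefs method. A quick way to check which file a class actually
came from, and whether it provides a method, might look like the sketch
below. probe_method() is a hypothetical helper, not part of BioPerl;
File::Spec is included only as a stand-in so the script prints something
on a machine without BioPerl.

```perl
use strict;
use warnings;

# Load a class by name and report where it was loaded from and whether
# it provides the given method. Returns (path, 1 or 0), or an empty list
# if the class cannot be loaded at all.
sub probe_method {
    my ($class, $method) = @_;
    (my $file = "$class.pm") =~ s{::}{/}g;
    return unless eval { require $file; 1 };
    return ($INC{$file}, $class->can($method) ? 1 : 0);
}

# Bio::Ontology::Term and get_dbxrefs are taken from the error messages;
# the File::Spec entry is only a stand-in for machines without BioPerl.
for my $probe ([ 'Bio::Ontology::Term', 'get_dbxrefs' ],
               [ 'File::Spec',          'catfile'     ]) {
    my ($class, $method) = @$probe;
    my ($path, $has) = probe_method($class, $method);
    if (!defined $path) {
        print "$class: not installed\n";
        next;
    }
    printf "%s loaded from %s, %s %s\n",
        $class, $path, ($has ? 'has' : 'lacks'), $method;
}
```

If the reported path is not the checkout under test, some other
installation is being picked up first.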



From cjfields at uiuc.edu  Thu Apr 17 13:06:39 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 17 Apr 2008 12:06:39 -0500
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
	<AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
	<Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>
Message-ID: <C7A53063-2126-40E2-8A79-BED49D7FE98A@uiuc.edu>

Stefan,

'get_dbxrefs' was introduced in bioperl-live a while back during the  
feature/annotation rollback detailed here:

http://www.bioperl.org/wiki/Feature_Annotation_rollback

I still think an old bioperl (and maybe bioperl-db) installation is
interfering and causing the problems; I had similar issues at one
point and had to find and remove the old installation.  It might be
worth (1) checking 'perldoc -l Bio::Root::Root', which gives the
location of the Bio::Root::Root that is found in the lib path, and
(2) using './Build install uninst=1' to remove any old
bioperl/bioperl-db installations.
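
As a complement to 'perldoc -l', which reports a single location, the
sketch below lists every copy of a module found along @INC, in search
order. module_copies() is a hypothetical helper written for this
example; it uses only core modules and makes no assumptions about where
BioPerl is installed.

```perl
use strict;
use warnings;
use File::Spec;

# Return every copy of a module file found under the given directories,
# in search order. The first hit is the one Perl would load. Entries
# that are undefined or references (such as @INC hooks) are skipped.
sub module_copies {
    my ($module, @dirs) = @_;
    (my $rel = "$module.pm") =~ s{::}{/}g;
    return grep { -f $_ }
           map  { File::Spec->catfile($_, $rel) }
           grep { defined($_) && !ref($_) } @dirs;
}

my @hits = module_copies('Bio::Root::Root', @INC);
if (@hits) {
    print "$_\n" for @hits;
}
else {
    print "Bio::Root::Root not found in \@INC\n";
}
```

More than one line of output would point to a second installation that
can shadow the intended one, depending on the order of PERL5LIB and
@INC.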

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From stefan.kirov at bms.com  Thu Apr 17 13:52:22 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 13:52:22 -0400
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <C7A53063-2126-40E2-8A79-BED49D7FE98A@uiuc.edu>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
	<AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
	<Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>
	<C7A53063-2126-40E2-8A79-BED49D7FE98A@uiuc.edu>
Message-ID: <48078E56.9000404@bms.com>

Chris Fields wrote:
> Stefan,
>
> 'get_dbxrefs' was introduced in bioperl-live a while back during the
> feature/annotation rollback detailed here:
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback
>
Chris was right!
> I still think this is an interfering old bioperl (and maybe
> bioperl-db) installation causing the problems; I had similar issues at
> one point and had to find and remove the old installation.  It might
> be worth (1) checking 'perldoc -l Bio::Root::Root',
This is the first thing I did, and it looked fine from the command line.
So I checked out a fresh copy (instead of updating) and set PERL5LIB to
the minimum necessary (Build changes @INC), which is
/home/kirovs/bioperl-db/bplive:/stf/sysdev/perl/newlib/perl/lib/5.8/ia64-linux-multi/
(/home/kirovs/bioperl-db/bplive is the fresh copy; the other directory
has Module::Build etc., but definitely no bioperl).
This fixed the problem. I still do not see where the old module came
from, but that was a really good guess.
Thanks,
Stefan
> which will give the location of the Bio::Root::Root in lib path being
> used, and (2) using './Build install uninst=1' to remove any old
> bioperl/bioperl-db installations.
Unfortunately this is not an option for me.
>
> chris
>
> On Apr 17, 2008, at 11:40 AM, Stefan Kirov wrote:
>
>> Hilmar,
>> sorry, I missed the part after the stack trace... In any case this is
>> still problem for me after I updated bioperl-live.
>> I see this with a number of other tests:
>> t/04swiss.........ok 3/52Can't locate object method "get_dbxrefs" via
>> package "Bio::Ontology::Term" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 78.
>> t/04swiss.........dubious
>>        Test returned status 2 (wstat 512, 0x200)
>> DIED. FAILED tests 6-52
>>        Failed 47/52 tests, 9.62% okay
>> t/05seqfeature....ok 4/48Can't locate object method "get_dbxrefs" via
>> package "Bio::Ontology::Term" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 72.
>> t/05seqfeature....FAILED tests 9-48
>>        Failed 40/48 tests, 16.67% okay
>> t/06comment.......ok
>> t/07dblink........ok
>> t/08genbank.......ok
>> t/09fuzzy2........ok
>> t/10ensembl.......ok 1/15Can't locate object method "get_dbxrefs" via
>> package "Bio::Ontology::Term" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 1420.
>> t/10ensembl.......dubious
>>        Test returned status 2 (wstat 512, 0x200)
>> DIED. FAILED tests 3-15
>>        Failed 13/15 tests, 13.33% okay
>> t/11locuslink.....ok 4/110Can't locate object method "get_dbxrefs"
>> via package "Bio::Annotation::OntologyTerm" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 1.
>> t/11locuslink.....dubious
>>        Test returned status 2 (wstat 512, 0x200)
>> DIED. FAILED tests 5-110
>>        Failed 106/110 tests, 3.64% okay
>> t/12ontology......ok 1/738Can't locate object method "get_dbxrefs"
>> via package "Bio::Ontology::GOterm" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 98.
>> t/12ontology......dubious
>>        Test returned status 255 (wstat 65280, 0xff00)
>> DIED. FAILED tests 5-738
>>        Failed 734/738 tests, 0.54% okay
>> t/13remove........ok 2/59Can't locate object method "get_dbxrefs" via
>> package "Bio::Ontology::Term" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 145.
>> t/13remove........FAILED tests 11-59
>>        Failed 49/59 tests, 16.95% okay
>> t/14query.........ok
>> t/15cluster.......ok 3/160Can't locate object method "get_dbxrefs"
>> via package "Bio::Ontology::Term" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 1.
>> t/15cluster.......dubious
>>        Test returned status 2 (wstat 512, 0x200)
>> DIED. FAILED tests 6-160
>>        Failed 155/160 tests, 3.12% okay
>> t/16obda..........ok
>>
>> On Thu, 17 Apr 2008, Chris Fields wrote:
>>
>>>
>>> On Apr 17, 2008, at 9:47 AM, Hilmar Lapp wrote:
>>>
>>>> On Apr 17, 2008, at 10:18 AM, Stefan Kirov wrote:
>>>>> In any case I debugged and tracked that down to the RichSeq
>>>>> adaptor module missing.
>>>> That almost can't be the problem. Every Bio::Seq::RichSeq is-a
>>>> Bio::Seq and a SeqAdaptor is present.
>>>> I'm afraid it gets stuck somewhere else and frankly I didn't see
>>>> the RichSeqAdaptor failing to load in your stack trace:
>>>>
>>>>>     ------------- EXCEPTION: Bio::Root::Exception -------------
>>>>>
>>>>>  MSG: Annotation of class Bio::Annotation::Collection not
>>>>>  type-mapped. Internal error?
>>>>>  STACK: Error::throw
>>>>>  STACK: Bio::Root::Root::throw
>>>>>  /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>>>>>  STACK:
>>>>>  Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>>>>>  Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>>>>>  STACK: Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>>>>>  Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>>>>>  STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>>>>  Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>>>>  STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>>>>>  Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>>>>>  STACK: Bio::DB::Persistent::PersistentObject::store
>>>>>  Bio/DB/Persistent/PersistentObject.pm:271
>>>>>  STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>>>>>  Bio/DB/BioSQL/SeqAdaptor.pm:224
>>>>>  STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>>>>  Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>>>>  STACK: Bio::DB::Persistent::PersistentObject::create
>>>>>  Bio/DB/Persistent/PersistentObject.pm:244
>>>>>  STACK: t/04swiss.t:36
>>>>>  -----------------------------------------------------------
>>>> What that tells me is that when bioperl-db tries to store the
>>>> annotation bundle of the (SwissProt) sequence, one of the
>>>> annotations that it encounters is of type
>>>> Bio::Annotation::Collection. At present bioperl-db doesn't know
>>>> what to do with it; i.e., bioperl-db can't yet handle hierarchical
>>>> annotation collections (collections within collections).
>>>> I believe this is due to recent changes in how the GN line is
>>>> parsed in BioPerl - Chris does this ring the right bell? I thought
>>>> though you had built in a method would allow flattening out
>>>
>>> This appears to be using an older bioperl-live checkout, one where
>>> Heikki changed GN parsing to use a nested Annotation::Collection.  I
>>> reverted that back in a later commit to svn specifically b/c of
>>> bioperl-db problems. bioperl-live's swiss.pm now uses a new subclass
>>> of Bio::Annotation::SimpleValue (Bio::Annotation::TagTree) that
>>> represents nested values via Data::Stag's itext output (we can
>>> change that to alternatives if needed).
>>>
>>> Here are the last few relevant revisions in bioperl-live's main
>>> trunk (mine is the latest):
>>>
>>> ------------------------------------------------------------------------
>>>
>>> r14562 | cjfields | 2008-02-28 08:30:05 -0600 (Thu, 28 Feb 2008) | 1
>>> line
>>>
>>> bug 1825: updating swiss.pm/tests to try out TagTree (passes all
>>> tests). Need to update Handler.t and related modules still...
>>> ------------------------------------------------------------------------
>>>
>>> r14541 | heikki | 2008-02-25 00:10:48 -0600 (Mon, 25 Feb 2008) | 1 line
>>>
>>> documentation for the GN line parsing and management
>>> ------------------------------------------------------------------------
>>>
>>> r14538 | heikki | 2008-02-23 08:48:23 -0600 (Sat, 23 Feb 2008) | 1 line
>>>
>>> GN (Gene Name) line parsing rewrite. Breaks backward compatibility.
>>> Can now deal with >1 gene per entry and four categories of names per
>>> gene. Parses old style syntax (...OR ... OR ... ) into one gene name
>>> and synonyms for each gene. Docs to follow.
>>>
>>> ....
>>>
>>> I just updated all code from dev and reran bioperl-db tests w/o
>>> problems. Maybe someone else could do the same to see what happens?
>>>
>>>> It's worth noting that BioSQL itself can't really represent nested
>>>> annotation collections other than by using ontology terms and their
>>>> hierarchy, which at present I think isn't really appropriate, but I
>>>> have to think through the issue more. In other words, in BioSQL you
>>>> can't directly tie together a bunch of qualifier value pairs into a
>>>> "bag" and then nest this bag within another. The way to make this
>>>> work with the current schema is to flatten out the nesting.
>>>>
>>>>     -hilmar
>>>> --===========================================================
>>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>>> ===========================================================
>>>
>>> Might be worth looking into for a future BioSQL release, but we have
>>> a decent workaround in place for now, as long as it works
>>> cross-platform and cross-RDB.
>>>
>>> chris
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


From hubert.gaynor at yahoo.com  Thu Apr 17 20:53:16 2008
From: hubert.gaynor at yahoo.com (Hubert Gaynor)
Date: Thu, 17 Apr 2008 17:53:16 -0700 (PDT)
Subject: [Bioperl-l] Can I use BLAST against a database like MySQL
Message-ID: <130971.67684.qm@web46007.mail.sp1.yahoo.com>

Hi Sean,

I got it. Thank you so much!

Hubert

----- Original Message ----
From: Sean Davis <sdavis2 at mail.nih.gov>
To: Hubert Gaynor <hubert.gaynor at yahoo.com>
Sent: Thursday, April 17, 2008 6:36:02 PM
Subject: Re: [Bioperl-l] Can I use BLAST against a database like MySQL

On Thu, Apr 17, 2008 at 2:19 AM, Hubert Gaynor <hubert.gaynor at yahoo.com> wrote:
> Hi,
>
>  As far as I know, before using BLAST to do the alignment the first thing should be done is typing formatdb to construct a database. But I was wondering whether it is possible to construct a database with MySQL which probably will grant the BLAST search a higher speed and make the database management much easier?
>

formatdb is used to make a representation that can be used efficiently
by blast.  That representation already makes blast faster.  MySQL
can't be used for such things.  As for speeding up blast, if you have a
multiprocessor machine, you can take advantage of it by increasing the
number of processors blast uses.  Also, while blast is a very
versatile program, it is not the only alignment program available.
Depending on your needs, you could look at other programs such as blat
or gmap that can be 2-3 orders of magnitude faster than blast.

Sean




From Russell.Smithies at agresearch.co.nz  Thu Apr 17 21:39:23 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Fri, 18 Apr 2008 13:39:23 +1200
Subject: [Bioperl-l] accessing params for custom glyphs?
In-Reply-To: <130971.67684.qm@web46007.mail.sp1.yahoo.com>
References: <130971.67684.qm@web46007.mail.sp1.yahoo.com>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06C75E14@imail.agresearch.co.nz>

This is probably more of a Perl OO problem I'm having, but can anyone
tell me how to access a parameter when I create a custom glyph?

I've created a panel in the usual way then I add a feature with
'my_glyph' and want to pass the value of -new_parameter to the glyph
drawing code.

    $panel->add_track( $feature,
                       -font          => gdSmallFont,
                       -glyph         => 'my_glyph',
                       -height        => 10,
                       -label         => 1,
                       -strand        => "forward",
                       -new_parameter => "test",
                     );


In my_glyph.pm, I have the usual draw_component sub:

sub draw_component {
  my $self = shift;
  my $gd = shift;
  my ($x1,$y1,$x2,$y2) = $self->bounds(@_);
  my $fg = $self->fgcolor;
  my $params = $self->??????????   <<--- how do I access the value of
"new_parameter" set in the panel drawing code?

  $gd->line($x1,$y1,$x2,$y2,$fg);
  $gd->line($x1,$y2,$x2,$y1,$fg);

}

Any ideas?

Thanx,

Russell	Smithies			
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From David.Messina at sbc.su.se  Fri Apr 18 05:31:59 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 18 Apr 2008 11:31:59 +0200
Subject: [Bioperl-l]  Finding seqs of given domain architecture
In-Reply-To: <628aabb70804170155n4e5dfd81r7020c3e9e11094ff@mail.gmail.com>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
	<B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
	<628aabb70804161112o6610ee1fkfb4b08e74730237d@mail.gmail.com>
	<1208420674.23342.15.camel@razor.sbc.su.se>
	<628aabb70804170155n4e5dfd81r7020c3e9e11094ff@mail.gmail.com>
Message-ID: <628aabb70804180231p2b9cef9dwd5441e85c31531fd@mail.gmail.com>

Jacob,

I talked about your question with a colleague of mine who has been working
in this area. Below is his reply.

[I'm reposting this *without* the attachment mentioned since the mailing
list wouldn't accept it otherwise. If anyone wants a copy of the code, just
email me.]

Dave

-------

> 3. Pfam has this capability, i.e. to show all domains with a given
> architecture, but it is difficult to get at the actual sequences or
> even a list of accession numbers.

First, this should be available right away in PfamAlyzer:

http://pfamalyzer.sbc.su.se/pfamalyzer/index.html

although you might need to upgrade your browser's Java plugin to Java 1.6
to get it to work.

If this does not work as suggested (an upgraded version is coming
eventually), have a look at the file:

ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/swisspfam.gz

which contains the Pfam architectures for all UniProt sequences. You can
parse that to get a file of <accession number>-<list of domains>
correspondences and just filter that to get the accession numbers.
(Please find attached a Perl script to do just that.)

Under UNIX, you can then just grep this for the domain IDs,

(like grep PF00008 domainArchitectureFile.txt | grep PF00456 >
resultFile.txt)

but I am sure there are solutions under other operating systems as well.
You could then write a script to parse out the corresponding sequences
from the UniProt fasta flatfile, if you wanted, or (again under UNIX) a
script to wget them off the web page.
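
For those not on UNIX, the same two-domain filter can be sketched in
Perl. This is only a minimal sketch under assumptions: it supposes one
line per sequence in the form <accession>-<comma-separated Pfam IDs>,
which is a guess at the layout of the parsed file, and the sub name
accessions_with_domains and the example accessions are made up for
illustration. It splits on the first hyphen only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the accessions whose domain list contains every wanted Pfam ID.
# Assumed (hypothetical) line format: "<accession>-<domain>,<domain>,..."
sub accessions_with_domains {
    my ($lines, @wanted) = @_;
    my @hits;
    for my $line (@$lines) {
        chomp(my $copy = $line);
        my ($acc, $domains) = split /-/, $copy, 2;
        next unless defined $domains;
        my %have = map { $_ => 1 } split /,/, $domains;
        # keep the accession only if no wanted domain is missing
        push @hits, $acc if !grep { !$have{$_} } @wanted;
    }
    return @hits;
}

# Made-up example records, not real Pfam data.
my @example = (
    "P00001-PF00008,PF00456\n",
    "P00002-PF00008\n",
    "P00003-PF00456,PF00008,PF00069\n",
);
print "$_\n" for accessions_with_domains(\@example, 'PF00008', 'PF00456');
```

Unlike the chained grep, this matches whole domain IDs rather than
substrings, so PF00008 would not also match a longer ID that merely
contains it.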

In case your sequences are not in UniProt, consider using HMMER and the
Pfam HMM files to assign domains to all sequences in your dataset. I
would then parse the HMMER output into the same format as the above, and
use the same approach following that.

Hope this helps,

Yours sincerely,

Kristoffer Forslund
krifo at sbc.su.se


From lincoln.stein at gmail.com  Fri Apr 18 15:16:19 2008
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Fri, 18 Apr 2008 15:16:19 -0400
Subject: [Bioperl-l] [Gmod-gbrowse] accessing params for custom glyphs?
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06C75E14@imail.agresearch.co.nz>
References: <130971.67684.qm@web46007.mail.sp1.yahoo.com>
	<D5DBA313349A4B458528BE63B387F36C06C75E14@imail.agresearch.co.nz>
Message-ID: <6dce9a0b0804181216q6564e580u8a805ae96c78df2e@mail.gmail.com>

Hi Russell,

It's very simple:

   my $params = $self->option('new_parameter');

Lincoln

On Thu, Apr 17, 2008 at 9:39 PM, Smithies, Russell <
Russell.Smithies at agresearch.co.nz> wrote:

> This is probably more of a Perl OO problem I'm having, but can anyone
> tell me how to access a parameter when I create a custom glyph?
>
> I've created a panel in the usual way then I add a feature with
> 'my_glyph' and want to pass the value of -new_parameter to the glyph
> drawing code.
>
>    $panel->add_track( $feature,
>                        -font => gdSmallFont,
>                        -glyph => 'my_glyph' ,
>                        -height => 10,
>                                -label  => 1,
>                                -strand => "forward",
>                                -new_parameter => "test",
>
>
> In my_glyph.pm, I have the usual draw_component sub:
>
> sub draw_component {
>  my $self = shift;
>  my $gd = shift;
>  my ($x1,$y1,$x2,$y2) = $self->bounds(@_);
>  my $fg = $self->fgcolor;
>  my $params = $self->??????????   <<--- how do I access the value of
> "new_parameter" set in the panel drawing code?
>
>  $gd->line($x1,$y1,$x2,$y2,$fg);
>  $gd->line($x1,$y2,$x2,$y1,$fg);
>
> }
>
> Any ideas?
>
> Thanx,
>
> Russell Smithies
> _______________________________________________
> Gmod-gbrowse mailing list
> Gmod-gbrowse at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From jason at bioperl.org  Fri Apr 18 22:35:10 2008
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 18 Apr 2008 19:35:10 -0700
Subject: [Bioperl-l] index::abstract on win and unix
In-Reply-To: <1208381947.16620.6.camel@kiss-laptop>
References: <1208366718.19084.15.camel@kiss-laptop>
	<D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
	<1208381947.16620.6.camel@kiss-laptop>
Message-ID: <A30B8E06-131C-445F-B692-92CAB845B13B@bioperl.org>

do you want the LOCUS or the ACCESSION?
Do you mean the result is the completely wrong record or just the  
wrong field?
accession number is available from the seq's accession_number() method.
-jason
On Apr 16, 2008, at 2:39 PM, Frédéric Romagné wrote:

> Well, if with input file you mean the database used, it's created
> with Bio::Index::GenBank from a ncbi FTP's genbank file.
>
> $id is an accession number read from a file but i chomp the line...
>
> I am trying to install the svn version of bioperl under windows to see
> if there is an improvement.
>
> On Thursday 17 April 2008 at 08:49 +1200, Smithies, Russell wrote:
>> Did you check the format of your input file?
>> i.e. DOS or UNIX line endings?
>>
>>> -----Original Message-----
>>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- 
>>> bounces at lists.open-
>>> bio.org] On Behalf Of Frédéric Romagné
>>> Sent: Thursday, 17 April 2008 5:25 a.m.
>>> To: bioperl-l at lists.open-bio.org
>>> Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
>>>
>>> Hello,
>>> i made a program which use Bio::Index::GenBank and i tested it under
>>> unix, that worked well.
>>>
>>> But i have to launch it under windows and it seems not to work on.
>>>
>>> Here is the problem :
>>>
>>> my $dbobj = Bio::Index::Abstract->new("Data/$db");
>>> my $seq = $dbobj->get_Seq_by_acc($id);
>>> print $seq->display_id."\n";
>>>
>>> did not print the same number than $id !!! So i don't work on the
>>> sequence expected...
>>>
>>> I use the SVN sources on unix and the Perl package manager for
>>> windows...
>>>
>>> Thanks.
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From bioperlanand at yahoo.com  Mon Apr 21 03:44:00 2008
From: bioperlanand at yahoo.com (Anand Venkatraman)
Date: Mon, 21 Apr 2008 00:44:00 -0700 (PDT)
Subject: [Bioperl-l] a question on obtaining HTML formatted Blast output
	along with the Blast hits image
Message-ID: <372845.37134.qm@web36808.mail.mud.yahoo.com>


 Hi everybody,

I would like to obtain an HTML-formatted blast report along with a picture of the blast hits, as shown on Slide 60 in this pdf: http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf

I have gotten the HTML output working using "Bio::SearchIO::Writer::HTMLResultWriter".

My question: how do I integrate it with Bio::Graphics to render the blast hits image at the correct position in my Bioperl-reformatted HTML file?

I ultimately want to be able to display my blast output files in a browser.

Here is my code so far:
----------------------------------------------------------------
#!/usr/bin/perl -w
# usage: $0 <blast_report>
use strict;
use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;

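# $! is not set here, so die with a usage message instead
my $infile = shift or die "usage: $0 <blast_report>\n";

# read the BLAST report
my $searchio   = Bio::SearchIO->new( -format => 'blast', -file => $infile );
# write it back out through the HTML writer
my $writerhtml = Bio::SearchIO::Writer::HTMLResultWriter->new();
my $outhtml    = Bio::SearchIO->new( -writer => $writerhtml,
                                     -file   => ">${infile}.html" );

$outhtml->write_result($searchio->next_result);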
----------------------------------------------------------------

Thanks in advance,

Anand




From cjfields at uiuc.edu  Mon Apr 21 11:07:17 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 21 Apr 2008 10:07:17 -0500
Subject: [Bioperl-l] [Proposed change] HSP::frame()
Message-ID: <ACE26E05-7C02-46E3-B973-E0529C0A0DEA@uiuc.edu>

I have noticed (in relation to bug 2485,
http://bugzilla.open-bio.org/show_bug.cgi?id=2485) that the
Bio::Search::HSP::GenericHSP frame() method is implemented very
differently from strand(), start(), end(), and most other HSP methods.
The current behavior is to return the query frame if one value is
passed; if no value is passed, it returns the subject frame under
scalar context and an array of two values (query and hit frame) under
list context.  The latter behavior is unfortunately leading to the
aforementioned bug.  The method is also implied to be a getter/setter,
but the implementation doesn't allow that; it always sets to the
instantiated values (in fact, repeatedly so).

In order to fix that and make the interface more consistent I am  
changing frame() to behave like strand(), etc., in that the first  
argument is 'query/subject/hit/list' (default = 'query' if no arg  
specified) and the rest optional values for setting, in query/subject  
order.
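
To make the proposed calling convention concrete, here is a toy
sketch. It is not the GenericHSP code: ToyHSP, its constructor keys
and its internals are invented for illustration, and treating the
setter values as (query, hit) is my reading of "in query/subject
order".

```perl
package ToyHSP;    # hypothetical stand-in, not a BioPerl class
use strict;
use warnings;

sub new {
    my ($class, %args) = @_;
    return bless { query_frame => $args{query_frame},
                   hit_frame   => $args{hit_frame} }, $class;
}

# frame($which, @values): $which is 'query' (default), 'hit',
# 'subject' (alias for 'hit') or 'list'; optional values set the
# query frame and then the hit frame before the lookup.
sub frame {
    my ($self, $which, @values) = @_;
    $which = defined $which ? lc $which : 'query';
    $which = 'hit' if $which eq 'subject';
    die "unknown frame type '$which'\n"
        unless $which =~ /^(?:query|hit|list)$/;
    $self->{query_frame} = $values[0] if defined $values[0];
    $self->{hit_frame}   = $values[1] if defined $values[1];
    return ($self->{query_frame}, $self->{hit_frame}) if $which eq 'list';
    return $which eq 'hit' ? $self->{hit_frame} : $self->{query_frame};
}

package main;

my $hsp = ToyHSP->new(query_frame => 1, hit_frame => 0);
print "default (query): ", $hsp->frame, "\n";
print "hit:             ", $hsp->frame('hit'), "\n";
print "list:            ", join(',', $hsp->frame('list')), "\n";
```

The point of the sketch is that the return value depends only on the
first argument, not on the calling context, which is the consistency
argument made above.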

One issue: I can catch and imitate most of the older behavior with a  
few additional checks, the one exception being the old frame() default  
return value which is now 'query' (not context-dependent).  If needed  
we can change the default to 'hit', but I believe method consistency  
is probably the better route, and I can always add a warning under old  
API circumstances indicating the change.

I am also modifying HSPTableWriter to print frame_hit and frame_query  
(previously it was only printing 'frame', which implied the hit  
frame).  I can see this being an issue with anyone expecting 'frame'  
instead of 'frame_hit';  I could hack in a fix for that if needed.

If there aren't any objections or suggestions, I'll commit this in the  
next day or two.

chris


From cjfields at uiuc.edu  Mon Apr 21 11:32:59 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 21 Apr 2008 10:32:59 -0500
Subject: [Bioperl-l] Assembly.t test fails
Message-ID: <ABC6AB22-0AFD-4977-97DD-E2AE507E0330@uiuc.edu>

I'm getting some significant test failures in bioperl-live for  
Bio::Assembly:

t/Assembly......
1..35
ok 1 - use Bio::Assembly::IO;
ok 2 - The object isa Bio::Assembly::IO
ok 3 - The object isa Bio::Assembly::Scaffold
ok 4
not ok 5
ok 6 - The object isa Bio::AnnotationCollectionI
ok 7 - no annotations in Annotation collection?
ok 8

#   Failed test at t/Assembly.t line 35.
#          got: 'NoName'
#     expected: 'test'
Can't locate object method "get_contig_seq_ids" via package  
"Bio::Assembly::Contig" at /Users/cjfields/bioperl/bioperl-live/blib/ 
lib/Bio/Assembly/Scaffold.pm line 189, <GEN0> line 733.
# Looks like you planned 35 tests but only ran 8.
# Looks like you failed 1 test of 8 run.
# Looks like your test died just after 8.
  Dubious, test returned 255 (wstat 65280, 0xff00)
  Failed 28/35 subtests

Test Summary Report
-------------------
t/Assembly.t (Wstat: 65280 Tests: 8 Failed: 1)
   Failed test:  5
   Non-zero exit status: 255
   Parse errors: Bad plan.  You planned 35 tests but ran 8.
Files=1, Tests=8,  0 wallclock secs ( 0.01 usr  0.00 sys +  0.22 cusr   
0.04 csys =  0.27 CPU)
Result: FAIL
Failed 1/1 test programs. 1/8 subtests failed.


chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Apr 21 11:44:21 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 21 Apr 2008 10:44:21 -0500
Subject: [Bioperl-l] Assembly.t test fails
In-Reply-To: <ABC6AB22-0AFD-4977-97DD-E2AE507E0330@uiuc.edu>
References: <ABC6AB22-0AFD-4977-97DD-E2AE507E0330@uiuc.edu>
Message-ID: <2F199628-717E-4F88-85D7-408BD7BBE16D@uiuc.edu>

Scratch that, figured it out (easy fix).

chris

On Apr 21, 2008, at 10:32 AM, Chris Fields wrote:

> I'm getting some significant test failures in bioperl-live for  
> Bio::Assembly:
>
> t/Assembly......
> 1..35
> ok 1 - use Bio::Assembly::IO;
> ok 2 - The object isa Bio::Assembly::IO
> ok 3 - The object isa Bio::Assembly::Scaffold
> ok 4
> not ok 5
> ok 6 - The object isa Bio::AnnotationCollectionI
> ok 7 - no annotations in Annotation collection?
> ok 8
>
> #   Failed test at t/Assembly.t line 35.
> #          got: 'NoName'
> #     expected: 'test'
> Can't locate object method "get_contig_seq_ids" via package  
> "Bio::Assembly::Contig" at /Users/cjfields/bioperl/bioperl-live/blib/ 
> lib/Bio/Assembly/Scaffold.pm line 189, <GEN0> line 733.
> # Looks like you planned 35 tests but only ran 8.
> # Looks like you failed 1 test of 8 run.
> # Looks like your test died just after 8.
> Dubious, test returned 255 (wstat 65280, 0xff00)
> Failed 28/35 subtests
>
> Test Summary Report
> -------------------
> t/Assembly.t (Wstat: 65280 Tests: 8 Failed: 1)
>  Failed test:  5
>  Non-zero exit status: 255
>  Parse errors: Bad plan.  You planned 35 tests but ran 8.
> Files=1, Tests=8,  0 wallclock secs ( 0.01 usr  0.00 sys +  0.22  
> cusr  0.04 csys =  0.27 CPU)
> Result: FAIL
> Failed 1/1 test programs. 1/8 subtests failed.
>
>
> chris
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From frederic.romagne at gmail.com  Mon Apr 21 11:53:11 2008
From: frederic.romagne at gmail.com (Frédéric Romagné)
Date: Mon, 21 Apr 2008 10:53:11 -0500
Subject: [Bioperl-l] index::abstract on win and unix
In-Reply-To: <A30B8E06-131C-445F-B692-92CAB845B13B@bioperl.org>
References: <1208366718.19084.15.camel@kiss-laptop>
	<D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
	<1208381947.16620.6.camel@kiss-laptop>
	<A30B8E06-131C-445F-B692-92CAB845B13B@bioperl.org>
Message-ID: <1208793191.25906.9.camel@kiss-laptop>

In fact, I want the whole Bio::Seq object, but I verified that the
ACCESSION and the LOCUS are the same in my genbank files.
I saw that the program sometimes reports that it cannot find the entry:

    if( !defined $seq ) {
        warn("Sequence $id in Database $db is not present\n");
    }

I suspect that it is the make_index function that does not work properly
on windows, rather than the get_Seq_by_acc function...
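
One way to settle the DOS/UNIX line-ending question raised earlier in
the thread is to count the line endings in the genbank file directly.
The sketch below uses only core Perl; the sub name count_line_endings
is made up for illustration. A flat-file index generally records byte
positions, so a file whose line endings were converted after indexing
(or are read differently on the two systems) could plausibly return
the wrong record, but that link is my assumption, not something
confirmed here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count CRLF and bare-LF line endings in a file. binmode keeps
# Windows Perl from silently translating CRLF to LF while reading.
sub count_line_endings {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    local $/;                      # slurp the whole file
    my $data = <$fh>;
    close $fh;
    $data = '' unless defined $data;
    my $crlf = () = $data =~ /\r\n/g;        # DOS-style endings
    my $lf   = () = $data =~ /(?<!\r)\n/g;   # UNIX-style endings
    return (crlf => $crlf, lf => $lf);
}

# Usage: perl check_endings.pl Data/some_file.gbk
if (@ARGV) {
    my %n = count_line_endings($ARGV[0]);
    print "CRLF: $n{crlf}, bare LF: $n{lf}\n";
}
```

If the counts differ between the unix and windows copies of the same
file, rebuilding the index on the machine where it will be used would
be the first thing to try.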

On Friday 18 April 2008 at 19:35 -0700, Jason Stajich wrote:
> do you want the LOCUS or the ACCESSION?
> Do you mean the result is the completely wrong record or just the  
> wrong field?
> accession number is available from the seq's accession_number() method.
> -jason
> On Apr 16, 2008, at 2:39 PM, Frédéric Romagné wrote:
> 
> > Well, if with input file you mean the database used, it's created
> > with Bio::Index::GenBank from a ncbi FTP's genbank file.
> >
> > $id is an accession number read from a file but i chomp the line...
> >
> > I am trying to install the svn version of bioperl under windows to see
> > if there is an improvement.
> >
> > On Thursday 17 April 2008 at 08:49 +1200, Smithies, Russell wrote:
> >> Did you check the format of your input file?
> >> i.e. DOS or UNIX line endings?
> >>
> >>> -----Original Message-----
> >>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- 
> >>> bounces at lists.open-
> >>> bio.org] On Behalf Of Frédéric Romagné
> >>> Sent: Thursday, 17 April 2008 5:25 a.m.
> >>> To: bioperl-l at lists.open-bio.org
> >>> Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
> >>>
> >>> Hello,
> >>> i made a program which use Bio::Index::GenBank and i tested it under
> >>> unix, that worked well.
> >>>
> >>> But i have to launch it under windows and it seems not to work on.
> >>>
> >>> Here is the problem :
> >>>
> >>> my $dbobj = Bio::Index::Abstract->new("Data/$db");
> >>> my $seq = $dbobj->get_Seq_by_acc($id);
> >>> print $seq->display_id."\n";
> >>>
> >>> did not print the same number than $id !!! So i don't work on the
> >>> sequence expected...
> >>>
> >>> I use the SVN sources on unix and the Perl package manager for
> >>> windows...
> >>>
> >>> Thanks.
> >>>
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From ewijaya at gmail.com  Tue Apr 22 10:03:07 2008
From: ewijaya at gmail.com (Edward Wijaya)
Date: Tue, 22 Apr 2008 22:03:07 +0800
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
Message-ID: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>

Hi,

Is there any module that can parse the following BLAT output?
It is taken from the UCSC browser.

The idea is to parse it and then extract the conserved block
of aligned sequences.


__DATA__
Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
B D   D. melanogaster
tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
B D       D. simulans
tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
B D      D. sechellia
tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
B D         D. yakuba
tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
            D. erecta
tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
         D. ananassae
taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
     D. pseudoobscura
tata----ccagtacac-cttatatg------------tttttaaata--------------------
B D     D. persimilis
tata----ccagtacac-attatatg------------tttttaaata--------------------
        D. willistoni
aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
           D. virilis
-------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
        D. mojavensis
-------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
         D. grimshawi
====================================================================
         T. castaneum
====================================================================

Inserts between block 3 and 4 in window
    D. pseudoobscura 2008bp
B D    D. persimilis 1421bp
          D. virilis 5bp
       D. mojavensis 4640bp

Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
B D   D. melanogaster
----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
B D       D. simulans
----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
B D      D. sechellia
----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
B D         D. yakuba
----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
            D. erecta
----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
     D. pseudoobscura
====================================================================
B D     D. persimilis
====================================================================
        D. willistoni
----aggattacgaagttcctttat-------------------aaag--------------------
           D. virilis
gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
        D. mojavensis
====================================================================
         D. grimshawi
====================================================================
         T. castaneum
====================================================================

__END__


From cjfields at uiuc.edu  Tue Apr 22 10:22:45 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 22 Apr 2008 09:22:45 -0500
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
In-Reply-To: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
References: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
Message-ID: <766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>

A quick grep of bioperl-live gets me Bio::SearchIO::blast,  
Bio::SearchIO::axt, Bio::SearchIO::psl, Bio::Tools::Blat, and  
Bio::Tools::WebBlat.  Haven't looked at the docs but it's a start!

chris

On Apr 22, 2008, at 9:03 AM, Edward Wijaya wrote:

> Hi,
>
> Is there any module that can parse the following output
> of BLAT. This is taken from UCSC browser.
>
> The idea is to parse it and then extract the conserved block
> of aligned sequences.
>
>
> __DATA__
> Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
> B D   D. melanogaster
> tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
> B D       D. simulans
> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
> B D      D. sechellia
> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
> B D         D. yakuba
> tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
>            D. erecta
> tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
>         D. ananassae
> taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
>     D. pseudoobscura
> tata----ccagtacac-cttatatg------------tttttaaata--------------------
> B D     D. persimilis
> tata----ccagtacac-attatatg------------tttttaaata--------------------
>        D. willistoni
> aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
>           D. virilis
> -------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
>        D. mojavensis
> -------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
>         D. grimshawi
> ====================================================================
>         T. castaneum
> ====================================================================
>
> Inserts between block 3 and 4 in window
>    D. pseudoobscura 2008bp
> B D    D. persimilis 1421bp
>          D. virilis 5bp
>       D. mojavensis 4640bp
>
> Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
> B D   D. melanogaster
> ----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
> B D       D. simulans
> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
> B D      D. sechellia
> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
> B D         D. yakuba
> ----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
>            D. erecta
> ----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
>     D. pseudoobscura
> ====================================================================
> B D     D. persimilis
> ====================================================================
>        D. willistoni
> ----aggattacgaagttcctttat-------------------aaag--------------------
>           D. virilis
> gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
>        D. mojavensis
> ====================================================================
>         D. grimshawi
> ====================================================================
>         T. castaneum
> ====================================================================
>
> __ END__

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Tue Apr 22 14:49:32 2008
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Apr 2008 11:49:32 -0700
Subject: [Bioperl-l] Fwd: [blast-announce] New BLAST URL available at the
	NCBI
References: <EEEED756EF6626469B10653F745014389BAEAD@NIHCESMLBX15.nih.gov>
Message-ID: <F63EB743-F1FF-4612-B7D6-0EA1F73F487C@bioperl.org>

Does anyone want to take a look at how to use these URLs in the  
RemoteBlast module, if the interface is the same?

-jason

Begin forwarded message:

> From: "Mcginnis, Scott (NIH/NLM/NCBI) [E]" <mcginnis at ncbi.nlm.nih.gov>
> Date: April 22, 2008 11:35:04 AM PDT
> To: <blast-announce at ncbi.nlm.nih.gov>
> Subject: [blast-announce] New BLAST URL available at the NCBI
>
> New BLAST URL available at the NCBI
>
> The NCBI has activated a new URL for BLAST searches at the NCBI:
> http://blast.ncbi.nlm.nih.gov.
>
> Searches sent to this URL can take advantage of a larger number of
> machines for searches and the system has a better overall fault
> tolerance.
>
> We recommend migration of all BLAST links and bookmarks (e.g.,
> http://www.ncbi.nlm.nih.gov/BLAST/ and
> http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) to the new URL.
>
> Links on the NCBI and BLAST home pages will start to change in the
> coming weeks.
>
> At this point in time the plans are to also maintain the current BLAST
> URL.


From jason at bioperl.org  Tue Apr 22 14:51:08 2008
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Apr 2008 11:51:08 -0700
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
In-Reply-To: <766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>
References: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
	<766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>
Message-ID: <6C812413-B375-427B-9AF8-5A0AA6167CC8@bioperl.org>

If you get it as axt, it should parse fine in SearchIO, but that is
pairwise. If you can get alignment blocks, I can't remember what
format this is from UCSC.
MSAs are going to be better handled through Bio::AlignIO, though, so it
might be better to build a parser on that.
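
As a starting point for such a parser, here is a rough sketch of the
text-parsing step in plain Perl, before any of it is wrapped in an
AlignIO driver. It is not an existing BioPerl module: the function name
parse_alignment_blocks and the hash layout are made up for illustration.
It assumes the layout of the sample posted in this thread: a header
line, then alternating species-label and sequence lines. Rows made only
of '=' are skipped because they carry no aligned bases.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse UCSC-browser-style "Alignment block" text into a list of blocks.
# Each block is a hash ref:
#   { index, start, end, length, rows => [ [species, sequence], ... ] }
# (hypothetical layout, chosen only for this sketch)
sub parse_alignment_blocks {
    my ($fh) = @_;
    my (@blocks, $current, $species);
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r$//;
        if ($line =~ /^Alignment block (\d+) of \d+ in window, (\d+) - (\d+), (\d+) bps/) {
            # Header line: start a new block.
            $current = { index => $1, start => $2, end => $3, length => $4, rows => [] };
            push @blocks, $current;
            undef $species;
        }
        elsif ($line =~ /^Inserts between block/) {
            # Insert summaries are not alignment rows; ignore until the next header.
            undef $current;
            undef $species;
        }
        elsif (!$current || $line !~ /\S/) {
            next;    # outside a block, or a blank line
        }
        elsif (defined $species) {
            # The line after a species label is its aligned sequence.
            push @{ $current->{rows} }, [ $species, $line ] unless $line =~ /^=+$/;
            undef $species;
        }
        elsif ($line =~ /^\s*(?:B\s+D\s+)?([A-Z]\.\s+\w+)\s*$/) {
            # Species label, optionally preceded by the "B D" link letters.
            $species = $1;
        }
    }
    return @blocks;
}

# A shortened copy of the sample from the original post.
my $sample = <<'END_SAMPLE';
Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
B D   D. melanogaster
tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
B D       D. simulans
tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
         D. grimshawi
====================================================================

Inserts between block 3 and 4 in window
    D. pseudoobscura 2008bp
B D    D. persimilis 1421bp

Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
B D   D. melanogaster
----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
            D. erecta
----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
END_SAMPLE

open my $fh, '<', \$sample or die "Cannot read sample: $!";
for my $block (parse_alignment_blocks($fh)) {
    printf "block %d (%d-%d): %d aligned rows\n",
        $block->{index}, $block->{start}, $block->{end}, scalar @{ $block->{rows} };
}
close $fh;
```

Each rows entry could then be turned into a sequence object and added
to an alignment, which is where Bio::AlignIO would come in.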

On Apr 22, 2008, at 7:22 AM, Chris Fields wrote:

> A quick grep of bioperl-live gets me Bio::SearchIO::blast,  
> Bio::SearchIO::axt, Bio::SearchIO::psl, Bio::Tools::Blat, and  
> Bio::Tools::WebBlat.  Haven't looked at the docs but it's a start!
>
> chris
>
> On Apr 22, 2008, at 9:03 AM, Edward Wijaya wrote:
>
>> Hi,
>>
>> Is there any module that can parse the following output
>> of BLAT. This is taken from UCSC browser.
>>
>> The idea is to parse it and then extract the conserved block
>> of aligned sequences.
>>
>>
>> __DATA__
>> Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
>> B D   D. melanogaster
>> tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
>> B D       D. simulans
>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
>> B D      D. sechellia
>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
>> B D         D. yakuba
>> tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
>>            D. erecta
>> tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
>>         D. ananassae
>> taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
>>     D. pseudoobscura
>> tata----ccagtacac-cttatatg------------tttttaaata--------------------
>> B D     D. persimilis
>> tata----ccagtacac-attatatg------------tttttaaata--------------------
>>        D. willistoni
>> aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
>>           D. virilis
>> -------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
>>        D. mojavensis
>> -------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
>>         D. grimshawi
>> ====================================================================
>>         T. castaneum
>> ====================================================================
>>
>> Inserts between block 3 and 4 in window
>>    D. pseudoobscura 2008bp
>> B D    D. persimilis 1421bp
>>          D. virilis 5bp
>>       D. mojavensis 4640bp
>>
>> Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
>> B D   D. melanogaster
>> ----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
>> B D       D. simulans
>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>> B D      D. sechellia
>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>> B D         D. yakuba
>> ----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
>>            D. erecta
>> ----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
>>     D. pseudoobscura
>> ====================================================================
>> B D     D. persimilis
>> ====================================================================
>>        D. willistoni
>> ----aggattacgaagttcctttat-------------------aaag--------------------
>>           D. virilis
>> gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
>>        D. mojavensis
>> ====================================================================
>>         D. grimshawi
>> ====================================================================
>>         T. castaneum
>> ====================================================================
>>
>> __ END__


From cjfields at uiuc.edu  Tue Apr 22 15:02:14 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 22 Apr 2008 14:02:14 -0500
Subject: [Bioperl-l] Fwd: [blast-announce] New BLAST URL available at
	the NCBI
In-Reply-To: <F63EB743-F1FF-4612-B7D6-0EA1F73F487C@bioperl.org>
References: <EEEED756EF6626469B10653F745014389BAEAD@NIHCESMLBX15.nih.gov>
	<F63EB743-F1FF-4612-B7D6-0EA1F73F487C@bioperl.org>
Message-ID: <13C2AD96-8297-40DD-ADCC-B2BEC923B9E0@uiuc.edu>

They work exactly the same as the old URL, at least on the surface; I
haven't tried changing many URLAPI parameters.  I went ahead and
changed the URL in RemoteBlast to
http://blast.ncbi.nlm.nih.gov/Blast.cgi
as it works with RemoteBlast.t.

chris

On Apr 22, 2008, at 1:49 PM, Jason Stajich wrote:

> Does anyone want to take a look at how to use these URLs in the  
> RemoteBlast module, if the interface is the same?
>
> -jason
>
> Begin forwarded message:
>
>> From: "Mcginnis, Scott (NIH/NLM/NCBI) [E]"  
>> <mcginnis at ncbi.nlm.nih.gov>
>> Date: April 22, 2008 11:35:04 AM PDT
>> To: <blast-announce at ncbi.nlm.nih.gov>
>> Subject: [blast-announce] New BLAST URL available at the NCBI
>>
>> New BLAST URL available at the NCBI
>>
>> The NCBI has activated a new URL for BLAST searches at the NCBI:
>> http://blast.ncbi.nlm.nih.gov.
>>
>> Searches sent to this URL can take advantage of a larger number of
>> machines for searches and the system has a better overall fault
>> tolerance.
>>
>> We recommend migration of all BLAST links and bookmarks (e.g.,
>> http://www.ncbi.nlm.nih.gov/BLAST/ and
>> http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) to the new URL.
>>
>> Links on the NCBI and BLAST home pages will start to change in the
>> coming weeks.
>>
>> At this point in time the plans are to also maintain the current
>> BLAST URL.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Apr 22 14:58:40 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 22 Apr 2008 13:58:40 -0500
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
In-Reply-To: <6C812413-B375-427B-9AF8-5A0AA6167CC8@bioperl.org>
References: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
	<766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>
	<6C812413-B375-427B-9AF8-5A0AA6167CC8@bioperl.org>
Message-ID: <43344C89-6B4D-4360-AF56-A6FDD065FFF3@uiuc.edu>

Related to that, I have thought about building a parser for some of  
the query-anchored alignments produced by blastall, just haven't had  
time to devote to it.  One of these days...

chris

On Apr 22, 2008, at 1:51 PM, Jason Stajich wrote:

> if you get it as axt it should parse fine in SearchIO but that is  
> pairwise, if you can get an alignment blocks I can't remember what  
> format this is from UCSC.
> MSAs are going to be better handed through Bio::AlignIO though so it  
> might be better to build a parser on that.
>
> On Apr 22, 2008, at 7:22 AM, Chris Fields wrote:
>
>> A quick grep of bioperl-live gets me Bio::SearchIO::blast,  
>> Bio::SearchIO::axt, Bio::SearchIO::psl, Bio::Tools::Blat, and  
>> Bio::Tools::WebBlat.  Haven't looked at the docs but it's a start!
>>
>> chris
>>
>> On Apr 22, 2008, at 9:03 AM, Edward Wijaya wrote:
>>
>>> Hi,
>>>
>>> Is there any module that can parse the following output
>>> of BLAT. This is taken from UCSC browser.
>>>
>>> The idea is to parse it and then extract the conserved block
>>> of aligned sequences.
>>>
>>>
>>> __DATA__
>>> Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
>>> B D   D. melanogaster
>>> tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
>>> B D       D. simulans
>>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
>>> B D      D. sechellia
>>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
>>> B D         D. yakuba
>>> tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
>>>           D. erecta
>>> tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
>>>        D. ananassae
>>> taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
>>>    D. pseudoobscura
>>> tata----ccagtacac-cttatatg------------tttttaaata--------------------
>>> B D     D. persimilis
>>> tata----ccagtacac-attatatg------------tttttaaata--------------------
>>>       D. willistoni
>>> aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
>>>          D. virilis
>>> -------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
>>>       D. mojavensis
>>> -------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
>>>        D. grimshawi
>>> ====================================================================
>>>        T. castaneum
>>> ====================================================================
>>>
>>> Inserts between block 3 and 4 in window
>>>   D. pseudoobscura 2008bp
>>> B D    D. persimilis 1421bp
>>>         D. virilis 5bp
>>>      D. mojavensis 4640bp
>>>
>>> Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
>>> B D   D. melanogaster
>>> ----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
>>> B D       D. simulans
>>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>>> B D      D. sechellia
>>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>>> B D         D. yakuba
>>> ----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
>>>           D. erecta
>>> ----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
>>>    D. pseudoobscura
>>> ====================================================================
>>> B D     D. persimilis
>>> ====================================================================
>>>       D. willistoni
>>> ----aggattacgaagttcctttat-------------------aaag--------------------
>>>          D. virilis
>>> gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
>>>       D. mojavensis
>>> ====================================================================
>>>        D. grimshawi
>>> ====================================================================
>>>        T. castaneum
>>> ====================================================================
>>>
>>> __ END__

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bioperlanand at yahoo.com  Wed Apr 23 02:02:30 2008
From: bioperlanand at yahoo.com (Anand Venkatraman)
Date: Tue, 22 Apr 2008 23:02:30 -0700 (PDT)
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
Message-ID: <946658.12337.qm@web36802.mail.mud.yahoo.com>

Hi everybody,

I would like to use Bio::Graphics in conjunction with Bio::SearchIO::Writer::HTMLResultWriter to obtain a HTML formatted blast report output along with an image of the blast hits as shown on Slide 60 in this pdf: http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf

I am able to get the HTML output using  "Bio::SearchIO::Writer::HTMLResultWriter" and I am able to get the image using the examples outlined in the Bio::Graphics HOWTO: http://www.bioperl.org/wiki/HOWTO:Graphics

My question: How do I integrate Bio::Graphics with Bio::SearchIO::Writer::HTMLResultWriter to render the blast hits image at the correct position in my BioPerl-reformatted HTML file?

I also found that someone else asked something similar; it is listed under the "Orphans, Leftovers" category in the ListSummary:April 26-May 9,2006 document:
http://www.bioperl.org/wiki/ListSummary:April_26-May_9%2C2006#Orphans.2C_Leftovers

Here is my code so far:
----------------------------------------------------------------
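#!/usr/bin/perl
# usage: $0 <blast_report>
use strict;
use warnings;
use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;

# $! is not set when no argument is given, so print a usage message instead.
my $infile = shift or die "usage: $0 <blast_report>\n";

# Read the BLAST report and write it back out through the HTML writer.
my $searchio   = Bio::SearchIO->new(-format => 'blast', -file => $infile);
my $writerhtml = Bio::SearchIO::Writer::HTMLResultWriter->new();
my $outhtml    = Bio::SearchIO->new(-writer => $writerhtml,
                                    -file   => ">${infile}.html");

my $result = $searchio->next_result
    or die "No results found in $infile\n";
$outhtml->write_result($result);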
----------------------------------------------------------------

Thanks in advance,

Anand

       


From jason at bioperl.org  Wed Apr 23 02:15:28 2008
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Apr 2008 23:15:28 -0700
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
In-Reply-To: <946658.12337.qm@web36802.mail.mud.yahoo.com>
References: <946658.12337.qm@web36802.mail.mud.yahoo.com>
Message-ID: <952B0A4E-8A14-4E8E-B36D-14596B20E330@bioperl.org>


Basically you want to inject your own IMG tags into the file with  
these routines:

     $writerhtml->start_report(\&my_start_report);
     $writerhtml->title(\&my_title);
     $writerhtml->hit_link_align(\&my_hit_link_align);
     $writerhtml->hit_link_desc(\&my_hit_link_desc);

fgblast shows a way to do this in part. It relies on GBrowse to
generate the image, but you can replace the gbrowse_img reference with
your own image-generating software.

http://people.genome.duke.edu/~jes12/software/scripts/fgblast
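
As a rough illustration of the callback idea (this is not taken from
fgblast), the sketch below builds a title routine that emits an IMG tag
next to the report heading. It assumes, going by the HTMLResultWriter
documentation, that the title callback is called with the result object
and returns an HTML string. The names make_title_with_image,
escape_html and My::FakeResult are made up for the example, and the
stand-in object only exists so the sketch runs without BioPerl
installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that matter inside HTML text and attribute values.
sub escape_html {
    my ($s) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $s =~ s/([&<>"])/$entity{$1}/g;
    return $s;
}

# Build a title callback that adds an <IMG> tag under the report heading.
# $img_for is a hypothetical code ref mapping a query name to an image path,
# e.g. the PNG written by your own Bio::Graphics code.
sub make_title_with_image {
    my ($img_for) = @_;
    return sub {
        my ($result) = @_;    # assumed: the writer passes the result object
        my $query = $result->query_name;
        my $src   = $img_for->($query);
        return sprintf qq{<CENTER><H1>Search report for %s</H1><IMG SRC="%s" ALT="hits for %s"></CENTER>\n},
            escape_html($query), escape_html($src), escape_html($query);
    };
}

# Stand-in for a real result object; it only provides query_name().
{
    package My::FakeResult;
    sub new        { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub query_name { return $_[0]{name} }
}

my $title_cb = make_title_with_image(sub { "images/$_[0].png" });
print $title_cb->(My::FakeResult->new('query1'));

# With BioPerl loaded, the callback would be registered like this:
#   $writerhtml->title($title_cb);
```

The image file itself still has to be written separately, before or
after the HTML, under the path the callback returns.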

-jason
On Apr 22, 2008, at 11:02 PM, Anand Venkatraman wrote:

> Hi everybody,
>
> I would like to use Bio::Graphics in conjunction with
> Bio::SearchIO::Writer::HTMLResultWriter to obtain a HTML formatted
> blast report output along with an image of the blast hits as shown
> on Slide 60 in this pdf:
> http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf
>
> I am able to get the HTML output using   
> "Bio::SearchIO::Writer::HTMLResultWriter" and I am able to get the  
> image using the examples outlined in the Bio::Graphics HOWTO:  
> http://www.bioperl.org/wiki/HOWTO:Graphics
>
> My question: How do I integrate Bio::Graphics with  
> Bio::SearchIO::Writer::HTMLResultWriter to render the blast hits  
> image at the correct position in my BioPerl reformatted html file.
>
> I also found that someone else has asked something similar to
> whatever I am asking & is listed under the "Orphans, Leftovers"
> category in the ListSummary:April 26-May 9,2006 document:
> http://www.bioperl.org/wiki/ListSummary:April_26-May_9%2C2006#Orphans.2C_Leftovers
>
> Here is my code so far:
> ----------------------------------------------------------------
> #!/usr/bin/perl -w
> # usage: $0 <blast_report>
> use strict;
> use Bio::SearchIO;
> use Bio::SearchIO::Writer::HTMLResultWriter;
>
> my $infile = shift or die $!;
>
> my $searchio = new Bio::SearchIO( -format => 'blast',-file   => $infile);
> my $writerhtml = new Bio::SearchIO::Writer::HTMLResultWriter();
> my $outhtml = new Bio::SearchIO(-writer => $writerhtml,
>                                 -file   => ">${infile}.html");
>
> $outhtml->write_result($searchio->next_result);
> ----------------------------------------------------------------
>
> Thanks in advance,
>
> Anand
>


From bamboowarrior at gmail.com  Wed Apr 23 15:39:21 2008
From: bamboowarrior at gmail.com (Arkady)
Date: Wed, 23 Apr 2008 14:39:21 -0500
Subject: [Bioperl-l] WebBlat, where'd it go?
Message-ID: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>

Hi folks,

I'm trying to use BioPerl to run a BLAT search on the four primate
genomes on UCSC. I understand that the proper tool for this is
Bio::Tools::WebBlat. Unfortunately, it doesn't appear to be in my
bioperl distribution (nor do I even know how to figure out what
version that is, unfortunately, though it's a very recent install -- a
month ago?). I also can't find it on CPAN. Is this deprecated? Has
something else replaced it? Or are we always supposed to run local
BLAT?

Thanks.

John Woods

Institute for Cellular and Molecular Biology
The University of Texas at Austin


From spiros at lokku.com  Wed Apr 23 15:48:12 2008
From: spiros at lokku.com (Spiros Denaxas)
Date: Wed, 23 Apr 2008 20:48:12 +0100
Subject: [Bioperl-l] WebBlat, where'd it go?
In-Reply-To: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
References: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
Message-ID: <bba689ec0804231248s47034503y3cbf0512e4344843@mail.gmail.com>

Hey,

A quick look at the list of deprecated modules shows that it has
indeed been removed:

http://www.bioperl.org/wiki/Deprecated_modules

Spiros

On Wed, Apr 23, 2008 at 8:39 PM, Arkady <bamboowarrior at gmail.com> wrote:
> Hi folks,
>
>  I'm trying to use BioPerl to run a BLAT search on the four primate
>  genomes on UCSC. I understand that the proper tool for this is
>  Bio::Tools::WebBlat. Unfortunately, it doesn't appear to be in my
>  bioperl distribution (nor do I even know how to figure out what
>  version that is, unfortunately, though it's a very recent install -- a
>  month ago?). I also can't find it on CPAN. Is this deprecated? Has
>  something else replaced it? Or are we always supposed to run local
>  BLAT?
>
>  Thanks.
>
>  John Woods
>
>  Institute for Cellular and Molecular Biology
>  The University of Texas at Austin


From cjfields at uiuc.edu  Wed Apr 23 15:56:14 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 23 Apr 2008 14:56:14 -0500
Subject: [Bioperl-l] WebBlat, where'd it go?
In-Reply-To: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
References: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
Message-ID: <AF7BBBC2-6A6E-486A-872C-8BB8B0A7FC0C@uiuc.edu>

It's no longer maintained (deprecated); see the following for an  
explanation:

http://article.gmane.org/gmane.comp.lang.perl.bio.general/13545

Basically, only local BLAT searches are supported through BioPerl.
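
On the side question of which BioPerl version is installed: a small
sketch, assuming the install provides Bio::Perl and that its VERSION
method reports the distribution version (which is how the BioPerl FAQ
suggests checking). The helper name bioperl_version is made up here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the installed BioPerl version, or undef if Bio::Perl
# cannot be loaded or reports no version.
sub bioperl_version {
    return eval { require Bio::Perl; Bio::Perl->VERSION };
}

my $version = bioperl_version();
print defined $version
    ? "BioPerl version: $version\n"
    : "Could not determine a BioPerl version (is Bio::Perl in \@INC?)\n";
```

The one-liner equivalent is: perl -MBio::Perl -le 'print Bio::Perl->VERSION'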

chris

On Apr 23, 2008, at 2:39 PM, Arkady wrote:

> Hi folks,
>
> I'm trying to use BioPerl to run a BLAT search on the four primate
> genomes on UCSC. I understand that the proper tool for this is
> Bio::Tools::WebBlat. Unfortunately, it doesn't appear to be in my
> bioperl distribution (nor do I even know how to figure out what
> version that is, unfortunately, though it's a very recent install -- a
> month ago?). I also can't find it on CPAN. Is this deprecated? Has
> something else replaced it? Or are we always supposed to run local
> BLAT?
>
> Thanks.
>
> John Woods
>
> Institute for Cellular and Molecular Biology
> The University of Texas at Austin

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bioperlanand at yahoo.com  Wed Apr 23 19:05:27 2008
From: bioperlanand at yahoo.com (Anand Venkatraman)
Date: Wed, 23 Apr 2008 16:05:27 -0700 (PDT)
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
In-Reply-To: <952B0A4E-8A14-4E8E-B36D-14596B20E330@bioperl.org>
Message-ID: <795696.39415.qm@web36804.mail.mud.yahoo.com>

Hi Jason,

Thanks for the reply.

I am a little lost with the suggested solution. Is that how slide 60 in the PDF was obtained? http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf

I guess I am missing something quite obvious; I apologize.

What I have and want is this: I have a directory containing, say, 100 different BLAST reports, and I would like to obtain 100 BioPerl-formatted HTML outputs, each with its respective image, just as it would appear in the original BLAST report.
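
To make the batch part concrete, here is roughly how I imagine looping
over the directory. It is only a sketch: the helper names report_jobs
and convert_report and the ".blast" file extension are placeholders I
made up, and the conversion step simply reuses the calls from my script
quoted below. It does not add the image yet.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Return [input, output] path pairs for every plain file in $dir whose
# name ends in $ext. The ".blast" extension used below is an assumption.
sub report_jobs {
    my ($dir, $ext) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @names = sort grep { /\Q$ext\E$/ && -f File::Spec->catfile($dir, $_) } readdir $dh;
    closedir $dh;
    return map { my $in = File::Spec->catfile($dir, $_); [ $in, "$in.html" ] } @names;
}

# Convert one report to HTML with the same BioPerl calls as the
# single-file script. BioPerl is loaded only when a conversion runs.
sub convert_report {
    my ($in, $out) = @_;
    require Bio::SearchIO;
    require Bio::SearchIO::Writer::HTMLResultWriter;
    my $searchio = Bio::SearchIO->new(-format => 'blast', -file => $in);
    my $writer   = Bio::SearchIO::Writer::HTMLResultWriter->new();
    my $outhtml  = Bio::SearchIO->new(-writer => $writer, -file => ">$out");
    my $result   = $searchio->next_result;
    $outhtml->write_result($result) if $result;
    return;
}

# usage: $0 <directory_of_blast_reports>
my $dir = shift @ARGV;
if (defined $dir) {
    convert_report(@$_) for report_jobs($dir, '.blast');
}
```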

Thanks,

Anand

Jason Stajich <jason at bioperl.org> wrote: 

Basically you want to inject your own IMG tags into the file with these routines:


    $writerhtml->start_report(\&my_start_report);
    $writerhtml->title(\&my_title);
    $writerhtml->hit_link_align(\&my_hit_link_align);
    $writerhtml->hit_link_desc(\&my_hit_link_desc);


fgblast shows a way to do this in part. It relies on Gbrowse to generate the image but you can replace the gbrowse_img reference to your own image generating software.
http://people.genome.duke.edu/~jes12/software/scripts/fgblast


-jason
On Apr 22, 2008, at 11:02 PM, Anand Venkatraman wrote:

Hi everybody,


I would like to use Bio::Graphics in conjunction with Bio::SearchIO::Writer::HTMLResultWriter to obtain a HTML formatted blast report output along with an image of the blast hits as shown on Slide 60 in this pdf: http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf


I am able to get the HTML output using  "Bio::SearchIO::Writer::HTMLResultWriter" and I am able to get the image using the examples outlined in the Bio::Graphics HOWTO: http://www.bioperl.org/wiki/HOWTO:Graphics


My question: How do I integrate Bio::Graphics with Bio::SearchIO::Writer::HTMLResultWriter to render the blast hits image at the correct position in my BioPerl reformatted html file.


I also found that someone else has asked something similar to whatever I am asking & is listed under the "Orphans, Leftovers" category in the ListSummary:April 26-May 9,2006 document: 
http://www.bioperl.org/wiki/ListSummary:April_26-May_9%2C2006#Orphans.2C_Leftovers


Here is my code so far:
----------------------------------------------------------------
#!/usr/bin/perl -w
# usage: $0 <blast_report>
use strict;
use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;


my $infile = shift or die $!;


my $searchio = new Bio::SearchIO( -format => 'blast',-file   => $infile);
my $writerhtml = new Bio::SearchIO::Writer::HTMLResultWriter();
my $outhtml = new Bio::SearchIO(-writer => $writerhtml,
                                                  -file   => ">${infile}.html");


$outhtml->write_result($searchio->next_result);
----------------------------------------------------------------


Thanks in advance,


Anand




From jason at bioperl.org  Thu Apr 24 14:06:41 2008
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 24 Apr 2008 11:06:41 -0700
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
In-Reply-To: <795696.39415.qm@web36804.mail.mud.yahoo.com>
References: <795696.39415.qm@web36804.mail.mud.yahoo.com>
Message-ID: <D47EBDB9-C15C-44A7-9376-89FA946270DD@bioperl.org>

The overview graphic is generated basically from the script in  
scripts/graphics/search_overview.PLS

So you'd have to run that on each report to generate the graphic,  
then use the other methods  to insert <img src="NAME"> images into  
each rendered HTML report.

-jason



From 1zoujing at 163.com  Wed Apr 16 22:53:16 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 16 Apr 2008 19:53:16 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <Pine.WNT.4.64.0804111600310.2384@A161887.one.ads.bms.com>
References: <16602770.post@talk.nabble.com> <16603225.post@talk.nabble.com>
	<Pine.WNT.4.64.0804111600310.2384@A161887.one.ads.bms.com>
Message-ID: <16737795.post@talk.nabble.com>


    Thank you very much!
I split the file on \t directly.

   Zou Jing
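
For readers following along, splitting on \t can be sketched roughly as below. This is only an illustration under stated assumptions: it assumes the first three columns are tax_id, GeneID and Symbol, that header or comment lines start with '#', and it uses made-up sample rows in place of the real gene_info file. The sub name parse_gene_info is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse tab-delimited gene records from an open filehandle.
# Assumption: columns are tax_id, GeneID, Symbol, ... and header or
# comment lines start with '#'.
sub parse_gene_info {
    my ($fh) = @_;
    my %gene_for;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^#/ or $line !~ /\S/;    # skip header and blank lines
        my ($tax_id, $gene_id, $symbol) = split /\t/, $line;
        $gene_for{$gene_id} = { tax_id => $tax_id, symbol => $symbol };
    }
    return \%gene_for;
}

# Made-up sample rows standing in for the real file.
my $sample = join "\n",
    "#tax_id\tGeneID\tSymbol\tdescription",
    "9823\t1001\tGENE_A\thypothetical example gene A",
    "9823\t1002\tGENE_B\thypothetical example gene B",
    "";
open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my $genes = parse_gene_info($fh);
close $fh;

# Print GeneID and Symbol, ordered by GeneID.
for my $id (sort { $a <=> $b } keys %$genes) {
    print "$id\t$genes->{$id}{symbol}\n";
}
```

For the real, gzipped file, the filehandle could instead be opened as a pipe, for example open(my $fh, '-|', 'gzip', '-dc', 'gene_info.gz'), so the file is read line by line without loading it into memory.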


Stefan Kirov-2 wrote:
> 
> It is not. If you use this file, why would you need a parser for it 
> anyway? Just split on \t or read with OpenOffice or equiv.
> Stefan
> 
> On Thu, 10 Apr 2008, zoujing wrote:
> 
>>
>> I searched the web and found the answer; I quote it here:
>>   The error was thrown by my Bio::ASN1::EntrezGene module because it
>> expects a text file, while you fed it with a binary file.  To use
>> gzipped ASN binary file from NCBI, download the NCBI gene2xml
>> (ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml),
>> then use this syntax to run my parser on the binary files:
>>
>> my $parser = Bio::ASN1::EntrezGene->new(
>>     'file' => "gene2xml -i Homo_sapiens.ags.gz -c -x -b | ",
>> );
>> # Homo_sapiens.ags.gz is the gzipped binary file directly downloaded
>> # from NCBI
>>
>> Same syntax should be used when you're using SeqIO (thus
>> SeqIO::entrezgene).
>> Mingyi
>>
>>   But there is still one thing: I want to parse "gene_info.gz" from NCBI
>> Gene, and it doesn't work. Does that mean "gene_info.gz" (tab-delimited,
>> one line per GeneID, with the column header line as the first line in
>> the file) is not the right format for Bio::ASN1::EntrezGene?
>>
>>
>>
>> zoujing wrote:
>>>
>>>    I am a green hand in Bioperl. When I run
>>> "parse_entrez_gene_example.pl Sus_scrofa.ags", it produces the error
>>> message:
>>>      Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>>>
>>>    But Sus_scrofa.ags was downloaded from NCBI in ASN.1 format, so it
>>> should be the same as Homo_sapiens in the example. There should be no
>>> error, as the code is the example from Mingyi.
>>>    I wonder why this happens, and whether I should change something
>>> about the file?
>>>
>>>
>>
> 
> 



From 1zoujing at 163.com  Wed Apr 16 22:55:47 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 16 Apr 2008 19:55:47 -0700 (PDT)
Subject: [Bioperl-l] Bio::ASN1::EntrezGene parse so slowly?
In-Reply-To: <264855a00804112050gf785c2ei66d9c7463597eccd@mail.gmail.com>
References: <16602210.post@talk.nabble.com>
	<264855a00804112050gf785c2ei66d9c7463597eccd@mail.gmail.com>
Message-ID: <16737804.post@talk.nabble.com>


Thank you very much!
  The problem is solved now.

   Jing

Sean Davis-3 wrote:
> 
> gene_info is a tab-delimited text file, if I recall correctly.  Have
> you looked at it?  If it is, you should be able to parse it in a few
> seconds with just a couple lines of code.
> 
> Sean
> 
> 
> On Thu, Apr 10, 2008 at 1:08 AM, zoujing <1zoujing at 163.com> wrote:
>>
>>   I want to parse the file "gene_info" from NCBI. The format of Gene in
>>  NCBI is ASN.1, right? So I used Bio::ASN1::EntrezGene. But it didn't
>>  work properly, or was too slow. The file is about 500 MB.
>>   The code is as follows:
>>   use Bio::ASN1::EntrezGene;
>>   my $parser = Bio::ASN1::EntrezGene->new('file' => $ARGV[0]);
>>   my $i = 0;
>>   while (my $result = $parser->next_seq) {
>>       last;    # real work would go here; 'last' is used just for testing
>>   }
>>
>>   When it reaches the "while" loop, it keeps processing on and on and
>>  never exits, even though I used "last" inside the loop.
>>    So I wonder whether it is too slow, whether the module is not suited
>>  to this job, or whether I did something wrong?
>>
>>   Thank you!
> 
> 



From sbassi at clubdelarazon.org  Sat Apr 26 13:49:20 2008
From: sbassi at clubdelarazon.org (Sebastian Bassi)
Date: Sat, 26 Apr 2008 14:49:20 -0300
Subject: [Bioperl-l] bioperl installation problem
Message-ID: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>

I tried to install bioperl because I need to install cviewer.
Here (http://www.pastecode.com.ar/f37c1cd60) are both stdout and stderr outputs.

Here is one of the errors I get:

set_attribute: not a compat02 graph at
/usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN0> line 10.
sleeping for 3 seconds
set_attribute: not a compat02 graph at
/usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN1> line 14.

But I have GD::Graph, so I don't know what is going on:

sbassi at ubuntuMAP:~$ sudo perl -MCPAN -e 'install GD::Graph'
CPAN: Storable loaded ok
Going to read /home/sbassi/.cpan/Metadata
  Database was generated on Fri, 25 Apr 2008 09:29:45 GMT
GD::Graph is up to date.

Any help regarding this: http://www.pastecode.com.ar/f37c1cd60
would be appreciated.

Best,
SB.

-- 
Sebastián Bassi (???????). Diploma in Science and Technology.
Course "Molecular biology for programmers": http://tinyurl.com/2vv8w6
Show your code: http://www.pastecode.com.ar
GPG Fingerprint: 9470 0980 620D ABFC BE63 A4A4 A3DE C97D 8422 D43D


From jason at bioperl.org  Sat Apr 26 15:23:37 2008
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 26 Apr 2008 12:23:37 -0700
Subject: [Bioperl-l] bioperl installation problem
In-Reply-To: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
References: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
Message-ID: <B07E3ABC-FA71-4AEA-8802-29F1C3023BAE@bioperl.org>

The error refers to the 'Graph' module, not 'GD::Graph'.

-jason


From sbassi at clubdelarazon.org  Sat Apr 26 17:08:13 2008
From: sbassi at clubdelarazon.org (Sebastian Bassi)
Date: Sat, 26 Apr 2008 18:08:13 -0300
Subject: [Bioperl-l] bioperl installation problem
In-Reply-To: <B07E3ABC-FA71-4AEA-8802-29F1C3023BAE@bioperl.org>
References: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
	<B07E3ABC-FA71-4AEA-8802-29F1C3023BAE@bioperl.org>
Message-ID: <9e2f512b0804261408l45ff9f91j94f44065d21cd65f@mail.gmail.com>

On Sat, Apr 26, 2008 at 4:23 PM, Jason Stajich <jason at bioperl.org> wrote:
> the error refers to the 'Graph' module not 'GD::Graph';

You are right, but I have it also installed:

sbassi at ubuntuMAP:~$ sudo perl -MCPAN -e 'install Graph'
Password:
CPAN: Storable loaded ok
Going to read /home/sbassi/.cpan/Metadata
  Database was generated on Fri, 25 Apr 2008 09:29:45 GMT
Graph is up to date.


-- 
Sebastián Bassi (???????). Diploma in Science and Technology.
Course "Molecular biology for programmers": http://tinyurl.com/2vv8w6
Show your code: http://www.pastecode.com.ar
GPG Fingerprint: 9470 0980 620D ABFC BE63 A4A4 A3DE C97D 8422 D43D


From bix at sendu.me.uk  Sat Apr 26 19:30:56 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Sun, 27 Apr 2008 00:30:56 +0100
Subject: [Bioperl-l] bioperl installation problem
In-Reply-To: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
References: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
Message-ID: <4813BB30.6060703@sendu.me.uk>

Sebastian Bassi wrote:
> I tried to install bioperl because I need to install cviewer.
> Here (http://www.pastecode.com.ar/f37c1cd60) are both stdout and stderr outputs.
> 
> Here is one of the errors I get:
> 
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN0> line 10.
> sleeping for 3 seconds
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN1> line 14.

You're trying to install a very old version of Bioperl, which apparently 
relies on behaviour of the Graph module that is no longer supported:
http://search.cpan.org/~jhi/Graph-0.84/lib/Graph.pod#Backward_compatibility_with_Graph_0.2

Your options are to force install your desired version of Bioperl (if 
you don't need to use the modules that are causing the errors you get), 
downgrade Graph to an old 0.2x release, or install the latest 
version of Bioperl (1.5.2 or from svn).


From dr.hogart at gmail.com  Sun Apr 27 10:05:20 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Sun, 27 Apr 2008 18:05:20 +0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
Message-ID: <op.t99vyoejavnppr@hogart.hackers>

Hi all,

Is it possible to add a GD::graphic object (chart) to a Bio::Graphics panel  
to obtain a file with an image of both the chart and the bioseq object?


From Russell.Smithies at agresearch.co.nz  Sun Apr 27 17:27:23 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 28 Apr 2008 09:27:23 +1200
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <op.t99vyoejavnppr@hogart.hackers>
References: <op.t99vyoejavnppr@hogart.hackers>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>

You can get the GD object back from the Bio::Graphics::Panel  then draw
on it using GD methods

Eg:

use GD;                         # exports gdSmallFont
use Bio::Graphics;
use Bio::SeqFeature::Generic;

# create a BioPerl panel
my $panel = Bio::Graphics::Panel->new(
    -length  => 600,
    -width   => 800,
    -bgcolor => 'white',
);

# add your features
my $feature = Bio::SeqFeature::Generic->new(-start => 1, -end => 200);
$panel->add_track($feature,
    -glyph   => 'segments',
    -label   => 0,
    -height  => 30,
    -bgcolor => 'red',
    -fgcolor => 'red',
);

# grab the GD thingy
my $gd = $panel->gd;

# create a color - not sure if there's a better way?
my $black = $gd->colorAllocate(0, 0, 0);

# draw on your GD thingy
$gd->line(10, 10, $panel->width - 10, 10, $black);
$gd->string(gdSmallFont, 20, 10, 'test', $black);

# print it as normal
print $panel->png;




From dr.hogart at gmail.com  Sun Apr 27 20:25:18 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Mon, 28 Apr 2008 04:25:18 +0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
Message-ID: <op.uaaosgoeavnppr@hogart.hackers>

Thanks for the answer!
Your script works fine, but as far as I understand, the 'gd' method  
returns a GD::Image object. What I need is to merge the bioseq object  
with a GD::Graph object (GD::Graph::area). Is that possible? Or maybe I  
misunderstood something in your example?




From Bank.Beszteri at awi.de  Mon Apr 28 08:18:20 2008
From: Bank.Beszteri at awi.de (=?UTF-8?B?QsOhbmsgQmVzenRlcmk=?=)
Date: Mon, 28 Apr 2008 14:18:20 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47FB204F.90405@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de>
Message-ID: <4815C08C.1060305@awi.de>

Dear BioSQL / bioperl-db-ists,

I would like  to share my experiences with trying to load uniprot_trembl 
into a BioSQL db, and also to ask a couple of questions; perhaps some of 
you know the problems I encountered. I used bioperl-live and 
bioperl-db-live as of 2008-04-03 and uniprot_trembl.dat as of 
2008-04-04. The command was like

load_seqdatabase.pl --safe --logchunk 1000 --host dbserv --dbname abc 
--dbuser efg --dbpass xyz --driver mysql --namespace uniprot_trembl 
--format embl uniprot_trembl.dat

although I split the .dat file into 10 chunks and loaded them in parallel 
to make it faster. This did not go quite as smoothly as Swissprot did. 
In the end, it seems to have loaded 5022284 of the 5443284 entries that 
appear to be in the input file (counted with grep -c "ID   ").

Besides the harmless taxonomy warnings which also appear with Swissprot 
(and which were discussed here a couple of weeks ago and also 
earlier), there were a couple of more serious errors. Perhaps some of 
you know them already:

First of all, the below error seems to lead to a crash, in spite of --safe:

 >>>
------------- EXCEPTION -------------
MSG: A1XDT7 seems to have an invalid species classification.
STACK Bio::SeqIO::embl::_read_EMBL_Species 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:1087
STACK Bio::SeqIO::embl::next_seq 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:320
STACK toplevel 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:634
-------------------------------------

Command exited with non-zero status 255
<<<

This concerns NCBI Tax_ID:435 (Acetobacter aceti; it has some 30 
synonyms in my DB, too), which, to me, looks like a completely normal 
taxon: I could follow its taxonomy up to the root in the NCBI taxonomy in 
the BioSQL DB I used. Has anyone else seen this problem, or can anyone 
reproduce it, or should I suspect a problem with my 
taxonomy DB? Besides, is it the expected behaviour of 
load_seqdatabase.pl to die upon this error?

###################

The other problems did not lead to a crash, only to a failure to load 
the sequence, which is what I'd expect with --safe. The first type 
of error looks like

 >>>
Could not store Q49I36:
------------- EXCEPTION -------------
MSG: Unique key query in Bio::DB::BioSQL::SpeciesAdaptor returned 2 rows 
instead of 1. Query was [name_class="scientific 
name",binomial="Onchocerca volvulus"]
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:958
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:854
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182
STACK Bio::DB::Persistent::PersistentObject::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:244
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
STACK Bio::DB::Persistent::PersistentObject::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:271
STACK (eval) 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:630
STACK toplevel 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:612
-------------------------------------
<<<

In this particular case, "Onchocerca volvulus" does indeed have two 
taxon_ids in my DB (6282 and 563188, of which only the first one is 
returned by a web search at NCBI taxonomy); but the same thing happened 
with a number of other taxa (each followed by the number of times it 
caused the above error):

Wolbachia pipientis               64
Hemerocallis sp.                   1
Hypsiglena torquata                3
Salmonella enterica             1211
Burkholderia sp.                  31
Streptococcus sp.                  4
Rhizobium sp.                    600
Nostoc sp.                        19
Drosophila sp.                    18
Onchocerca volvulus               62
Atlapetes schistaceus              4
Symbiodinium sp.                   3
Escherichia coli                7421
Hieraaetus fasciatus               4
Borrelia burgdorferi group         1
Pseudomonas sp.                   29
Rotavirus A                     1076
Gorilla gorilla                  746
Rana plancyi                      14
unclassified sequences             1

(This adds up to 11312 cases altogether, but the list might be incomplete 
because I accidentally removed one of my logs, which contained STDOUT 
and STDERR for ~10 % of the entries.)

Again, is this a known problem for some of you, or could there be a 
problem with my copy of the NCBI taxonomy? I don't remember having updated 
it after the initial upload, so I'm quite surprised by such duplicate 
entries....

###################

Type 2 error w/o crash:

 >>>
Could not store A5HU09:
------------- EXCEPTION -------------
MSG: create: object (Bio::Species) failed to insert or to be found by 
unique key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:206
STACK Bio::DB::Persistent::PersistentObject::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:244
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
STACK Bio::DB::Persistent::PersistentObject::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:271
STACK (eval) 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:630
STACK toplevel 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:612
<<<

This particular record has the NCBI_TaxID 44271, which looks completely 
normal in the NCBI taxonomy loaded in my BioSQL DB, but the same problem 
appeared in 53 further cases (I have not yet been able to look into them 
in detail to see whether they were all the same species). On the other 
hand, 7 records which were successfully loaded have this taxonomy ID in 
the DB (44271).

###################

Nr 3 no crash:

 >>>
Could not store Q6T859: Unmatched ( in regex; marked by <-- HERE in 
m/Camelina microcarpa (Littlepod false flax) ( <-- HERE microcarpa 
subsp.\s+/ at 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/Species.pm line 
466, <GEN0> line 357048.
<<<

This happens in the sub binomial in Species.pm using the option "FULL", 
which requests to also return subspecies. I have not looked much deeper 
into this yet, but is it possible that there is a parsing problem with 
multi-line species strings? In the above case the OS field in 
uniprot_trembl.dat looks like

OS   Camelina microcarpa (Littlepod false flax) (Camelina microcarpa subsp.
OS   sylvestris).
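
One possible reading of the message, offered only as a guess: the error text shows the species name itself sitting inside the pattern, so the parentheses from the OS line are being parsed as regex groups. The sketch below reproduces that class of error in isolation and shows the usual remedy, quoting the interpolated text with \Q...\E. It does not show the actual code at Species.pm line 466; the sub name strip_leading_name is hypothetical, and the name string is copied from the error message above.

```perl
use strict;
use warnings;

# Remove a leading name (plus following whitespace) from a longer string.
# \Q...\E quotes regex metacharacters, so parentheses in $name match literally.
sub strip_leading_name {
    my ($text, $name) = @_;
    (my $rest = $text) =~ s/^\Q$name\E\s+//;
    return $rest;
}

# Name text taken from the error message; how Species.pm really builds its
# pattern is an open question, this only reproduces the class of error.
my $name = 'Camelina microcarpa (Littlepod false flax) (microcarpa subsp.';
my $text = "$name sylvestris)";

# Interpolating the raw name leaves an unbalanced "(" in the pattern,
# which is a fatal error when Perl compiles the regex at run time.
my $compiled = eval { my $re = qr/$name\s+/; 1 };
print "raw interpolation compiles: ", ($compiled ? "yes" : "no"), "\n";

# With the name quoted, the same substitution works.
print "remainder: ", strip_leading_name($text, $name), "\n";
```

If this reading is right, the multi-line OS field matters only indirectly: it is what brings a second, unclosed parenthesis into the name that ends up in the pattern.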

###################

I'm still looking for where the remaining records disappeared: of the 
421000 records not showing up in the DB, I could account for these:

crasher (Tax_ID=435):   45 entries
problem 1 ("MSG: Unique key query in Bio::DB::BioSQL::SpeciesAdaptor 
returned 2 rows instead of 1."): 11312 entries
problem 2 ("MSG: create: object (Bio::Species) failed to insert or to be 
found by unique key"): 54 entries
problem 3 ("Unmatched ( in regex"): 28241 entries

381348 still remain... Although these could in principle come from the 
first 10 %, for which I don't have the output, they don't seem to: 
after restarting that chunk, I get ~ 30 "Could not store" errors.

So the last question: are there any error messages I can expect which 
don't contain "Could not store" and which I thus missed here?
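
A rough way to check this against the saved logs is to tally every "MSG:" line and note whether it followed a "Could not store" line. The sketch below is only an illustration under stated assumptions: it assumes failures appear either as "Could not store <ID>: <reason>" on one line or as "Could not store <ID>:" followed later by a "MSG:" line, it groups messages crudely by their first four words, and the sample lines are abbreviated from the output quoted above. The sub names tally_errors and msg_key are hypothetical.

```perl
use strict;
use warnings;

# Reduce a message to its first four words so similar errors group together.
sub msg_key {
    my ($msg) = @_;
    my @words = split ' ', $msg;
    splice @words, 4 if @words > 4;
    return join ' ', @words;
}

# Count error messages, separating those reported through "Could not store"
# ("store") from MSG lines that appear on their own ("other").
sub tally_errors {
    my @lines = @_;
    my %count;
    my $in_store_failure = 0;
    for my $line (@lines) {
        if ($line =~ /^Could not store \S+:\s*(.*)/) {
            my $reason = $1;
            if ($reason =~ /\S/) { $count{ 'store: ' . msg_key($reason) }++ }
            else                 { $in_store_failure = 1 }
        }
        elsif ($line =~ /^MSG:\s*(.*)/) {
            my $kind = $in_store_failure ? 'store' : 'other';
            $count{ "$kind: " . msg_key($1) }++;
            $in_store_failure = 0;
        }
    }
    return \%count;
}

# Sample lines abbreviated from the loader output quoted in this message.
my @log = (
    "Could not store Q49I36:\n",
    "------------- EXCEPTION -------------\n",
    "MSG: Unique key query in Bio::DB::BioSQL::SpeciesAdaptor returned 2 rows\n",
    "Could not store Q6T859: Unmatched ( in regex; marked by <-- HERE\n",
    "------------- EXCEPTION -------------\n",
    "MSG: A1XDT7 seems to have an invalid species classification.\n",
);

my $count = tally_errors(@log);
print "$count->{$_}\t$_\n" for sort keys %$count;
```

Any key prefixed with "other:" is a message that was not reported through "Could not store", which is the kind of entry a search for that phrase alone would miss.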


Bank Beszteri


Bioinformatics
Alfred Wegener Institute for Polar and Marine Research
Am Handelshafen 12
27570 Bremerhaven


From cjfields at uiuc.edu  Mon Apr 28 09:20:39 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 28 Apr 2008 08:20:39 -0500
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <4815C08C.1060305@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
Message-ID: <5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>

On Apr 28, 2008, at 7:18 AM, Bánk Beszteri wrote:

> Dear BioSQL / bioperl-db-ists,
>
> I would like  to share my experiences with trying to load  
> uniprot_trembl into a BioSQL db, and also to ask a couple of  
> questions; perhaps some of you know the problems I encountered. I  
> used bioperl-live and bioperl-db-live as of 2008-04-03 and  
> uniprot_trembl.dat as of 2008-04-04. The command was like
>
> load_seqdatabase.pl --safe --logchunk 1000 --host dbserv --dbname  
> abc --dbuser efg --dbpass xyz --driver mysql --namespace  
> uniprot_trembl --format embl uniprot_trembl.dat
>
> ....
>
> First of all, the below error seems to lead to a crash, in spite of  
> --safe:
>
> >>>
> ------------- EXCEPTION -------------
> MSG: A1XDT7 seems to have an invalid species classification.
> STACK Bio::SeqIO::embl::_read_EMBL_Species /home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:1087
> STACK Bio::SeqIO::embl::next_seq /home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:320
> STACK toplevel /home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:634
> -------------------------------------
>
> Command exited with non-zero status 255
> <<<
>
> What this is about is NCBI Tax_ID:435 (Acetobacter aceti; it has  
> some 30 synonyms in my DB, too), which, to me, looks like a  
> completely normal taxon: I could follow its taxonomy up to the root  
> in my NCBI taxonomy in the BioSQL DB I used. I don't know if someone  
> else has seen / can reproduce the problem, or should I think about  
> some problem with my taxonomy db? Besides, is it the expected  
> behaviour from load_seqdatabase.pl to die upon this error?

...

You should use 'swiss' format instead of 'embl' when loading Uniprot/ 
SwissProt sequences.  Though on the surface they're similar the  
feature table (among other things) is completely different.  I'm not  
sure if that's causing all of the issues here but it certainly could  
contribute to them.

In the meantime, it's much easier for us to track these problems if  
you file a bug (BioPerl, file for bioperl-db):

http://bugzilla.open-bio.org/

chris


From cjfields at uiuc.edu  Sun Apr 27 17:54:03 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 27 Apr 2008 16:54:03 -0500
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
Message-ID: <FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>

I think this is how some of the synteny mapping is done using  
SynBrowse (the trapezoids connecting syntenous genes on different  
tracks).

http://www.gmod.org/wiki/index.php/SynView

chris

On Apr 27, 2008, at 4:27 PM, Smithies, Russell wrote:

> You can get the GD object back from the Bio::Graphics::Panel  then  
> draw
> on it using GD methods
>
> Eg:
>
> use GD;
> use Bio::Graphics;
> use Bio::SeqFeature::Generic;
>
> # create a BioPerl panel
> my $panel = Bio::Graphics::Panel->new(
>                                       -length  => 600,
>                                       -width   => 800,
>                                       -bgcolor => 'white'
>                                      );
> # add your features
> my $feature = Bio::SeqFeature::Generic->new(-start => 1, -end => 200);
> $panel->add_track($feature,
>                   -glyph   => 'segments',
>                   -label   => 0,
>                   -height  => 30,
>                   -bgcolor => 'red',
>                   -fgcolor => 'red'
>                  );
>
> # grab the GD thingy
> my $gd = $panel->gd;
>
> # create a color - not sure if there's a better way?
> my $black = $gd->colorAllocate(0,0,0);
>
> # draw on your GD thingy
> $gd->line(10, 10, $panel->width - 10, 10, $black);
> $gd->string(gdSmallFont, 20, 10, 'test', $black);
>
> # print it as normal
> binmode STDOUT;
> print $panel->png;
>
>
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-
>> bio.org] On Behalf Of sergei ryazansky
>> Sent: Monday, 28 April 2008 2:05 a.m.
>> To: bioperl-l at bioperl.org
>> Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
>>
>> Hi all,
>>
>> is it possible to add a GD::graphic object (chart) to Bio::Graphics
> panel
>> to obtain a file with image of both the chart and bioseq object?
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> = 
> ======================================================================
> Attention: The information contained in this message and/or  
> attachments
> from AgResearch Limited is intended only for the persons or entities
> to which it is addressed and may contain confidential and/or  
> privileged
> material. Any review, retransmission, dissemination or other use of,  
> or
> taking of any action in reliance upon, this information by persons or
> entities other than the intended recipients is prohibited by  
> AgResearch
> Limited. If you have received this message in error, please notify the
> sender immediately.
> = 
> ======================================================================
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Bank.Beszteri at awi.de  Mon Apr 28 09:51:53 2008
From: Bank.Beszteri at awi.de (=?ISO-8859-1?Q?B=E1nk_Beszteri?=)
Date: Mon, 28 Apr 2008 15:51:53 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
	<5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
Message-ID: <4815D679.3070307@awi.de>

Chris Fields schrieb:
>
> ...
>
> You should use 'swiss' format instead of 'embl' when loading 
> Uniprot/SwissProt sequences.  Though on the surface they're similar 
> the feature table (among other things) is completely different.  I'm 
> not sure if that's causing all of the issues here but it certainly 
> could contribute to them.
>
> In the meantime, it's much easier for us to track these problems if 
> you file a bug (BioPerl, file for bioperl-db):
>
> http://bugzilla.open-bio.org/
>
Hi Chris,

I will do so; in the meanwhile: I'm not loading Swissprot, but TrEMBL. 
Is swiss also the appropriate format here? By reading 
http://expasy.org/sprot/userman.html#diffEMBL, I concluded that embl 
should be the one I'd need for TrEMBL.

Bank


From cjfields at uiuc.edu  Mon Apr 28 12:24:39 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 28 Apr 2008 11:24:39 -0500
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <4815D679.3070307@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
	<5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
	<4815D679.3070307@awi.de>
Message-ID: <B7918B56-536D-497F-A59D-D48A61085339@uiuc.edu>


On Apr 28, 2008, at 8:51 AM, Bánk Beszteri wrote:

> Chris Fields schrieb:
>>
>> ...
>>
>> You should use 'swiss' format instead of 'embl' when loading  
>> Uniprot/SwissProt sequences.  Though on the surface they're similar  
>> the feature table (among other things) is completely different.   
>> I'm not sure if that's causing all of the issues here but it  
>> certainly could contribute to them.
>>
>> In the meantime, it's much easier for us to track these problems if  
>> you file a bug (BioPerl, file for bioperl-db):
>>
>> http://bugzilla.open-bio.org/
>>
> Hi Chris,
>
> I will do so; in the meanwhile: I'm not loading Swissprot, but  
> TrEMBL. Is swiss also the appropriate format here? By reading
> http://expasy.org/sprot/userman.html#diffEMBL, I concluded that embl
> should be the one I'd need for TrEMBL.
>
> Bank

The section you link to describes several important differences  
between EMBL and SwissProt/UniProt format (i.e. how each indicated  
line type differs between SwissProt and EMBL formats, including ID,  
AC, OS/OC, FT, etc).  I'm unsure how you derived that 'embl' would  
work from that, e.g. they are close, but there are enough significant  
differences that using 'embl' for SwissProt (or vice versa) will not  
work as intended, if at all.

chris


From hlapp at gmx.net  Mon Apr 28 15:46:07 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 28 Apr 2008 15:46:07 -0400
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <4815D679.3070307@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
	<5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
	<4815D679.3070307@awi.de>
Message-ID: <3BD6A261-D023-4A5F-9CBC-C3216B0145F0@gmx.net>


On Apr 28, 2008, at 9:51 AM, Bánk Beszteri wrote:
>  I'm not loading Swissprot, but TrEMBL. Is swiss also the  
> appropriate format here?


Yes, though I guess it can be confusing.

Maybe we should create a symlink uniprot.pm to swiss.pm, or in fact  
fork them if UniProt starts accumulating enough differences from the  
traditional Swissprot format.

BTW as you had noticed, the --safe switch only protects the script  
from crashing due to a db loading error. A parsing error will still  
cause a crash.

I guess you can argue that that's not nice, and having a chance to  
skip over the record that offends the (BioPerl) parser would be  
useful. The problem is that if the parser errors out, it's not  
guaranteed where we are in the file and whether the parser module is  
in a state that it can recover itself from. For the database it's a  
bit easier as one just needs to rollback() the transaction (each  
sequence is its own transaction).
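
To make the distinction concrete, here is a minimal sketch of that control flow in plain Perl. It is not the code of load_seqdatabase.pl: load_records and the two code references passed to it are hypothetical stand-ins for the parser and the database adaptor.

```perl
use strict;
use warnings;

# load_records($next_record, $store): both arguments are hypothetical
# code references standing in for the parser and the database adaptor.
sub load_records {
    my ($next_record, $store) = @_;
    my $stored = 0;
    my @failed;

    # A die inside $next_record is NOT trapped here: after a parse error
    # the parser state is unknown, so the loop cannot safely continue.
    while (defined(my $rec = $next_record->())) {
        # A die inside $store IS trapped: each record is its own
        # transaction, so a failure only costs that one record.
        if (eval { $store->($rec); 1 }) {
            $stored++;
        }
        else {
            push @failed, $rec;    # a real loader would rollback() here
        }
    }
    return ($stored, \@failed);
}

my @input = qw(A1 BAD B2);
my ($n, $failed) = load_records(
    sub { shift @input },
    sub { die "store failed\n" if $_[0] eq 'BAD' },
);
print "stored $n, failed: @$failed\n";
```

Wrapping the parser call in its own eval would be possible too, but then the caller would have to resynchronise on the next record boundary, which is the part that is not guaranteed to work.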

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From Russell.Smithies at agresearch.co.nz  Mon Apr 28 17:15:16 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 29 Apr 2008 09:15:16 +1200
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
	<FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>

I thought it was a bit of a hack but I guess if someone else is doing it
too, it can't be all bad  :-)

It looks like you can combine your drawing methods like this:
(I'm sure Lincoln will tell us this is bad but it seems to work ok)
------------------------------------------------------------------------
-------------

#!perl -w
use GD::Graph::lines;
use GD::Graph::colour;
use GD::Graph::Data;

use Bio::Graphics;
use Bio::SeqFeature::Generic;

# create and draw on a graphics panel
my $panel = Bio::Graphics::Panel->new(
                                      -length => 500,
                                      -width  => 500
                                     );
my $track = $panel->add_track(
                              -glyph => 'generic',
                              -label => 1
                             );

# create and add a few features
for($i = 100; $i < 500; $i+= 100){
  my $feature = Bio::SeqFeature::Generic->new(
                                              -display_name => "feature: $i",
                                              -score        => $i,
                                              -start        => $i,
                                              -end          => $i + 100
                                             );
  $track->add_feature($feature);
}


# create and draw the graph
my @data = (
    ["1st","2nd","3rd","4th","5th","6th","7th", "8th", "9th"],
    [    1,    2,    5,    6,    3,  1.5,    1,     3,     4],
    [ sort { $a <=> $b } (1, 2, 5, 6, 3, 1.5, 1, 3, 4) ]
);
my $graph = GD::Graph::lines->new(500, 300);

$graph->set(
      x_label           => 'X Label',
      y_label           => 'Y label',
      title             => 'Some simple graph',
      y_max_value       => 8,
      y_tick_number     => 8,
      y_label_skip      => 2
) or die $graph->error;

$graph->set( dclrs => [ qw( green blue black red pink) ] );

my $gd = $graph->plot(\@data) or die $graph->error;

# combine the two images
my $combined = $panel->gd($gd);

open(IMG, '>file.png') or die $!;
binmode IMG;
print IMG $combined->png;

------------------------------------------------------------------------
------------------



From lincoln.stein at gmail.com  Mon Apr 28 17:33:19 2008
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 28 Apr 2008 17:33:19 -0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
	<FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
	<D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>
Message-ID: <6dce9a0b0804281433i697cda2fo2c47ce59010d0858@mail.gmail.com>

Hi,

No, I'm perfectly happy with combining images like this. It is part of what
I intended.

Another idea would be to use the Image glyph to embed graphs at particular
genomic locations in the panel. Right now the glyph is designed in the
expectation that the image passed to it is sitting on the file system (or a
web URL), but it would be easy to modify it so that a callback can generate
the GD on the fly, by using, for example GD::Graph.

Lincoln



-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From dr.hogart at gmail.com  Tue Apr 29 03:56:24 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Tue, 29 Apr 2008 11:56:24 +0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
	<FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
	<D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>
Message-ID: <op.uac4caojavnppr@hogart.img.ras.ru>

Thank you very much! It is exactly that I was looking for.

On Tue, 29 Apr 2008 01:15:16 +0400, Smithies, Russell  
<Russell.Smithies at agresearch.co.nz> wrote:

> I thought it was a bit of a hack but I guess if someone else is doing it
> too, it can't be all bad  :-)
>
> It looks like you can combine your drawing methods like this:
> (I'm sure Lincoln will tell us this is bad but it seems to work ok)
> ------------------------------------------------------------------------
> -------------
>
> #!perl -w
> use GD::Graph::lines;
> use GD::Graph::colour;
> use GD::Graph::Data;
>
> use Bio::Graphics;
> use Bio::SeqFeature::Generic;
>
> # create and draw on a graphics panel
> my $panel = Bio::Graphics::Panel->new(
>                                       -length => 500,
>                                       -width  => 500
>                                      );
> my $track = $panel->add_track(
>                               -glyph => 'generic',
>                               -label => 1
>                              );
>
> # create and add a few features
> for($i = 100; $i < 500; $i+= 100){
>   my $feature = Bio::SeqFeature::Generic->new(
>                                               -display_name => "feature:
> $i",
>                                               -score        => $i,
>                                               -start        => $i,
>                                               -end          => $i + 100
>                                              );
>   $track->add_feature($feature);
> }
>
>
> # create and draw the graph
> my @data = (
>     ["1st","2nd","3rd","4th","5th","6th","7th", "8th", "9th"],
>     [    1,    2,    5,    6,    3,  1.5,    1,     3,     4],
>     [ sort { $a <=> $b } (1, 2, 5, 6, 3, 1.5, 1, 3, 4) ]
> );
> my $graph = GD::Graph::lines->new(500, 300);
>
> $graph->set(
>       x_label           => 'X Label',
>       y_label           => 'Y label',
>       title             => 'Some simple graph',
>       y_max_value       => 8,
>       y_tick_number     => 8,
>       y_label_skip      => 2
> ) or die $graph->error;
>
> $graph->set( dclrs => [ qw( green blue black red pink) ] );
>
> my $gd = $graph->plot(\@data) or die $graph->error;
>
> # combine the two images
> my $combined = $panel->gd($gd);
>
> open(IMG, '>file.png') or die $!;
> binmode IMG;
> print IMG $combined->png;
>
> ------------------------------------------------------------------------
> ------------------
>
>> -----Original Message-----
>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>> Sent: Monday, 28 April 2008 9:54 a.m.
>> To: Smithies, Russell
>> Cc: sergei ryazansky; bioperl-l at bioperl.org
>> Subject: Re: [Bioperl-l] addition of GD::graphic object to
> Bio::Graphics
>>
>> I think this is how some of the synteny mapping is done using
>> SynBrowse (the trapezoids connecting syntenous genes on different
>> tracks).
>>
>> http://www.gmod.org/wiki/index.php/SynView
>>
>> chris
>>
>> On Apr 27, 2008, at 4:27 PM, Smithies, Russell wrote:
>>
>> > You can get the GD object back from the Bio::Graphics::Panel  then
>> > draw
>> > on it using GD methods
>> >
>> > Eg:
>> >
>> > #create a BioPerl panel
>> > my $panel = Bio::Graphics::Panel->new(
>> >                              			-length   => 600
>> >                              			-width    =>
> 800,
>> > 					-bgcolor  => 'white'
>> > 					);
>> > # add your features
>> > my $feature = Bio::SeqFeature::Generic->new( -start => 1,-end   =>
>> > 200,);
>> > $panel->add_track($feature, glyph   =>   'segments',
>> > 					-label   =>   0,
>> > 					-height  =>   30,
>> > 					-bgcolor  =>  'red',
>> > 					-fgcolor  => 'red'
>> > 					 );
>> >
>> > # grab the GD thingy
>> > my $gd = $panel->gd;
>> >
>> > #create a color - not sure if there's a better way?
>> > $black = $gd->colorAllocate(0,0,0);
>> >
>> > #draw on your GD thingy
>> > $gd->line(10,10,$panel->width -10,10,$black);
>> > $gd->string(gdSmallFont,20,10,'test' ,'$black);
>> >
>> > # print it as normal
>> > print $panel->png;
>> >
>> >
>> >
>> >
>> >> -----Original Message-----
>> >> From: bioperl-l-bounces at lists.open-bio.org
>> > [mailto:bioperl-l-bounces at lists.open-
>> >> bio.org] On Behalf Of sergei ryazansky
>> >> Sent: Monday, 28 April 2008 2:05 a.m.
>> >> To: bioperl-l at bioperl.org
>> >> Subject: [Bioperl-l] addition of GD::graphic object to
> Bio::Graphics
>> >>
>> >> Hi all,
>> >>
>> >> is it possible to add a GD::graphic object (chart) to Bio::Graphics
>> > panel
>> >> to obtain a file with image of both the chart and bioseq object?
>> >>
>> >> _______________________________________________
>> >> Bioperl-l mailing list
>> >> Bioperl-l at lists.open-bio.org
>> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> > =
>> >
>> =============================================================
>> =========
>> > Attention: The information contained in this message and/or
>> > attachments
>> > from AgResearch Limited is intended only for the persons or entities
>> > to which it is addressed and may contain confidential and/or
>> > privileged
>> > material. Any review, retransmission, dissemination or other use of,
>> > or
>> > taking of any action in reliance upon, this information by persons
> or
>> > entities other than the intended recipients is prohibited by
>> > AgResearch
>> > Limited. If you have received this message in error, please notify
> the
>> > sender immediately.
>> > =
>> >
>> =============================================================
>> =========
>> >
>> > _______________________________________________
>> > Bioperl-l mailing list
>> > Bioperl-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>
> =======================================================================
> Attention: The information contained in this message and/or attachments
> from AgResearch Limited is intended only for the persons or entities
> to which it is addressed and may contain confidential and/or privileged
> material. Any review, retransmission, dissemination or other use of, or
> taking of any action in reliance upon, this information by persons or
> entities other than the intended recipients is prohibited by AgResearch
> Limited. If you have received this message in error, please notify the
> sender immediately.
> =======================================================================




From d.gatherer at mrcvu.gla.ac.uk  Tue Apr 29 08:21:05 2008
From: d.gatherer at mrcvu.gla.ac.uk (Derek Gatherer)
Date: Tue, 29 Apr 2008 13:21:05 +0100
Subject: [Bioperl-l] translate() oddities
Message-ID: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>

Hi

I thought I'd better run this by the community before I embarrass 
myself on Bugzilla.  It seems like a clear bug to me.  I'm running 
Bioperl 1.5.0 on RedHat.

For a test input:

 >test
ATGATGATGATGATGTGA

the following code is fine.

while((my $seqobj = $seq_in->next_seq()))
{
     print "\n".$seqobj->display_id;
     my $len  = $seqobj->length();
     print " length: $len";
     my $frame1_obj = $seqobj->translate();
     my $f1_prot = $frame1_obj->seq();
     print "\n$f1_prot";
}

Output:

test length: 18
MMMMM*

But if I want to change the frame as specified in the BioPerl 
tutorial, by using:

my $frame1_obj = $seqobj->translate(frame => 1); # which should now 
give frame 2, I get:

test length: 18
MMMMM-frame

The frame is unchanged and the text "-frame" is tacked on the end of 
the output.  The same occurs with translate(frame => 2).

Any ideas?  Can something as fundamental as translate() really be 
bugged?  or am I guilty of some particularly heinous syntax error?

Cheers
Derek


From tristan.lefebure at gmail.com  Tue Apr 29 09:58:21 2008
From: tristan.lefebure at gmail.com (Tristan Lefebure)
Date: Tue, 29 Apr 2008 09:58:21 -0400
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
Message-ID: <200804290958.21548.tristan.lefebure@gmail.com>

Aren't you forgetting the dash?

my $frame1_obj = $seqobj->translate(-frame => 1)


On Tuesday 29 April 2008 08:21:05 Derek Gatherer wrote:
> my $frame1_obj = $seqobj->translate(frame => 1)


-Tristan


From d.gatherer at mrcvu.gla.ac.uk  Tue Apr 29 10:05:03 2008
From: d.gatherer at mrcvu.gla.ac.uk (Derek Gatherer)
Date: Tue, 29 Apr 2008 15:05:03 +0100
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <481726BF.1060609@bms.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
	<481726BF.1060609@bms.com>
Message-ID: <E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>

Thanks Stefan

Actually, there was a typo in my message, I did use -frame => 
1.  However, the problem disappears on upgrading from 1.5.0 to 1.5.2.

So not a bug anymore.

Cheers
Derek

At 14:46 29/04/2008, Stefan Kirov wrote:
>my $frame1_obj = $seqobj->translate(-frame => 1);
>not
>my $frame1_obj = $seqobj->translate(frame => 1);
>Stefan


From l.douchy at gmail.com  Tue Apr 29 10:16:40 2008
From: l.douchy at gmail.com (Laurent DOUCHY)
Date: Tue, 29 Apr 2008 16:16:40 +0200
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <200804290958.21548.tristan.lefebure@gmail.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
	<200804290958.21548.tristan.lefebure@gmail.com>
Message-ID: <2fb209dd0804290716x36e403dek55978dc4f54e34ff@mail.gmail.com>

Hello,

I resolved this issue in Bio::SeqIO with the following line:

my $sequence = $seq->translate('*', 'X', '0', '1', '0', '0', '0', '0')->seq;
The third parameter sets the frame.

I hope this helps.

laurent.

On Tue, Apr 29, 2008 at 3:58 PM, Tristan Lefebure <
tristan.lefebure at gmail.com> wrote:

> Aren't you forgetting the dash?
>
> my $frame1_obj = $seqobj->translate(-frame => 1)
>
>
> On Tuesday 29 April 2008 08:21:05 Derek Gatherer wrote:
> > my $frame1_obj = $seqobj->translate(frame => 1)
>
>
>
> -Tristan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From roy.chaudhuri at gmail.com  Tue Apr 29 10:27:10 2008
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 29 Apr 2008 15:27:10 +0100
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>	<481726BF.1060609@bms.com>
	<E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>
Message-ID: <4817303E.1040903@gmail.com>

Spent two minutes looking at this, so may as well chip in with what I 
discovered even though you solved your problem.

This "bug" comes about because in version 1.5.1 and earlier, the 
arguments to translate were a simple list, with the first argument the 
terminator (defaults to "*"). Your old version therefore assumed that 
you wanted to translate the stop codon to "-frame". Amusingly given your 
typo, if you miss the hyphen off the frame argument in version 1.5.2 it 
reverts to the old interface and you end up with the output 
"MMMMMframe". The moral of the story is of course to read the docs 
relevant to the version you are using.
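
To make the mechanism concrete, here is a minimal sketch in plain Perl. It
does not use BioPerl and is not BioPerl's actual code: parse_translate_args
is a hypothetical stand-in, and the positional order (terminator, unknown,
frame) is taken from the calls shown in this thread. It only illustrates how
a leading dash can switch between the named and the positional reading of
the same argument list.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the argument handling described above.
# Assumption: a first argument starting with "-" selects named parameters;
# anything else is read as the old positional list.
sub parse_translate_args {
    my @args = @_;
    my %opt = (terminator => '*', unknown => 'X', frame => 0);
    if (@args && defined $args[0] && $args[0] =~ /^-/) {
        # Named style: -frame => 1, -terminator => '*', ...
        my %named = @args;
        for my $key (keys %named) {
            (my $name = $key) =~ s/^-//;
            $opt{$name} = $named{$key};
        }
    }
    else {
        # Positional style: (terminator, unknown, frame)
        my @names = qw(terminator unknown frame);
        for my $i (0 .. $#names) {
            $opt{ $names[$i] } = $args[$i] if defined $args[$i];
        }
    }
    return %opt;
}

my %named = parse_translate_args(-frame => 1);
my %bare  = parse_translate_args(frame => 1);
print "named: frame=$named{frame} terminator=$named{terminator}\n";
print "bare:  frame=$bare{frame} terminator=$bare{terminator}\n";
```

With the dash, the sketch reports frame=1 and the default "*" terminator.
Without it, "frame" lands in the terminator slot and the frame stays 0,
which matches the "MMMMMframe" output described above.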

Roy.
--
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.

Derek Gatherer wrote:
> Thanks Stefan
> 
> Actually, there was a typo in my message, I did use -frame => 
> 1.  However, the problem disappears on upgrading from 1.5.0 to 1.5.2.
> 
> So not a bug anymore.
> 
> Cheers
> Derek


From stefan.kirov at bms.com  Tue Apr 29 09:46:39 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Tue, 29 Apr 2008 09:46:39 -0400
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
Message-ID: <481726BF.1060609@bms.com>

my $frame1_obj = $seqobj->translate(-frame => 1);
not
my $frame1_obj = $seqobj->translate(frame => 1);
Stefan



From cjfields at uiuc.edu  Tue Apr 29 11:00:00 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 29 Apr 2008 10:00:00 -0500
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <4817303E.1040903@gmail.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>	<481726BF.1060609@bms.com>
	<E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>
	<4817303E.1040903@gmail.com>
Message-ID: <36045A08-AEA8-4639-A384-1DC53B5DC129@uiuc.edu>

Yes the interface changed somewhat post 1.5.1, mainly to accept named  
parameters.  I think a few methods do this now as passing in lists of  
more than 2 args, undef'ing those one doesn't want set, gets confusing.

chris

On Apr 29, 2008, at 9:27 AM, Roy Chaudhuri wrote:

> Spent two minutes looking at this, so may as well chip in with what  
> I discovered even though you solved your problem.
>
> This "bug" comes about because in version 1.5.1 and earlier, the  
> arguments to translate were a simple list, with the first argument  
> the terminator (defaults to "*"). Your old version therefore assumed  
> that you wanted to translate the stop codon to "-frame". Amusingly  
> given your typo, if you miss the hyphen off the frame argument in  
> version 1.5.2 it reverts to the old interface and you end up with  
> the output "MMMMMframe". The moral of the story is of course to read  
> the docs relevant to the version you are using.
>
> Roy.
> --
> Dr. Roy Chaudhuri
> Department of Veterinary Medicine
> University of Cambridge, U.K.

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Apr 29 11:07:30 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 29 Apr 2008 10:07:30 -0500
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <481726BF.1060609@bms.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
	<481726BF.1060609@bms.com>
Message-ID: <18DB95FB-52B9-4091-ACEE-996891F8A5AE@uiuc.edu>

As an aside, I've been playing around with perl6 (Rakudo) for a bit  
now.  Parameter-like passing (using autoaccessors and other means)  
will be added in soon, so you will be able to do this:

$seqobj = Seq.new(seq => 'ATGATGATGATGATGTGA', alphabet => 'dna');
my $protobj = $seqobj.translate(frame => 1);

Yes, I'm a geek. ; >

chris

On Apr 29, 2008, at 8:46 AM, Stefan Kirov wrote:

> my $frame1_obj = $seqobj->translate(-frame => 1);
> not
> my $frame1_obj = $seqobj->translate(frame => 1);
> Stefan

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From dr.hogart at gmail.com  Tue Apr 29 11:57:51 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Tue, 29 Apr 2008 19:57:51 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
Message-ID: <op.uadqmpg8avnppr@hogart.img.ras.ru>

Hi all!

I am trying to run a TCoffee alignment through the
Bio::Tools::Run::Alignment::TCoffee wrapper, as a subroutine within a script.
This subroutine works fine, but it is not the only subroutine: there are a
lot of others in the script. The problem is that when the script finishes
executing the tcoffee subroutine (NB: successfully), compilation of the rest
of the script is also interrupted. It seems that the tcoffee program itself
interrupts the Perl compilation. Is there a way around this problem?

-- 


From darin.london at duke.edu  Tue Apr 29 12:49:53 2008
From: darin.london at duke.edu (darin.london at duke.edu)
Date: Tue, 29 Apr 2008 12:49:53 -0400
Subject: [Bioperl-l] BOSC 2008 Announcement and Call For Submissions
Message-ID: <200804291650.m3TGnr0H020814@tenero.duhs.duke.edu>


BOSC 2008 Call for Abstracts Reminder

The 9th annual Bioinformatics Open Source Conference (BOSC 2008) will take place in Toronto, Ontario, Canada, as one of several Special Interest Group (SIG) meetings occurring in conjunction with the 16th annual Intelligent Systems for Molecular Biology Conference (ISMB 2008).

This is a reminder to submit your proposals for talks to the BOSC submission system before May 11.

Submission Process:
All abstracts must be submitted through our Open Conference Systems site (http://events.open-bio.org/BOSC2008/openconf.php).
The form will ask for a small Abstract Text to be pasted into it, and a full paper.  The small Abstract text should be a summary, while the longer abstract should provide more details, including the open-source license requirement details.
Full-length abstracts are limited to one page with one inch (2.5 cm) margins on the top, sides, and bottom.  The full-length abstract should include the title, authors, and affiliations.  We prefer your abstract to be in PDF format, although plain t

Important Dates:
May 11: Abstract submission deadline.
June 2: Notification of accepted talks.
June 4: Early registration discount cut-off.
July 18-19: BOSC 2008!

We hope to see you at BOSC 2008!

Kam Dahlquist and Darin London
BOSC 2008 Co-organizers

			 
From bix at sendu.me.uk  Tue Apr 29 12:54:41 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 29 Apr 2008 17:54:41 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uadqmpg8avnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
Message-ID: <481752D1.7010904@sendu.me.uk>

sergei ryazansky wrote:
> I am trying to run a TCoffee alignment through the
> Bio::Tools::Run::Alignment::TCoffee wrapper, as a subroutine within a
> script. This subroutine works fine, but it is not the only subroutine:
> there are a lot of others in the script. The problem is that when the
> script finishes executing the tcoffee subroutine (NB: successfully),
> compilation of the rest of the script is also interrupted. It seems that
> the tcoffee program itself interrupts the Perl compilation. Is there a
> way around this problem?

You'll have to supply us with a minimal version of the script and the 
complete error message.


From dr.hogart at gmail.com  Wed Apr 30 07:24:35 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 15:24:35 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
Message-ID: <op.uae8m9tzavnppr@hogart.img.ras.ru>


My subroutine is as follows:

sub align {
	my $file=shift @_;
	my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' => 'fasta',  
'outfile' => 'temp_align.out');
	my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
	my $aln=$factory->align ($file);
	open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
	return @temp_file;
}

This subroutine is called by the following command:

my @align_fa = align($inputfile_align);

After this subroutine runs successfully (accompanied by the corresponding
messages in the terminal window), execution of the remainder of the script
terminates without any error messages.

-- 


From bix at sendu.me.uk  Wed Apr 30 08:47:17 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 30 Apr 2008 13:47:17 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uae8m9tzavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
Message-ID: <48186A55.4030406@sendu.me.uk>

sergei ryazansky wrote:
> My subroutine is as follows:
> 
> sub align {
>     my $file=shift @_;
>     my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' => 
> 'fasta', 'outfile' => 'temp_align.out');
>     my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
>     my $aln=$factory->align ($file);
>     open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
>     return @temp_file;
> }
> 
> This subroutine is called by the following command:
> 
> my @align_fa = align($inputfile_align);
> 
> After this subroutine runs successfully (accompanied by the
> corresponding messages in the terminal window), execution of the
> remainder of the script terminates without any error messages.

The problem lies somewhere within the rest of your script, so we have to 
see it if you want help.

Why are you using Bio::Tools::Run::Alignment::TCoffee at all if you 
don't make use of the resulting alignment object? A system call might 
make more sense given what you're doing. The beauty of 
Bio::Tools::Run::Alignment::TCoffee is that you don't have to parse the 
result file (temp_align.out) yourself.


From dr.hogart at gmail.com  Wed Apr 30 09:36:58 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 17:36:58 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
Message-ID: <op.uaferwytavnppr@hogart.img.ras.ru>

The rest of the script is, IMHO, OK, because without this sub it works fine.
Maybe the problem lies in TCoffee itself?

One of the features of the script is to estimate the number of nt changes at
each position across a set of similar sequences, compared with the consensus
sequence. To do this it is necessary to obtain a multiple alignment: the
result of the TCoffee alignment goes to another subroutine, which estimates
the level of changes. Of course, I don't think this is the best approach;
most probably there are many better ways to do it. But for my present
purposes it is OK.
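
As an illustration of the counting step described above, here is a minimal
sketch in plain Perl. It is not the actual script: changes_per_column is a
hypothetical name, the input is assumed to be already-aligned strings of
equal length, gap characters are treated like any other residue, and the
consensus is assumed to be a simple per-column majority.

```perl
use strict;
use warnings;

# For each alignment column, find the most common residue and count how
# many sequences differ from it. Assumes all strings have equal length.
sub changes_per_column {
    my @aligned = @_;
    my $len = length $aligned[0];
    my (@consensus, @changes);
    for my $col (0 .. $len - 1) {
        my %count;
        $count{ substr($_, $col, 1) }++ for @aligned;
        # Highest count first; ties are broken alphabetically.
        my ($top) = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
        push @consensus, $top;
        push @changes, scalar(@aligned) - $count{$top};
    }
    return (join('', @consensus), \@changes);
}

my ($consensus, $changes) = changes_per_column(qw(ATGATG ATGTTG ACGATG));
print "$consensus\n";             # consensus sequence
print join(' ', @$changes), "\n"; # changes at each position
```

For the three toy sequences this prints ATGATG and then 0 1 0 1 0 0, i.e.
one sequence differs from the consensus at positions 2 and 4.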

-- 


From avilella at gmail.com  Wed Apr 30 10:16:56 2008
From: avilella at gmail.com (Albert Vilella)
Date: Wed, 30 Apr 2008 15:16:56 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uaferwytavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru> <48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
Message-ID: <358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>

Hi Sergei,

Can you try to isolate this call with a simpler example to see if it still
fails? When you say that the problems are in the compilation, do you mean
that the interpreter won't even compile or that it fails during execution?
Have you checked that you have all the dependencies right?

Cheers,

    Albert.



From bix at sendu.me.uk  Wed Apr 30 10:22:01 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 30 Apr 2008 15:22:01 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uaferwytavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>	<op.uae8m9tzavnppr@hogart.img.ras.ru>	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
Message-ID: <48188089.8000300@sendu.me.uk>

sergei ryazansky wrote:
>> The problem lies somewhere within the rest of your script, so we have 
>> to see it if you want help.
> 
> The rest of the script is, IMHO, OK, because without this sub it works
> fine. Maybe the problem lies in TCoffee itself?

I've run your subroutine in a simple script of my own and it doesn't 
cause script termination. Again, the problem lies elsewhere in your 
script. Supply it or it is impossible for anyone to help you.


From Sebastien.Moretti at unil.ch  Wed Apr 30 10:06:28 2008
From: Sebastien.Moretti at unil.ch (Sebastien MORETTI)
Date: Wed, 30 Apr 2008 16:06:28 +0200
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uaferwytavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>	<op.uae8m9tzavnppr@hogart.img.ras.ru>	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
Message-ID: <48187CE4.8030606@unil.ch>

Have you tried running the tcoffee command that bioperl calls directly 
on the command line?
That would show whether the problem lies in tcoffee itself or in the 
tcoffee release that bioperl has to use.

-- 
Sébastien Moretti


From dr.hogart at gmail.com  Wed Apr 30 10:54:59 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 18:54:59 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
Message-ID: <op.uafidxitavnppr@hogart.img.ras.ru>

Hi Albert,

The isolated call runs without any problem, so the code itself is  
correct. The problem arises when this sub is executed within the whole  
script: after the TCoffee alignment completes successfully, the rest of  
the script is terminated. The whole code is very big (~500 lines), so  
for simplicity let's imagine the scheme of the script as follows:
sub1;
sub2;
sub3;
sub align;  # TCoffee alignment;
sub4;
sub5;

Each sub (subroutine) is independent of the other subs. The order of  
script execution is 1,2,3,align,4,5. But after align has run, the  
remaining subs (4 and 5) are not executed. Without sub align {}, the  
script successfully executes sub 4 and sub 5. So what I mean is that the  
interpreter won't compile sub 4 and 5 if sub align is placed before them.

On Wed, 30 Apr 2008 18:16:56 +0400, Albert Vilella <avilella at gmail.com>  
wrote:

> Hi Sergei,
>
> Can you try to isolate this call with a simpler example to see if it  
> still
> fails? When you say that the problems are in the compilation, do you mean
> that the interpreter won't even compile or that it fails during  
> execution?
> Have you checked that you have all the dependencies right?
>
> Cheers,
>
>     Albert.
>
> On Wed, Apr 30, 2008 at 2:36 PM, sergei ryazansky <dr.hogart at gmail.com>
> wrote:
>
>> On Wed, 30 Apr 2008 16:47:17 +0400, Sendu Bala <bix at sendu.me.uk> wrote:
>>
>>  sergei ryazansky wrote:
>> >
>> > > My subroutine is following:
>> > >  sub align {
>> > >    my $file=shift @_;
>> > >    my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' =>
>> > > 'fasta', 'outfile' => 'temp_align.out');
>> > >    my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
>> > >    my $aln=$factory->align ($file);
>> > >    open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
>> > >    return @temp_file;
>> > > }
>> > >  This subroutine is called by the following command:
>> > >  my @align_fa = align($inputfile_align);
>> > >  After successful execution of this subroutine (accompaning with the
>> > > corresponding messages on the terminal window) the execution of  
>> remainder
>> > > script is terminated without any error messages.
>> > >
>> >
>> > The problem lies somewhere within the rest of your script, so we have  
>> to
>> > see it if you want help.
>> >
>> > Why are you using Bio::Tools::Run::Alignment::TCoffee at all if you
>> > don't make use of the resulting alignment object? A system call might  
>> make
>> > more sense given what you're doing. The beauty of
>> > Bio::Tools::Run::Alignment::TCoffee is that you don't have to parse  
>> the
>> > result file (temp_align.out) yourself.
>> >
>>
>> The rest of script,imho, is ok, because without this sub it is work  
>> fine.
>> May be problem lies into the TCoffee itself?
>>
>> One of the feature of script is to estimate the quantity of nt changes  
>> in
>> each position in the different similar sequences in comparing with  
>> consensus
>> sequences. To perform this it is nesseccary to obtain the multiply
>> alignment: the result of TCoffee alignment goes to another subroutine,  
>> that
>> estemated the level of changes. Of course, I dont think that this way  
>> is the
>> best approach, most probably there are a lot of the better ways to do  
>> it.
>> But for my today purposes it is ok.
>>
>> --
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>




From dr.hogart at gmail.com  Wed Apr 30 11:14:09 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 19:14:09 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru> <48187CE4.8030606@unil.ch>
Message-ID: <op.uafi7ytravnppr@hogart.img.ras.ru>

No, I haven't tried that.
To tell the truth, I ran into a problem like this earlier. I simply  
wanted to align several sets of sequences with the TCoffee Bioperl  
package. The script was supposed to pass the sets to the TCoffee wrapper  
one after another, but after the first set was aligned, the alignment of  
the remaining sets was terminated. So I had to use another  
"super_script" that called the first script with different arguments,  
one for each set.
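
A minimal sketch of what such a "super_script" could look like, assuming
the per-set script takes the input file as its only argument. The script
and file names in the usage comment are hypothetical, not taken from the
actual code:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run the worker script once per input file, each time in a fresh perl
# process, and return the files for which the worker exited non-zero.
sub run_sets {
    my ($script, @files) = @_;
    my @failed;
    for my $file (@files) {
        # List form of system() bypasses the shell; $^X is the running perl.
        my $status = system($^X, $script, $file);
        push @failed, $file if $status != 0;
    }
    return @failed;
}

# Usage (hypothetical names):
#   perl super_script.pl align_one.pl set1.fa set2.fa set3.fa
if (@ARGV) {
    my ($worker, @sets) = @ARGV;
    my @failed = run_sets($worker, @sets);
    print "failed: @failed\n" if @failed;
}
```

Because every set runs in its own process, a silent exit while aligning
one set cannot stop the remaining sets, and the exit status shows which
set failed.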


> Do you have tried to use the tcoffee command, called via bioperl, as a  
> command line ?


-- 


From bix at sendu.me.uk  Wed Apr 30 11:28:50 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 30 Apr 2008 16:28:50 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uafidxitavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>	<op.uae8m9tzavnppr@hogart.img.ras.ru>	<48186A55.4030406@sendu.me.uk>	<op.uaferwytavnppr@hogart.img.ras.ru>	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru>
Message-ID: <48189032.20102@sendu.me.uk>

sergei ryazansky wrote:
> Hi Albert,
> 
> The isolated call is executed without any problem, so the code is 
> absolutely correct. The problem arise when this sub executed within the 
> whole script - after successful execution of TCoffee alignment the 
> execution of the rest of script is terminated. The whole code is very 
> big (~500 lines), so for simplicity lets imagine the sheme of script in 
> the following view:
> sub1;
> sub2;
> sub3;
> sub align;  # TCoffe alignment;
> sub4;
> sub5;
> 
> Each sub (subroutine) is independent from the others subs; The order of 
> script execution is 1,2,3,align,4,5. But after the execution of align 
> the execution of the rest of subs (4 and 5) is terminated. The script 
> without sub align {} successfully execute the sub 4 and sub 5. So, I 
> mean that interpreter won't compile sub 4 and 5 if sub align is placed 
> before them.

This has nothing to do with interpreter compilation, which is successful 
if the script runs at all.

What do you do with the output of &align? The thing you are doing with 
that output is most likely the cause of your script terminating, which 
is why &sub4 and &sub5 run when you don't run &align (have no output 
that causes the problem).

If you're not willing to show us your script, here are some simple 
debugging steps you can do yourself:

# don't do anything with the output of align() - does &sub4 still run?

# add some print statements after you call align(), and then after every 
further block of code in your script to see exactly where the script 
terminates

# reduce your script down to a minimal script that shows the problem 
(with the help of the previous step) and show us that
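
The print-statement step above can be sketched roughly as follows. This
is only an illustration: align() here is a stand-in that reads a file
(the sub quoted earlier does not check the return value of open), and
checkpoint() and the file name are made-up names for the example:

```perl
use strict;
use warnings;

$| = 1;    # unbuffered STDOUT, so checkpoints appear even if the script dies

# Hypothetical stand-in for the real align() sub; it dies if the file
# cannot be opened instead of silently returning nothing.
sub align {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "cannot open $file: $!\n";
    my @lines = <$fh>;
    close $fh;
    return @lines;
}

# Run one step, print a checkpoint before and after it, and report any
# exception instead of letting it end the script.
sub checkpoint {
    my ($label, $code) = @_;
    print "before $label\n";
    my @result;
    my $ok = eval { @result = $code->(); 1 };
    unless ($ok) {
        print "$label failed: $@";
        return;
    }
    print "after $label\n";
    return @result;
}

my @align_fa = checkpoint('align', sub { align('no_such_file.out') });
checkpoint('sub4', sub { print "sub4 ran\n"; 1 });
```

Note that eval only catches die(); if some code calls exit(), the last
checkpoint printed still shows how far the script got.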


From dr.hogart at gmail.com  Wed Apr 30 11:42:41 2008
From: dr.hogart at gmail.com (Sergei Ryazansky)
Date: Wed, 30 Apr 2008 19:42:41 +0400
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafkhojw9ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
Message-ID: <op.uafklfmd9ju7si@hogart.img.ras.ru>


------- Forwarded message -------
From: "Sergei Ryazansky" <dr.hogart at gmail.com>
To: "Sendu Bala" <bix at sendu.me.uk>
Cc:
Subject: Re: [Bioperl-l] alignment by TCoffee as a subroutine
Date: Wed, 30 Apr 2008 19:40:26 +0400

> What do you do with the output of &align? The thing you are doing with  
> that output is most likely the cause of your script terminating, which  
> is why &sub4 and &sub5 run when you don't run &align (have no output  
> that causes the problem).

Please see my answer to Sebastien Moretti; it describes another similar
problem. The only thing I did with the output there was print it to a
file. Nevertheless, the problem was the same.

> # don't do anything with the output of align() - does &sub4 still run?

Please see above.

> # add some print statements after you call align(), and then after every  
> further block of code in your script to see exactly where the script  
> terminates
> # reduce your script down to a minimal script that shows the problem  
> (with the help of the previous step) and show us that

All tests with individual blocks were performed earlier. The results are OK.


From cjfields at uiuc.edu  Wed Apr 30 12:25:06 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 30 Apr 2008 11:25:06 -0500
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafklfmd9ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
Message-ID: <5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>

Sergei,

I agree with Sendu; we can't diagnose this unless we either have the  
entire script or a minimal version of it demonstrating the bug.

The best way to handle this is to file a bug report, attaching  
relevant data using the 'Create a new attachment' link (including  
either the full script or a shortened one which demonstrates the bug).  
Otherwise we're just shooting in the dark trying to diagnose the  
problem.

http://bugzilla.open-bio.org/

chris

On Apr 30, 2008, at 10:42 AM, Sergei Ryazansky wrote:

>
>
> ------- Forwarded message -------
> From: "Sergei Ryazansky" <dr.hogart at gmail.com>
> To: "Sendu Bala" <bix at sendu.me.uk>
> Cc:
> Subject: Re: [Bioperl-l] alignment by TCoffee as a subroutine
> Date: Wed, 30 Apr 2008 19:40:26 +0400
>
>> What do you do with the output of &align? The thing you are doing  
>> with that output is most likely the cause of your script  
>> terminating, which is why &sub4 and &sub5 run when you don't run  
>> &align (have no output that causes the problem).
>
> please sea my answer to Sebastien Moretti - there are description of
> another similar problem. The only thing that I did there with output  
> is
> printing to file. Nevetheless the problem was the same.
>
>> # don't do anything with the output of align() - does &sub4 still  
>> run?
>
> please sea above.
>
>> # add some print statements after you call align(), and then after  
>> every further block of code in your script to see exactly where the  
>> script terminates
>> # reduce your script down to a minimal script that shows the  
>> problem (with the help of the previous step) and show us that
>
> all tests with individual bloks was performed earlier. the results  
> is ok.
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From dr.hogart at gmail.com  Wed Apr 30 12:40:19 2008
From: dr.hogart at gmail.com (Sergei Ryazansky)
Date: Wed, 30 Apr 2008 20:40:19 +0400
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
Message-ID: <op.uafm9hl79ju7si@hogart.img.ras.ru>

On Wed, 30 Apr 2008 20:25:06 +0400, Chris Fields <cjfields at uiuc.edu> wrote:

Chris, I have already sent the file to Sendu, and I am also attaching it  
here. I have removed the really unnecessary parts from it.

> Sergei,
>
> I agree with Sendu; we can't diagnose this unless we either have the  
> entire script of a minimal version of it demonstrating the bug.
>
> The best way to handle this is to file a bug report, attaching relevant  
> data using the 'Create a new attachment' link (including either the full  
> script or a shortened one which demonstrates the bug). Otherwise we're  
> just shooting in the dark trying to diagnose the problem.
>
> http://bugzilla.open-bio.org/
>
> chris
-------------- next part --------------
A non-text attachment was scrubbed...
Name: script.pl
Type: application/octet-stream
Size: 6870 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20080430/6aef0fde/attachment-0002.obj>

From cjfields at uiuc.edu  Wed Apr 30 13:02:19 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 30 Apr 2008 12:02:19 -0500
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafm9hl79ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
	<op.uafm9hl79ju7si@hogart.img.ras.ru>
Message-ID: <EBC881E4-8F1A-4396-8EC9-1FB17681F5D2@uiuc.edu>

Hmm, maybe you were confused?  From my last email:

"The best way to handle this is to file a bug report, attaching  
relevant data using the 'Create a new attachment' link (including  
either the full script or a shortened one which demonstrates the bug).  
Otherwise we're just shooting in the dark trying to diagnose the  
problem."

http://bugzilla.open-bio.org/

Anyone can work on fixing the issue there (so it'll probably get fixed  
faster).  The devs can also track progress on the problem via the dev  
mail list (bioperl-guts).  Diagnosing the bug may also reveal issues  
not just with Bio::Tools::Run::Alignment::TCoffee but also with other  
related modules.

If needed I can post it to bugzilla, but it helps to submit the bug  
yourself (so you can receive posts on its progress).

chris

On Apr 30, 2008, at 11:40 AM, Sergei Ryazansky wrote:

> On Wed, 30 Apr 2008 20:25:06 +0400, Chris Fields <cjfields at uiuc.edu>  
> wrote:
>
> Chris, I have already sent file to Sendu and also I am attaching it  
> here. I have removed from it really unnecessary parts.
>
>> Sergei,
>>
>> I agree with Sendu; we can't diagnose this unless we either have  
>> the entire script of a minimal version of it demonstrating the bug.
>>
>> The best way to handle this is to file a bug report, attaching  
>> relevant data using the 'Create a new attachment' link (including  
>> either the full script or a shortened one which demonstrates the  
>> bug). Otherwise we're just shooting in the dark trying to diagnose  
>> the problem.
>>
>> http://bugzilla.open-bio.org/
>>
>> chris


From dr.hogart at gmail.com  Wed Apr 30 13:39:35 2008
From: dr.hogart at gmail.com (Sergei Ryazansky)
Date: Wed, 30 Apr 2008 21:39:35 +0400
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafop6079ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
	<op.uafm9hl79ju7si@hogart.img.ras.ru>
	<EBC881E4-8F1A-4396-8EC9-1FB17681F5D2@uiuc.edu>
	<op.uafop6079ju7si@hogart.img.ras.ru>
Message-ID: <op.uafpz9n79ju7si@hogart.img.ras.ru>

On Wed, 30 Apr 2008 21:11:56 +0400, Sergei Ryazansky <dr.hogart at gmail.com>  
wrote:

> Oh, sorry, you right - I too fast read you message. I do it slight later.
>
>> Hmm, maybe you were confused?  From my last email:
>>
>> "The best way to handle this is to file a bug report, attaching  
>> relevant data using the 'Create a new attachment' link (including  
>> either the full script or a shortened one which demonstrates the bug).  
>> Otherwise we're just shooting in the dark trying to diagnose the  
>> problem."
>>
>> http://bugzilla.open-bio.org/
>>
>> Anyone can work on fixing the issue there (so it'll probably get fixed  
>> faster).  The devs can also track progress on the problem via the dev  
>> mail list (bioperl-guts).  Diagnosing the bug may also reveal issues  
>> not just with Bio::Tools::Run::Alignment::TCoffee but also with other  
>> related modules.
>>
>> If needed I can post it to bugzilla, but it helps to submit the bug  
>> yourself (so you can receive posts on it's progress).
>>
>> chris
>>
>> On Apr 30, 2008, at 11:40 AM, Sergei Ryazansky wrote:
>>
>>> On Wed, 30 Apr 2008 20:25:06 +0400, Chris Fields <cjfields at uiuc.edu>  
>>> wrote:
>>>
>>> Chris, I have already sent file to Sendu and also I am attaching it  
>>> here. I have removed from it really unnecessary parts.
>>>
>>>> Sergei,
>>>>
>>>> I agree with Sendu; we can't diagnose this unless we either have the  
>>>> entire script of a minimal version of it demonstrating the bug.
>>>>
>>>> The best way to handle this is to file a bug report, attaching  
>>>> relevant data using the 'Create a new attachment' link (including  
>>>> either the full script or a shortened one which demonstrates the  
>>>> bug). Otherwise we're just shooting in the dark trying to diagnose  
>>>> the problem.
>>>>
>>>> http://bugzilla.open-bio.org/
>>>>
>>>> chris
>


From cjfields at uiuc.edu  Wed Apr 30 14:29:28 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 30 Apr 2008 13:29:28 -0500
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafpz9n79ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
	<op.uafm9hl79ju7si@hogart.img.ras.ru>
	<EBC881E4-8F1A-4396-8EC9-1FB17681F5D2@uiuc.edu>
	<op.uafop6079ju7si@hogart.img.ras.ru>
	<op.uafpz9n79ju7si@hogart.img.ras.ru>
Message-ID: <39A139E4-6783-41E6-8EE9-1FE60CB57577@uiuc.edu>

Sorry, didn't catch that...

chris

On Apr 30, 2008, at 12:39 PM, Sergei Ryazansky wrote:

> On Wed, 30 Apr 2008 21:11:56 +0400, Sergei Ryazansky <dr.hogart at gmail.com 
> > wrote:
>
>> Oh, sorry, you right - I too fast read you message. I do it slight  
>> later.
>>
>>> Hmm, maybe you were confused?  From my last email:
>>>
>>> "The best way to handle this is to file a bug report, attaching  
>>> relevant data using the 'Create a new attachment' link (including  
>>> either the full script or a shortened one which demonstrates the  
>>> bug). Otherwise we're just shooting in the dark trying to diagnose  
>>> the problem."
>>>
>>> http://bugzilla.open-bio.org/
>>>
>>> Anyone can work on fixing the issue there (so it'll probably get  
>>> fixed faster).  The devs can also track progress on the problem  
>>> via the dev mail list (bioperl-guts).  Diagnosing the bug may also  
>>> reveal issues not just with Bio::Tools::Run::Alignment::TCoffee  
>>> but also with other related modules.
>>>
>>> If needed I can post it to bugzilla, but it helps to submit the  
>>> bug yourself (so you can receive posts on it's progress).
>>>
>>> chris
>>>
>>> On Apr 30, 2008, at 11:40 AM, Sergei Ryazansky wrote:
>>>
>>>> On Wed, 30 Apr 2008 20:25:06 +0400, Chris Fields  
>>>> <cjfields at uiuc.edu> wrote:
>>>>
>>>> Chris, I have already sent file to Sendu and also I am attaching  
>>>> it here. I have removed from it really unnecessary parts.
>>>>
>>>>> Sergei,
>>>>>
>>>>> I agree with Sendu; we can't diagnose this unless we either have  
>>>>> the entire script of a minimal version of it demonstrating the  
>>>>> bug.
>>>>>
>>>>> The best way to handle this is to file a bug report, attaching  
>>>>> relevant data using the 'Create a new attachment' link  
>>>>> (including either the full script or a shortened one which  
>>>>> demonstrates the bug). Otherwise we're just shooting in the dark  
>>>>> trying to diagnose the problem.
>>>>>
>>>>> http://bugzilla.open-bio.org/
>>>>>
>>>>> chris
>>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Bank.Beszteri at awi.de  Tue Apr  1 08:31:49 2008
From: Bank.Beszteri at awi.de (=?ISO-8859-1?Q?B=E1nk_Beszteri?=)
Date: Tue, 01 Apr 2008 14:31:49 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
Message-ID: <47F22B35.1030502@awi.de>

Dear list,

we have recently started to try to find a solution for indexing large 
sequence databases / flat files for a java project, and because we ran 
into problems using biojava, and because both the OBDA and BioSQL ways 
seem to be compatible across bio~ projects, we also started to 
experiment with bioperl. It looks like this should work fine, but we had 
a couple of problems here, too. Perhaps some of you can give me a hint 
as to what we are doing wrong!

The first thing we tried was to use Bio::DB::Flat for indexing a TrEMBL 
flat file (~ 12 GB); but it seems we haven't got a machine with enough 
memory to be able to handle this. (Perhaps you would be using the "bdb" 
style index in such a case in bioperl, but this apparently doesn't work 
with biojava, so we had to stick with "flat"). So next we started to 
test BioSQL, by trying to load just Swissprot in a MySQL DB first, like:

load_seqdatabase.pl --host mysql.awi.de --dbname biosql2 --dbuser xyz 
--dbpass abc --driver mysql --namespace uniprot_sprot --format swiss 
uniprot_sprot.dat

Here we get an error message

###########################################

Loading /biodb/spinkern/uniprot_sprot.dat ...
Could not store Q6DAH5:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: The supplied lineage does not start near 'Erwinia carotovora subsp. 
atroseptica' (I was supplied 'Erwinia carotovora subsp. | Pectobacterium 
| Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | 
Proteobacteria | Bacteria')
STACK: Error::throw
STACK: Bio::Root::Root::throw 
/biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Root/Root.pm:359
STACK: Bio::Species::classification 
/biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Species.pm:174
STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:552 

STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/SpeciesAdaptor.pm:281
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1305 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:973 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:852 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182 

STACK: Bio::DB::Persistent::PersistentObject::create 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:244 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251 

STACK: Bio::DB::Persistent::PersistentObject::store 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:271 

STACK: load_seqdatabase.pl:622
-----------------------------------------------------------

at load_seqdatabase.pl line 635

############################################

or similar, depending on whether we use a pre-loaded ncbi taxonomy or 
not, and which Swissprot release we are trying to load. It often seems 
to come from something like the case here, a subsp. or other special 
addition to the species line; but alternative genus names and other 
curious things also appear. It looks like Species.pm tries to validate 
the species name 
against the lineage info already there in the BioSQL DB, and in several 
cases, it finds inconsistencies. If we start with the ncbi taxonomy 
already loaded in the database, the first error comes much earlier.

I found a thread on the same problem from ~ two years ago 
(http://thread.gmane.org/gmane.comp.lang.perl.bio.general/13766/focus=13788), 
where the solution recommended was to update bioperl, so I was quite 
surprised to find the problem with the version you can see above 
(1.5.2_102 bioperl core, 1.5.2_100 bioperl_db). Can someone give me any 
hints as to what is going wrong here?

The only workaround we have found so far was to comment out line 174 in 
Species.pm:

$self->throw("The supplied lineage does not start near '$name' (I was 
supplied '".join(" | ", @vals)."')");

After doing so, load_seqdatabase.pl runs for several hours (until it 
eventually crashes; I haven't found out yet why), but proceeds really 
slowly. I also found some info on this for Pg and Oracle in the mailing 
list, but does anyone have approximate numbers for MySQL: how long 
should a first Swissprot load take?

Would be grateful to hear about your ideas / experiences on these issues!

Bank Beszteri


Bioinformatics / Scientific Computing
Alfred Wegener Institute for Polar and Marine Research
Am Handelshafen 12.
27570 Bremerhaven
Germany


From cjfields at uiuc.edu  Tue Apr  1 20:45:28 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 1 Apr 2008 19:45:28 -0500
Subject: [Bioperl-l] quick update on bioperl nightly builds
Message-ID: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>

I'm simplifying the nightly build archive names (removing svn revision  
# and date) in case anyone needs to update bioperl-live/run/db/network  
on a regular basis (read: GBrowse installations).  When I have time  
I'll start working on automated builds, which will require some extra  
work with Module::Build and Build.PL.

chris


From hiekeen at gmail.com  Tue Apr  1 22:14:07 2008
From: hiekeen at gmail.com (Jinyan Huang)
Date: Wed, 2 Apr 2008 10:14:07 +0800
Subject: [Bioperl-l] How to make a network graphic using my genes in
	pathways?
Message-ID: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>

I have 20 pathways that contain my genes of interest. Some genes
overlap between these pathways. How can I draw a network graph from
these genes, that is, connect the pathways through the overlapping
genes? What kind of software can I use?

Thank you very much in advance.

-- 
Best regards,
Jinyan Huang (ekeen)
School of Life Sciences and Technology, 1302 Room
Tongji University
Siping Road 1239, Shanghai 200092
P.R. China
Tel :0086-21-65981041
Msn: hiekeen at hotmail.com
eMail: hiekeen at gmail.com


From hlapp at gmx.net  Tue Apr  1 22:30:06 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 1 Apr 2008 22:30:06 -0400
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47F22B35.1030502@awi.de>
References: <47F22B35.1030502@awi.de>
Message-ID: <CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>


On Apr 1, 2008, at 8:31 AM, Bánk Beszteri wrote:
> [...] So next we started to test BioSQL, by trying to load just  
> Swissprot in a MySQL DB first, like:
>
> load_seqdatabase.pl --host mysql.awi.de --dbname biosql2 --dbuser  
> xyz --dbpass abc --driver mysql --namespace uniprot_sprot --format  
> swiss uniprot_sprot.dat
>
> Here we get an error message
>
> ###########################################
>
> Loading /biodb/spinkern/uniprot_sprot.dat ...
> Could not store Q6DAH5:
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: The supplied lineage does not start near 'Erwinia carotovora  
> subsp. atroseptica' (I was supplied 'Erwinia carotovora subsp. |  
> Pectobacterium | Enterobacteriaceae | Enterobacteriales |  
> Gammaproteobacteria | Proteobacteria | Bacteria')
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /biodb/spinkern/bioperl-1.5/ 
> bioperl-1.5.2_102/Bio/Root/Root.pm:359
> STACK: Bio::Species::classification /biodb/spinkern/bioperl-1.5/ 
> bioperl-1.5.2_102/Bio/Species.pm:174
> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm: 
> 552
> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/SpeciesAdaptor.pm:281
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object / 
> biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:1305
> STACK:  
> Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:973
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key / 
> biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:852
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:182
> STACK: Bio::DB::Persistent::PersistentObject::create /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm: 
> 244
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:169
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:251
> STACK: Bio::DB::Persistent::PersistentObject::store /biodb/spinkern/ 
> bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:271
> STACK: load_seqdatabase.pl:622
> -----------------------------------------------------------
>
> at load_seqdatabase.pl line 635
>
> ############################################
>
> or similar, depending on whether we use a pre-loaded ncbi taxonomy  
> or not

I recommend always using a pre-loaded NCBI taxonomy unless you know  
there are only a few organisms that are straightforward (for the  
parser, that is).

> , and which Swissprot release we are trying to load. It often seems  
> to come from something like, as here, a subsp. or other special  
> addition to the species line; but alternative genus names and other  
> curious things also appear. It looks like Species.pm tries to  
> validate the species name against the lineage info already there in  
> the BioSQL DB, and in several cases, it finds inconsistencies.

It actually happens upon a successful lookup when the species object  
is populated from the database.

> [...]
> The only workaround we have found so far was to comment out line  
> 174 in Species.pm:
>
> $self->throw("The supplied lineage does not start near '$name' (I  
> was supplied '".join(" | ", @vals)."')");

That should be OK if you work with a pre-loaded taxonomy. It's sort  
of a sanity check that should catch a parser having messed up a  
species. If you use a pre-loaded NCBI taxonomy the results of the  
species parsing don't matter in all details so long as the NCBI  
taxonID is parsed out correctly, and then found in the database.

Note that this is actually a warn() in the main trunk version of  
BioPerl, so you might want to upgrade to that (or change throw() to  
warn() in your version). You still get the records flagged with that,  
but it isn't an exception.

>
> After doing so, load_seqdatabase.pl runs for several hours (until  
> it eventually crashes; I haven't found out yet why), but proceeds  
> really slowly.

It should certainly *not* crash. Note also that you can supply --safe  
on the command line, in which case the script will continue with the  
next record if one fails to load for whatever reason.
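
Conceptually, continuing past a failed record amounts to wrapping each
store in an eval and moving on when it dies. The following is a rough
sketch of that pattern in plain Perl, not the script's actual code;
store_record, load_all_safely, the bad_lineage flag and the accessions
P12345 and P67890 are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for storing one record; dies on a bad lineage.
sub store_record {
    my ($rec) = @_;
    die "lineage mismatch\n" if $rec->{bad_lineage};
    return 1;
}

# Store every record, skipping (and reporting) the ones that fail.
# Returns the number stored and an arrayref of the accessions that failed.
sub load_all_safely {
    my (@records) = @_;
    my $stored = 0;
    my @failed;
    for my $rec (@records) {
        if (eval { store_record($rec) }) {
            $stored++;
        }
        else {
            warn "Could not store $rec->{acc}: $@";
            push @failed, $rec->{acc};
        }
    }
    return ($stored, \@failed);
}

my @records = (
    { acc => 'P12345' },
    { acc => 'Q6DAH5', bad_lineage => 1 },
    { acc => 'P67890' },
);
my ($stored, $failed) = load_all_safely(@records);
print "stored $stored, failed: @$failed\n";
```

Collecting the failed accessions makes it easy to retry just those
records once the cause is understood.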

You will want to adjust the width constraint of dbxref.accession, for  
example to 128 chars. This will also be fixed for BioSQL 1.0.1.
See http://bugzilla.open-bio.org/show_bug.cgi?id=2474


> I also found some info on this for Pg and Oracle in the mailing  
> list, but does anyone have approximate numbers for MySQL: how long  
> should a first Swissprot load take?

Possibly around 20 hours according to Erik Rijkers:
See http://lists.open-bio.org/pipermail/bioperl-l/2008-March/027427.html

You can use the --logchunks N option to have it print out performance  
statistics every N records.
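
To make the idea of per-chunk statistics concrete, here is a small
sketch in plain Perl of timing a loop in chunks of N records. It is an
illustration only and does not reproduce the script's code or its
output format; process_with_chunk_log and both callbacks are
hypothetical names.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Call $process on each record; after every $chunk records, call $report
# with the running count and the seconds spent on the chunk just finished.
# Returns the total number of records processed.
sub process_with_chunk_log {
    my ($records, $chunk, $process, $report) = @_;
    my $count       = 0;
    my $chunk_start = time;
    for my $rec (@$records) {
        $process->($rec);
        $count++;
        if ($count % $chunk == 0) {
            my $now = time;
            $report->($count, $now - $chunk_start);
            $chunk_start = $now;
        }
    }
    return $count;
}

process_with_chunk_log(
    [1 .. 10], 4,
    sub { },    # no-op stand-in for storing a record
    sub { printf "%d records, %.3f s for last chunk\n", @_ },
);
```

With 10 records and a chunk size of 4, the report callback fires after
records 4 and 8. A steadily growing time per chunk usually points at
the database rather than the parser.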

Hope this helps,

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Tue Apr  1 22:38:12 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 1 Apr 2008 22:38:12 -0400
Subject: [Bioperl-l] Very basic implementation of GenBank XML SeqIO
	module
In-Reply-To: <47F13C2C.4070909@umdnj.edu>
References: <47F13C2C.4070909@umdnj.edu>
Message-ID: <DBDEDED2-656B-4CFD-B603-C0868ED5DAD9@gmx.net>

Ryan - do you not have a committer account?

I do agree with Chris on the test. Modules w/o tests tend to become  
'pseudogenized.'

	-hilmar

On Mar 31, 2008, at 3:31 PM, Ryan Golhar wrote:
> I have a (very) basic SAX implementation of a SeqIO module to parse  
> GenBank XML records.  Right now, it only reads in basic information  
> regarding the sequence and the sequence itself.
>
> It does not yet parse the features table.  Should I submit it to be  
> included in bioperl or wait until I implement more for the features  
> table?  I'm not sure when I'll get around to it though
>
> Ryan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cain.cshl at gmail.com  Tue Apr  1 23:12:04 2008
From: cain.cshl at gmail.com (Scott Cain)
Date: Tue, 01 Apr 2008 23:12:04 -0400
Subject: [Bioperl-l] quick update on bioperl nightly builds
In-Reply-To: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
Message-ID: <1207105924.6184.4.camel@frissell>

Hi Chris,

The tarball is currently (Apr 1) being built in a tmp directory, so that
the extracted tarball is ./tmp/bioperl-live/.  Is that intended?

Thanks,
Scott

On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
> I'm simplifying the nightly build archive names (removing svn revision  
> # and date) in case anyone needs to update bioperl-live/run/db/network  
> on a regular basis (read: GBrowse installations).  When I have time  
> I'll start working on automated builds, which will require some extra  
> work with Module::Build and Build.PL.
> 
> chris
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cjfields at uiuc.edu  Tue Apr  1 23:59:30 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 1 Apr 2008 22:59:30 -0500
Subject: [Bioperl-l] quick update on bioperl nightly builds
In-Reply-To: <1207105924.6184.4.camel@frissell>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
Message-ID: <D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>

Nope, that isn't intended.  I fixed it and reran it manually, so it  
should be fine now (note I didn't update the log file; the next cron  
run will catch that).

I may toy around with your recent passthrough flag addition to try  
getting automated PPM's up and running.

chris

On Apr 1, 2008, at 10:12 PM, Scott Cain wrote:

> Hi Chris,
>
> The tarball is currently (Apr 1) being built in a tmp directory, so  
> that
> the extracted tarball is ./tmp/bioperl-live/.  Is that intended?
>
> Thanks,
> Scott


From sdavis2 at mail.nih.gov  Wed Apr  2 07:33:38 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 2 Apr 2008 07:33:38 -0400
Subject: [Bioperl-l] How to make a network graphic using my genes in
	pathways?
In-Reply-To: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
Message-ID: <264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>

On Tue, Apr 1, 2008 at 10:14 PM, Jinyan Huang <hiekeen at gmail.com> wrote:
> I have 20 pathways. My genes of interest are in these pathways, and
>  some genes overlap between pathways. How can I make a network
>  graphic using these genes, i.e. connect the pathways through
>  the overlapping genes? What kind of software can I use?

R/Bioconductor has tools for working with graphs and pathways.
Cytoscape is another open-source graphical solution.  Ingenuity is, of
course, not free.  If you are looking at a perl solution, you can look
at the various graph modules and their integration with the Graphviz
libraries.
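
As a rough illustration of the Perl route, the sketch below links
pathways that share genes and prints the result as Graphviz DOT text,
so it needs nothing beyond core Perl. The pathway and gene names are
invented, and shared_gene_edges and edges_to_dot are hypothetical
helper names, not part of any existing module.

```perl
use strict;
use warnings;

# Find the genes shared by every pair of pathways.
# Input:  hashref of pathway name => arrayref of gene names.
# Output: list of [pathway_1, pathway_2, [shared genes]], sorted by name.
sub shared_gene_edges {
    my ($pathways) = @_;
    my @names = sort keys %$pathways;
    my @edges;
    for my $i (0 .. $#names - 1) {
        my %in_first = map { $_ => 1 } @{ $pathways->{ $names[$i] } };
        for my $j ($i + 1 .. $#names) {
            my %seen;
            my @shared = sort grep { $in_first{$_} && !$seen{$_}++ }
                @{ $pathways->{ $names[$j] } };
            push @edges, [ $names[$i], $names[$j], \@shared ] if @shared;
        }
    }
    return @edges;
}

# Render the edges as an undirected DOT graph, labelling each edge
# with the genes the two pathways have in common.
sub edges_to_dot {
    my (@edges) = @_;
    my $dot = "graph pathways {\n";
    for my $edge (@edges) {
        my ($from, $to, $genes) = @$edge;
        $dot .= sprintf qq{  "%s" -- "%s" [label="%s"];\n},
            $from, $to, join(',', @$genes);
    }
    return $dot . "}\n";
}

my %pathways = (
    'Pathway A' => [qw(GENE1 GENE2 GENE3)],
    'Pathway B' => [qw(GENE3 GENE4)],
    'Pathway C' => [qw(GENE5)],
);
print edges_to_dot(shared_gene_edges(\%pathways));
```

The printed text can be saved to a file and rendered with Graphviz,
for example with dot -Tpng. Pathways without any shared gene (Pathway C
here) produce no edge and so do not appear in the graph.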

Sean


From cain.cshl at gmail.com  Wed Apr  2 08:28:22 2008
From: cain.cshl at gmail.com (Scott Cain)
Date: Wed, 02 Apr 2008 08:28:22 -0400
Subject: [Bioperl-l] [Gmod-gbrowse] quick update on bioperl
	nightly	builds
In-Reply-To: <D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
	<D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
Message-ID: <1207139302.6507.7.camel@frissell>

Hi Chris,

(trimmed out gbrowse mailing list since this is just bioperl business)

Speaking of the pass through stuff, Sendu mentioned that I stomped on
some changes to Build.PL that you and he did when I committed that
change, so it should be rolled back.  Is there a good (svn) way to do
that?  Or should I just copy the contents of the old (good) Build.PL
into a fresh file in my checkout and commit it?

Thanks,
Scott

On Tue, 2008-04-01 at 22:59 -0500, Chris Fields wrote:
> Nope, that isn't intended.  I fixed it and reran it manually, so it  
> should be fine now (note I didn't update the log file; the next cron  
> run will catch that).
> 
> I may toy around with your recent passthrough flag addition to try  
> getting automated PPM's up and running.
> 
> chris
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From robert.citek at gmail.com  Wed Apr  2 08:24:06 2008
From: robert.citek at gmail.com (Robert Citek)
Date: Wed, 2 Apr 2008 07:24:06 -0500
Subject: [Bioperl-l] module for pubchem queries
Message-ID: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>

Hello all,

I have a list of chemical compounds that have some kind of interaction
with proteins or genes.  The current list contains names or SMILES and
I would like to get the CID number for those compounds.  Currently,
I'm using perl to query the NCBI's eutils[1], which works great.  But
I was just curious to know if there was a bioperl module to do
something similar.  A quick google didn't turn up anything, so I
thought I'd ask.

[1] http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html

Regards,
- Robert


From David.Messina at sbc.su.se  Wed Apr  2 08:41:45 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 2 Apr 2008 14:41:45 +0200
Subject: [Bioperl-l] How to make a network graphic using my genes in
	pathways?
In-Reply-To: <264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
Message-ID: <628aabb70804020541v6cee4584ibd9935290ae7cc0a@mail.gmail.com>

I have no personal experience with it, but a colleague of mine suggested
VisANT <http://visant.bu.edu/>.


Dave


From cjfields at uiuc.edu  Wed Apr  2 11:03:32 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 2 Apr 2008 10:03:32 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] quick update on bioperl nightly
	builds
In-Reply-To: <1207139302.6507.7.camel@frissell>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
	<D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
	<1207139302.6507.7.camel@frissell>
Message-ID: <3B490712-3413-4662-99D7-7B115CECB6E1@uiuc.edu>

The changes I made were related to problems checking MySQL for  
Bio::DB::SeqFeature::Store tests when connectivity requires  
username/password.  For some reason it tests DB connectivity up front,  
while Bio::DB::GFF does no direct DB check and simply runs its tests  
assuming the DB setup is correct.

You can view the diffs for your commits here:

http://code.open-bio.org/svnweb/index.cgi/bioperl/diff/bioperl-live/trunk/ModuleBuildBioperl.pm?revs=14604&revs=14548

http://code.open-bio.org/svnweb/index.cgi/bioperl/diff/bioperl-live/trunk/Build.PL?revs=14604&revs=14565

I'll try working on merging them together today; it shouldn't be too  
hard (the changes were fairly minor in both Build.PL and  
Module::Build).  I'll test to make sure your changes stay in as well.   
Down the road I believe we need to rethink how we want the Build  
process to run using Module::Build as it's a bit convoluted, but it  
works for now.

chris

On Apr 2, 2008, at 7:28 AM, Scott Cain wrote:
> Hi Chris,
>
> (trimmed out gbrowse mailing list since this is just bioperl business)
>
> Speaking of the pass through stuff, Sendu mentioned that I stomped on
> some changes to Build.PL that you and he did when I committed that
> change, so it should be rolled back.  Is there a good (svn) way to do
> that?  Or should I just copy the contents of the old (good) Build.PL
> into a fresh file in my checkout and commit it?
>
> Thanks,
> Scott

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Wed Apr  2 11:54:05 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 2 Apr 2008 10:54:05 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] quick update on bioperl nightly
	builds
In-Reply-To: <3B490712-3413-4662-99D7-7B115CECB6E1@uiuc.edu>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
	<D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
	<1207139302.6507.7.camel@frissell>
	<3B490712-3413-4662-99D7-7B115CECB6E1@uiuc.edu>
Message-ID: <71375DA3-A751-4908-8000-D9ACAE39B19C@uiuc.edu>

Okay, committed them.  The accept passthrough still appears to work;  
let me know if anything pops up.

chris

On Apr 2, 2008, at 10:03 AM, Chris Fields wrote:

> ...
> I'll try working on merging them together today; it shouldn't be too  
> hard (the changes were fairly minor in both Build.PL and  
> Module::Build).  I'll test to make sure your changes stay in as  
> well.  Down the road I believe we need to rethink how we want the  
> Build process to run using Module::Build as it's a bit convoluted,  
> but it works for now.
>
> chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From zhpan99 at yahoo.com  Wed Apr  2 13:52:46 2008
From: zhpan99 at yahoo.com (Pan Zheng)
Date: Wed, 2 Apr 2008 10:52:46 -0700 (PDT)
Subject: [Bioperl-l] installing bioperl-1.5.2 errors:DB_File
Message-ID: <726978.82400.qm@web53105.mail.re2.yahoo.com>

Hi,
   
  I am installing bioperl-1.5.2_102 under cygwin on my Windows XP and having some errors during the process.
   
  When I was running "perl Build test", one major error is the error about DB_File. I tried to install DB_File from cpan and rpm without any luck.
   
  ++++++++++++++++++++++++
CPAN: File::Temp loaded ok (v0.16)
CPAN: YAML loaded ok (v0.62)
CPAN.pm: Going to build P/PM/PMQS/DB_File-1.817.tar.gz
Parsing config.in...
Looks Good.
Checking if your kit is complete...
Looks good
Note (probably harmless): No library found for -ldb
Writing Makefile for DB_File
cp DB_File.pm blib/lib/DB_File.pm
AutoSplitting blib/lib/DB_File.pm (blib/lib/auto/DB_File)
gcc -c  -I/usr/local/BerkeleyDB/include -DPERL_USE_SAFE_PUTENV -fno-strict-aliasing -pipe -Wdeclaration-after-statement -DUSEIMPORTLIB -O3   -DVERSION=\"1.817\" -DXS_VERSION=\"1.817\"  "-I/usr/lib/perl5/5.8/cygwin/CORE"  -D_NOT_CORE  -DmDB_Prefix_t=size_t -DmDB_Hash_t=u_int32_t   version.c
version.c:30:16: db.h: No such file or directory
make: *** [version.o] Error 1
  PMQS/DB_File-1.817.tar.gz
  /usr/bin/make -- NOT OK
Running make test
  Can't test without successful make
Running make install
  Make had returned bad status, install seems impossible
Failed during this command:
 PMQS/DB_File-1.817.tar.gz                    : make NO
  +++++++++++++++++++++++++++++++++++++++++++++++
   
   
  I don't remember having this kind of error while installing an earlier version.
   
  Could you please help me with the DB_File installation?
   
  Thanks.
   
  Pan

       


From dr.hogart at gmail.com  Thu Apr  3 09:01:03 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Thu, 03 Apr 2008 17:01:03 +0400
Subject: [Bioperl-l] support of clustalw2 in bio::run::tool::alignment
Message-ID: <op.t81c31ljavnppr@hogart.img.ras.ru>

As far as I understand, clustalw2 is not supported in bioperl v1.5.2.100.  
In which version will it be supported?
Thank you in advance.


From slduncan at iastate.edu  Thu Apr  3 14:13:16 2008
From: slduncan at iastate.edu (slduncan at iastate.edu)
Date: Thu, 3 Apr 2008 13:13:16 -0500 (CDT)
Subject: [Bioperl-l] help installing bioperl with cygwin
Message-ID: <161313331084931@webmail.iastate.edu>

I am trying to use cpan to install bioperl and I had an error message saying:
c:\Documents not recognized as an external or internal....
Any ideas?  Also, I am new to the computer world so please be kind. :)

Stacy Duncan
Iowa State University
Bioinformatics and Computational Biology
1802 University Blvd.
VMRI Building 6
Ames, IA 50011-1240
office phone: (515) 294-8385
office fax: (515) 294-1401
home phone: (336) 965-5622
e-mail: slduncan at iastate.edu


From cjfields at uiuc.edu  Fri Apr  4 16:13:23 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 4 Apr 2008 15:13:23 -0500
Subject: [Bioperl-l] help installing bioperl with cygwin
In-Reply-To: <161313331084931@webmail.iastate.edu>
References: <161313331084931@webmail.iastate.edu>
Message-ID: <B7F7923E-4226-4B83-BDC1-8548F0FDB6CC@uiuc.edu>

It's best if you use ActiveState's Perl installation (it's the only  
one we really support at this moment, unless someone wants to give  
StrawberryPerl a run).  See:

http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows

chris

On Apr 3, 2008, at 1:13 PM, slduncan at iastate.edu wrote:

> I am trying to use cpan to install bioperl and I had an error  
> message saying:
> c:\Documents not recognized as an external or internal....
> Any ideas here.  Also, I am new to the computer world so please be  
> kind. :)
>
> Stacy Duncan
> Iowa State University
> Bioinformatics and Computational Biology
> 1802 University Blvd.
> VMRI Building 6
> Ames, IA 50011-1240
> office phone: (515) 294-8385
> office fax: (515) 294-1401
> home phone: (336) 965-5622
> e-mail: slduncan at iastate.edu

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Apr  4 16:07:12 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 4 Apr 2008 15:07:12 -0500
Subject: [Bioperl-l] installing bioperl-1.5.2 errors:DB_File
In-Reply-To: <726978.82400.qm@web53105.mail.re2.yahoo.com>
References: <726978.82400.qm@web53105.mail.re2.yahoo.com>
Message-ID: <F786C444-6A18-4AA5-8AE8-6C0ECEEACC5E@uiuc.edu>

I think you have to use the cygwin installer to install DB_File (it  
also installs dependencies, such as BDB).  According to 'perldoc  
perlcygwin':

....
Optional Libraries for Perl on Cygwin

Several Perl functions and modules depend on the existence of some  
optional libraries. Configure will find them if they are installed in  
one of the directories listed as being used for library searches. Pre- 
built packages for most of these are available from the Cygwin  
installer.
....

chris
On Apr 2, 2008, at 12:52 PM, Pan Zheng wrote:

> Hi,
>
>  I am installing bioperl-1.5.2_102 under cygwin on my Windows XP and  
> having some errors during the process.
>
>  When I was running "perl Build test", one major error is the error  
> about DB_File. I tried to install DB_File from cpan and rpm without  
> any luck.
>
>  ++++++++++++++++++++++++
> CPAN: File::Temp loaded ok (v0.16)
> CPAN: YAML loaded ok (v0.62)
> CPAN.pm: Going to build P/PM/PMQS/DB_File-1.817.tar.gz
> Parsing config.in...
> Looks Good.
> Checking if your kit is complete...
> Looks good
> Note (probably harmless): No library found for -ldb
> Writing Makefile for DB_File
> cp DB_File.pm blib/lib/DB_File.pm
> AutoSplitting blib/lib/DB_File.pm (blib/lib/auto/DB_File)
> gcc -c  -I/usr/local/BerkeleyDB/include -DPERL_USE_SAFE_PUTENV -fno-strict-aliasing -pipe -Wdeclaration-after-statement -DUSEIMPORTLIB -O3   -DVERSION=\"1.817\" -DXS_VERSION=\"1.817\"  "-I/usr/lib/perl5/5.8/cygwin/CORE"  -D_NOT_CORE  -DmDB_Prefix_t=size_t -DmDB_Hash_t=u_int32_t   version.c
> version.c:30:16: db.h: No such file or directory
> make: *** [version.o] Error 1
>   PMQS/DB_File-1.817.tar.gz
>   /usr/bin/make -- NOT OK
> Running make test
>   Can't test without successful make
> Running make install
>   Make had returned bad status, install seems impossible
> Failed during this command:
>  PMQS/DB_File-1.817.tar.gz                    : make NO
>  +++++++++++++++++++++++++++++++++++++++++++++++
>
>
>  I don't remember having this kind of error while installing an  
> earlier version.
>
>  Could you please help me with the DB_File installation?
>
>  Thanks.
>
>  Pan
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Apr  4 17:25:41 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 4 Apr 2008 16:25:41 -0500
Subject: [Bioperl-l] module for pubchem queries
In-Reply-To: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
References: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
Message-ID: <15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>

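Do you need something to access eutils via BioPerl, or are you looking  
for a specific set of classes?  I wrote an interface to eutils  
(Bio::DB::EUtilities); you could do something like this:

#!/usr/bin/perl

use strict;
use warnings;
use Bio::DB::EUtilities;

# Search PubChem Substance by name via esearch. Note that 'pcsubstance'
# returns substance IDs (SIDs); for compound IDs (CIDs), query the
# 'pccompound' database instead.
my $eutil = Bio::DB::EUtilities->new(-eutil  => 'esearch',
                                     -term   => 'dihydroorotate',
                                     -db     => 'pcsubstance',
                                     -retmax => 1000);

# Print the matching IDs as a comma-separated list
print join(',', $eutil->get_ids), "\n";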

chris

On Apr 2, 2008, at 7:24 AM, Robert Citek wrote:

> Hello all,
>
> I have a list of chemical compounds that have some kind of interaction
> with proteins or genes.  The current list contains names or SMILES and
> I would like to get the CID number for those compounds.  Currently,
> I'm using perl to query the NCBI's eutils[1], which works great.  But
> I was just curious to know if there was a bioperl module to do
> something similar.  A quick google didn't turn up anything, so I
> thought I'd ask.
>
> [1] http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
>
> Regards,
> - Robert

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From ekeen at mail.tongji.edu.cn  Mon Apr  7 02:57:04 2008
From: ekeen at mail.tongji.edu.cn (Jinyan Huang)
Date: Mon, 7 Apr 2008 14:57:04 +0800
Subject: [Bioperl-l] How to analysis the relationship of my interesting KEGG
	pathways?
Message-ID: <fb5dae380804062357ka7de019kb3451a5e169c0bf4@mail.gmail.com>

In my research, I found 25 pathways of interest. I want to know the
regulatory relationships among these pathways. It would be best if there
were some software to connect these KEGG pathways.

Thank you very much in advance.


From miguel.pignatelli at uv.es  Mon Apr  7 06:12:58 2008
From: miguel.pignatelli at uv.es (Miguel Pignatelli)
Date: Mon, 07 Apr 2008 12:12:58 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
Message-ID: <47F9F3AA.2090003@uv.es>

Hi all,

Is there any way to obtain the date of creation of individual GenBank 
entries? I don't mean the "last revision" date that can be found in the 
first line of a GenBank file.

I can access this creation date by looking at the "revision history" of 
any GenBank entry (for example, see
http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105), 
but I need a systematic (and local=fast) way to access this information.

Any help would be very appreciated,
Thank you very much in advance,

M;


From Bank.Beszteri at awi.de  Mon Apr  7 07:46:43 2008
From: Bank.Beszteri at awi.de (=?ISO-8859-1?Q?B=E1nk_Beszteri?=)
Date: Mon, 07 Apr 2008 13:46:43 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
References: <47F22B35.1030502@awi.de>
	<CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
Message-ID: <47FA09A3.2070004@awi.de>

Hi Hilmar,

it was important to understand that the inconsistency in taxon names is 
apparently only between the Swissprot entries with "non-standard" names 
and the contents of the taxonomy tables and that it is best to use a 
pre-loaded taxonomy, thanks for that! We have now updated to 
bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to have 
loaded everything OK in ~26 hours (with many of the "The supplied 
lineage does not start near..." warnings, but no other problems). Our 
next test is to try to load trembl (will try to do this in parallel in 
multiple chunks), hope it will work just as nicely!
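
For the parallel load, the flat file first has to be cut at entry
boundaries. A minimal sketch, assuming Swiss-Prot/TrEMBL format, where
every entry ends with a line containing only "//". The function name,
the chunk size and the output file naming are hypothetical and not part
of bioperl:

```perl
use strict;
use warnings;

# Split a Swiss-Prot/TrEMBL flat file into chunk files holding at most
# $per_chunk entries each. Entries end with a line containing only "//".
# Returns the list of chunk file names written.
sub split_flatfile {
    my ($infile, $per_chunk, $prefix) = @_;
    local $/ = "\n//\n";                    # read one entry at a time
    open(my $in, '<', $infile) or die "Cannot read $infile: $!";
    my ($count, $out, @files) = (0);
    while (my $entry = <$in>) {
        next unless $entry =~ /\S/;         # skip trailing blank input
        if ($count % $per_chunk == 0) {     # time to start a new chunk file
            close($out) if $out;
            my $name = sprintf("%s.%04d.dat", $prefix, scalar(@files) + 1);
            open($out, '>', $name) or die "Cannot write $name: $!";
            push @files, $name;
        }
        print {$out} $entry;
        $count++;
    }
    close($out) if $out;
    close($in);
    return @files;
}
```

Something like split_flatfile('uniprot_trembl.dat', 500_000, 'trembl')
would then give one file per loader process; both the input file name
and the chunk size are only placeholders here.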

Thanks for your tips & insights!

Bank

Hilmar Lapp wrote:

>
> On Apr 1, 2008, at 8:31 AM, Bánk Beszteri wrote:
>
>> [...] So next we started to test BioSQL, by trying to load just  
>> Swissprot in a MySQL DB first, like:
>>
>> load_seqdatabase.pl --host mysql.awi.de --dbname biosql2 --dbuser  
>> xyz --dbpass abc --driver mysql --namespace uniprot_sprot --format  
>> swiss uniprot_sprot.dat
>>
>> Here we get an error message
>>
>> ###########################################
>>
>> Loading /biodb/spinkern/uniprot_sprot.dat ...
>> Could not store Q6DAH5:
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: The supplied lineage does not start near 'Erwinia carotovora  
>> subsp. atroseptica' (I was supplied 'Erwinia carotovora subsp. |  
>> Pectobacterium | Enterobacteriaceae | Enterobacteriales |  
>> Gammaproteobacteria | Proteobacteria | Bacteria')
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /biodb/spinkern/bioperl-1.5/ 
>> bioperl-1.5.2_102/Bio/Root/Root.pm:359
>> STACK: Bio::Species::classification /biodb/spinkern/bioperl-1.5/ 
>> bioperl-1.5.2_102/Bio/Species.pm:174
>> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD /biodb/ 
>> spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm: 552
>> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row /biodb/ 
>> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/SpeciesAdaptor.pm:281
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object / 
>> biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
>> BasePersistenceAdaptor.pm:1305
>> STACK:  Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key 
>> /biodb/ spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
>> BasePersistenceAdaptor.pm:973
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key / 
>> biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
>> BasePersistenceAdaptor.pm:852
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/ 
>> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
>> BasePersistenceAdaptor.pm:182
>> STACK: Bio::DB::Persistent::PersistentObject::create /biodb/ 
>> spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm: 244
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/ 
>> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
>> BasePersistenceAdaptor.pm:169
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /biodb/ 
>> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
>> BasePersistenceAdaptor.pm:251
>> STACK: Bio::DB::Persistent::PersistentObject::store /biodb/spinkern/ 
>> bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:271
>> STACK: load_seqdatabase.pl:622
>> -----------------------------------------------------------
>>
>> at load_seqdatabase.pl line 635
>>
>> ############################################
>>
>> or similar, depending on whether we use a pre-loaded ncbi taxonomy  
>> or not
>
>
> I recommend to always use a pre-loaded NCBI taxonomy unless you know  
> there are only a few organisms that are straightforward (for the  
> parser, that is).
>
>> , and which Swissprot release we are trying to load. It often seems
>> to come from something like the "subsp." here, or another special
>> addition to the species line; but alternative genus names and other
>> curious things also seem to appear. It looks like Species.pm tries to
>> validate the species name against the lineage info already there in
>> the BioSQL DB, and in several cases, it finds inconsistencies.
>
>
> It actually happens upon a successful lookup when the species object  
> is populated from the database.
>
>> [...]
>> The only workaround we have found so far was to comment out line  174 
>> in Species.pm:
>>
>> $self->throw("The supplied lineage does not start near '$name' (I  
>> was supplied '".join(" | ", @vals)."')");
>
>
> That should be OK if you work with a pre-loaded taxonomy. It's sort  
> of a sanity check that should catch a parser having messed up a  
> species. If you use a pre-loaded NCBI taxonomy the results of the  
> species parsing don't matter in all details so long as the NCBI  
> taxonID is parsed out correctly, and then found in the database.
>
> Note that this is actually a warn() in the main trunk version of  
> BioPerl, so you might want to upgrade to that (or change throw() to  
> warn() in your version). You still get the records flagged with that,  
> but it isn't an exception.
>
>>
>> After doing so, load_seqdatabase.pl runs for several hours (until it
>> eventually crashes; I haven't found out yet why), but proceeds really
>> slowly.
>
>
> It should certainly *not* crash. Note also that you can supply --safe  
> on the command line, in which case the script will continue with the  
> next record if one fails to load for whatever reason.
>
> You will want to adjust the width constraint of dbxref.accession, for  
> example to 128 chars. This will also be fixed for BioSQL 1.0.1.
> See http://bugzilla.open-bio.org/show_bug.cgi?id=2474
>
>
>> I also found some info on this for Pg and Oracle in the mailing  
>> list, but has anyone some approximate numbers for MySQL, how long  
>> should a first Swissprot load take?
>
>
> Possibly around 20 hours according to Erik Rijkers:
> See http://lists.open-bio.org/pipermail/bioperl-l/2008-March/027427.html
>
> You can use the --logchunks N option to have it print out performance  
> statistics every N records.
>
> Hope this helps,
>
>     -hilmar
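
The two options mentioned above can be added to the original command
line. A minimal sketch that only assembles the call: the connection
values are the placeholders from the first mail, the value 1000 for
--logchunks is an arbitrary assumption, and loader_command is a
hypothetical helper name:

```perl
use strict;
use warnings;

# Build the argument list for one load_seqdatabase.pl run.
# --safe and --logchunks are the options mentioned in the thread; all
# other values are the placeholders from the original command line.
sub loader_command {
    my ($file, $logchunks) = @_;
    return (
        'load_seqdatabase.pl',
        '--host',      'mysql.awi.de',
        '--dbname',    'biosql2',
        '--dbuser',    'xyz',
        '--dbpass',    'abc',
        '--driver',    'mysql',
        '--namespace', 'uniprot_sprot',
        '--format',    'swiss',
        '--safe',                     # continue with the next record on failure
        '--logchunks', $logchunks,    # print statistics every N records
        $file,
    );
}

my @cmd = loader_command('uniprot_sprot.dat', 1000);
print join(' ', @cmd), "\n";
# system(@cmd) == 0 or die "loader failed: $?";   # uncomment to really run it
```

Passing the arguments as a list to system() avoids shell quoting
problems if a file name contains spaces.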


From cjfields at uiuc.edu  Mon Apr  7 08:32:45 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 7 Apr 2008 07:32:45 -0500
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47FA09A3.2070004@awi.de>
References: <47F22B35.1030502@awi.de>
	<CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
	<47FA09A3.2070004@awi.de>
Message-ID: <E8A1ED59-830D-473F-8818-1BAC4E0A2FA2@uiuc.edu>

The warnings are something that we still need to resolve, but the only  
fix I can think of likely breaks backward compatibility with older  
bioperl-db installations (i.e. storing the given scientific name  
instead of the binomial name, which is used as a fallback when no  
taxid is found).  There is a full explanation here:

http://bugzilla.open-bio.org/show_bug.cgi?id=2092

Anyway, I think it needs further testing when someone, likely Hilmar  
or I, has time.

chris

On Apr 7, 2008, at 6:46 AM, Bánk Beszteri wrote:

> Hi Hilmar,
>
> it was important to understand that the inconsistency in taxon names  
> is apparently only between the Swissprot entries with "non-standard"  
> names and the contents of the taxonomy tables and that it is best to  
> use a pre-loaded taxonomy, thanks for that! We have now updated to  
> bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to  
> have loaded everything OK in ~26 hours (with many of the "The  
> supplied lineage does not start near..." warnings, but no other  
> problems). Our next test is to try to load trembl (will try to do  
> this in parallel in multiple chunks), hope it will work just as  
> nicely!
>
> Thanks for your tips & insights!
>
> Bank

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Mon Apr  7 08:34:00 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 07 Apr 2008 13:34:00 +0100
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47FA09A3.2070004@awi.de>
References: <47F22B35.1030502@awi.de>	<CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
	<47FA09A3.2070004@awi.de>
Message-ID: <47FA14B8.7000500@sendu.me.uk>

Bánk Beszteri wrote:
> Hi Hilmar,
> 
> it was important to understand that the inconsistency in taxon names is 
> apparently only between the Swissprot entries with "non-standard" names 
> and the contents of the taxonomy tables and that it is best to use a 
> pre-loaded taxonomy, thanks for that! We have now updated to 
> bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to have 
> loaded everything OK in ~26 hours (with many of the "The supplied 
> lineage does not start near..." warnings, but no other problems).

Can you provide some examples of these warnings (of the taxons that 
cause them)? If there's anything consistent about them perhaps 
Bio::Species can be improved to accommodate them properly (instead of 
just issuing the warning and getting the classification wrong).
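
To collect such examples from a saved load log, the taxon names can be
pulled out of the messages. A minimal sketch, assuming the warnings use
the same wording as the exception quoted earlier in this thread;
taxa_from_warnings is a hypothetical helper, not a bioperl function:

```perl
use strict;
use warnings;

# Collect the taxon names from "lineage does not start near" messages.
# Takes the full log text; returns a hash ref: taxon name => occurrences.
sub taxa_from_warnings {
    my ($log) = @_;
    $log =~ s/\s+/ /g;      # undo any line wrapping inside the messages
    my %seen;
    while ($log =~ /The supplied lineage does not start near '(.+?)' \(I was supplied/g) {
        $seen{$1}++;
    }
    return \%seen;
}

# Possible use, with the log file name as a placeholder:
#   open(my $fh, '<', 'load.log') or die $!;
#   my $taxa = taxa_from_warnings(do { local $/; <$fh> });
#   print "$taxa->{$_}\t$_\n" for sort keys %$taxa;
```

Anchoring the match on the text that follows the closing quote keeps
taxon names that themselves contain an apostrophe intact.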


From heikki at sanbi.ac.za  Mon Apr  7 08:48:34 2008
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Mon, 7 Apr 2008 14:48:34 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47F9F3AA.2090003@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
Message-ID: <200804071448.34769.heikki@sanbi.ac.za>

Miguel,

You probably know this but:

- Your entry example below is a GenPept entry, not a GenBank entry
- The NCBI sequence format "genbank" has only the last modified date.
   I do not know about other formats (ASN.1, ...)
- NCBI Entrez is a great tool but it obscures the source database.
- If you really are working on real GenBank entries, you can use the accession 
number to find the corresponding EMBL (and Swiss-Prot) flat files, which 
have both creation and last modified dates.

Post to the list if you have trouble getting the dates from EMBL/Swiss-Prot 
formats using bioperl.
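
As an illustration without bioperl: in both formats the dates sit on
the DT lines, with the creation date on the first DT line and the last
update on the last one. A minimal sketch under that assumption; the
entry fragment is made up and entry_dates is a hypothetical helper:

```perl
use strict;
use warnings;

# Return (creation date, last-update date) from the DT lines of one
# EMBL or Swiss-Prot flat-file entry, taken as the first and last DT line.
sub entry_dates {
    my ($entry) = @_;
    my @dates = $entry =~ /^DT\s+(\d{2}-[A-Z]{3}-\d{4})/mg;
    return unless @dates;
    return ($dates[0], $dates[-1]);
}

# Illustrative fragment with EMBL-style DT lines (made-up values)
my $entry = <<'END';
DT   12-SEP-1991 (Rel. 29, Created)
DT   25-NOV-2005 (Rel. 85, Last updated, Version 11)
END

my ($created, $updated) = entry_dates($entry);
print "created: $created, last updated: $updated\n";
```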

Yours,

	-Heikki

On Monday 07 April 2008 12:12:58 Miguel Pignatelli wrote:
> Hi all,
>
> Is there any way to obtain the date of creation of individual GenBank
> entries? I don't mean the "last revision" date that can be found in the
> first line of a GenBank file.
>
> I can access this creation date by looking at the "revision history" of
> any GenBank entry (for example, see
> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
> but I need a systematic (and local=fast) way to access this information.
>
> Any help would be very appreciated,
> Thank you very much in advance,
>
> M;


-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________


From granjeau at tagc.univ-mrs.fr  Mon Apr  7 09:30:10 2008
From: granjeau at tagc.univ-mrs.fr (Samuel GRANJEAUD - IR/ICIM)
Date: Mon, 07 Apr 2008 15:30:10 +0200
Subject: [Bioperl-l] help installing bioperl with cygwin
In-Reply-To: <B7F7923E-4226-4B83-BDC1-8548F0FDB6CC@uiuc.edu>
References: <161313331084931@webmail.iastate.edu>
	<B7F7923E-4226-4B83-BDC1-8548F0FDB6CC@uiuc.edu>
Message-ID: <47FA21E2.3010602@tagc.univ-mrs.fr>

Hi,

I'm using BioPerl under Cygwin, because Cygwin allows one to work in a 
Unix-like environment from a command-line point of view.

So, I use the CVS version which runs out of the box
http://www.bioperl.org/wiki/Using_CVS
which has been replaced by SVN at the beginning of the year
http://www.bioperl.org/wiki/Using_Subversion

So if you really want to work under Cygwin, you can try this quick and 
dirty way, but you still have to become experienced because BioPerl is 
not supported under Cygwin.

You may try Strawberry, but in my experience in installing wxPerl, 
wxPerl fails on both flavours of Perl. ActiveState's Perl is still the 
easiest way to install many packages.

Regards,
Samuel


Chris Fields wrote:
> It's best if you use ActiveState's Perl installation (it's the only 
> one we really support at this moment, unless someone wants to give 
> StrawberryPerl a run).  See:
>
> http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows
>
> chris
>
> On Apr 3, 2008, at 1:13 PM, slduncan at iastate.edu wrote:
>
>> I am trying to use cpan to install bioperl and I had an error message 
>> saying:
>> c:\Documents not recognized as an external or internal....
>> Any ideas here.  Also, I am new to the computer world so please be 
>> kind. :)
>>
>> Stacy Duncan
>> Iowa State University
>> Bioinformatics and Computational Biology
>> 1802 University Blvd.
>> VMRI Building 6
>> Ames, IA 50011-1240
>> office phone: (515) 294-8385
>> office fax: (515) 294-1401
>> home phone: (336) 965-5622
>> e-mail: slduncan at iastate.edu
>>
>>
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

-- 

Samuel GRANJEAUD                   granjeau at tagc.univ-mrs.fr
INSERM - ICIM - TAGC               Tel: +33  (0)491 82 87 24
http://tagc.univ-mrs.fr            Fax: +33  (0)491 82 87 01
http://icim.marseille.inserm.fr/proteomique


From er at xs4all.nl  Mon Apr  7 10:36:57 2008
From: er at xs4all.nl (Erik)
Date: Mon, 7 Apr 2008 16:36:57 +0200 (CEST)
Subject: [Bioperl-l] Indexing large databases / BioSQL
Message-ID: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>

On Mon, April 7, 2008 14:34, Sendu Bala wrote:
> Bánk Beszteri wrote:
>> Hi Hilmar,
>>
>> it was important to understand that the inconsistency in taxon names is
>> apparently only between the Swissprot entries with "non-standard" names
>> and the contents of the taxonomy tables and that it is best to use a
>> pre-loaded taxonomy, thanks for that! We have now updated to
>> bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to have
>> loaded everything OK in ~26 hours (with many of the "The supplied
>> lineage does not start near..." warnings, but no other problems).
>
> Can you provide some examples of these warnings (of the taxons that
> cause them)? If there's anything consistent about them perhaps
> Bio::Species can be improved to accommodate them properly (instead of
> just issuing the warning and getting the classification wrong).
>

I did this a little while ago and saved the output
(UniProtKB/Swiss-Prot Release 55.1 of 18-Mar-2008, I think).

All warnings (and a few errors) for swissprot are here:

   http://bugzilla.open-bio.org/show_bug.cgi?id=2474

as an attached file

I suppose the OP will have encountered similar output - I don't think there is
much RDBMS-type-dependency involved.

   regards,

   Erik Rijkers


From cjfields at uiuc.edu  Mon Apr  7 11:46:01 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 7 Apr 2008 10:46:01 -0500
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <200804071448.34769.heikki@sanbi.ac.za>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es> <200804071448.34769.heikki@sanbi.ac.za>
Message-ID: <2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>

Strangely enough, if you use NCBI's esummary you can get both dates.   
Via Bio::DB::EUtilities in bioperl-live, if you dump out DocSum data  
(using a debugging method I added in a while back):

---------------------------------------

use Bio::DB::EUtilities;

# for multiple IDs use an array ref; also only use GI's (not accessions)
my $factory = Bio::DB::EUtilities->new(
                         -eutil => 'esummary',
                         -db => 'protein',
                         -id => 1621261);

$factory->print_DocSums;

---------------------------------------

One gets the following tag/value pairs:

UID: 1621261
Caption             :CAB02640
Title               :PROBABLE PYRIMIDINE OPERON REGULATORY PROTEIN  
PYRR [Mycobacterium tuberculosis
		     H37Rv]
Extra               :gi|1621261|emb|CAB02640.1|[1621261]
Gi                  :1621261
CreateDate          :2003/11/21
UpdateDate          :2006/11/14
Flags               :
TaxId               :83332
Length              :193
Status              :live
ReplacedBy          :
Comment             :

I'll add in a method to grab the data element by tag (in this case,  
grab the creation date by asking for the 'CreateDate' key).  Might  
come in handy for scripts.
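
Until then, the printed dump can be parsed directly. A minimal sketch
that works on the text output shown above rather than on any
Bio::DB::EUtilities method; parse_docsum_dump is a hypothetical helper:

```perl
use strict;
use warnings;

# Parse "Tag :value" lines as printed in the dump above into a hash ref.
# Lines without a "Tag :" prefix are appended to the previous tag.
sub parse_docsum_dump {
    my ($text) = @_;
    my (%item, $last);
    for my $line (split /\n/, $text) {
        if ($line =~ /^(\w+)\s*:\s*(.*?)\s*$/) {
            $last = $1;
            $item{$last} = $2;
        }
        elsif (defined $last && $line =~ /\S/) {
            (my $cont = $line) =~ s/^\s+|\s+$//g;   # trim the wrapped part
            $item{$last} .= " $cont";
        }
    }
    return \%item;
}

# A few lines taken from the output above
my $dump = <<'END';
UID: 1621261
Caption             :CAB02640
CreateDate          :2003/11/21
UpdateDate          :2006/11/14
END

my $item = parse_docsum_dump($dump);
print "created $item->{CreateDate}, updated $item->{UpdateDate}\n";
```

A wrapped line that itself starts with a word followed by a colon would
be taken for a new tag, so this is only good for quick scripts.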

chris

On Apr 7, 2008, at 7:48 AM, Heikki Lehvaslaiho wrote:

> Miguel,
>
> You probably know this but:
>
> - Your entry example below is a GenPept entry, not a GenBank entry
> - The NCBI sequence format "genbank" has only the last modified date.
>   I do not know about other formats (ASN.1, ...)
> - NCBI Entrez is a great tool but it obscures the source database.
> - If you really are working on real GenBank entries, you can use the
> accession number to find the corresponding EMBL (and Swiss-Prot) flat
> files, which have both creation and last modified dates.
>
> Post to the list if you have trouble getting the dates from EMBL/ 
> Swiss-Prot
> formats using bioperl.
>
> Yours,
>
> 	-Heikki
>
> On Monday 07 April 2008 12:12:58 Miguel Pignatelli wrote:
>> Hi all,
>>
>> Is there any way to obtain the date of creation of individual GenBank
>> entries? I don't mean the "last revision" date that can be found in  
>> the
>> first line of a GenBank file.
>>
>> I can access this creation date by looking at the "revision  
>> history" of
>> any GenBank entry (for example, see
>> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi? 
>> val=74311105),
>> but I need a systematic (and local=fast) way to access this  
>> information.
>>
>> Any help would be very appreciated,
>> Thank you very much in advance,
>>
>> M;
>
>
>
> -- 
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________


From miguel.pignatelli at uv.es  Mon Apr  7 12:24:50 2008
From: miguel.pignatelli at uv.es (Miguel Pignatelli)
Date: Mon, 07 Apr 2008 18:24:50 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es> <200804071448.34769.heikki@sanbi.ac.za>
	<2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>
Message-ID: <47FA4AD2.5030206@uv.es>


I've noticed that the ASN.1 version of those records has a 
"creation-date" tag.
But this is somewhat strange: the creation date you obtained and the one 
obtained via the ASN.1 format are both 2003/11/21, whereas the revision 
history of the record:

http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=CAB02640

reports a creation date of "Oct 19 1996 12:28 AM"

I don't know how to get this, because the EMBL version of this gene:

http://www.ebi.ac.uk/cgi-bin/dbfetch?db=emblcds&id=CAB02640&style=raw

doesn't have DT fields at all.

M;


Chris Fields wrote:
> Strangely enough, if you use NCBI's esummary you can get both dates.  
> Via Bio::DB::EUtilities in bioperl-live, if you dump out DocSum data 
> (using a debugging method I added in a while back):
> 
> ---------------------------------------
> 
> use Bio::DB::EUtilities;
> 
> # for multiple IDs use an array ref; also only use GI's (not accessions)
> my $factory = Bio::DB::EUtilities->new(
>                         -eutil => 'esummary',
>                         -db => 'protein',
>                         -id => 1621261);
> 
> $factory->print_DocSums;
> 
> ---------------------------------------
> 
> One gets the following tag/value pairs:
> 
> UID: 1621261
> Caption             :CAB02640
> Title               :PROBABLE PYRIMIDINE OPERON REGULATORY PROTEIN PYRR 
> [Mycobacterium tuberculosis
>              H37Rv]
> Extra               :gi|1621261|emb|CAB02640.1|[1621261]
> Gi                  :1621261
> CreateDate          :2003/11/21
> UpdateDate          :2006/11/14
> Flags               :
> TaxId               :83332
> Length              :193
> Status              :live
> ReplacedBy          :
> Comment             :
> 
> I'll add in a method to grab the data element by tag (in this case, grab 
> the creation date by asking for the 'CreateDate' key).  Might come in 
> handy for scripts.
> 
> chris
> 
> On Apr 7, 2008, at 7:48 AM, Heikki Lehvaslaiho wrote:
> 
>> Miguel,
>>
>> You probably know this but:
>>
>> - Your entry example below is a GenPept entry, not a GenBank entry
>> - The NCBI sequence format "genbank" has only the last modified date.
>>   I do not know about other formats (ASN.1, ...)
>> - NCBI Entrez is a great tool but it obscures the source database.
>> - If you really are working on real GenBank entries, you can use the
>> accession number to find the corresponding EMBL (and Swiss-Prot) flat
>> files, which have both creation and last modified dates.
>>
>> Post to the list if you have trouble getting the dates from 
>> EMBL/Swiss-Prot
>> formats using bioperl.
>>
>> Yours,
>>
>>     -Heikki
>>
>> On Monday 07 April 2008 12:12:58 Miguel Pignatelli wrote:
>>> Hi all,
>>>
>>> Is there any way to obtain the date of creation of individual GenBank
>>> entries? I don't mean the "last revision" date that can be found in the
>>> first line of a GenBank file.
>>>
>>> I can access this creation date by looking at the "revision history" of
>>> any GenBank entry (for example, see
>>> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
>>> but I need a systematic (and local=fast) way to access this information.
>>>
>>> Any help would be very appreciated,
>>> Thank you very much in advance,
>>>
>>> M;
>>
>>
>>
>> -- 
>> ______ _/      _/_____________________________________________________
>>      _/      _/
>>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>>  _/  _/  _/  University of Western Cape, South Africa
>>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
>> ___ _/_/_/_/_/________________________________________________________
> 
> 
> 


From cjfields at uiuc.edu  Mon Apr  7 13:48:45 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 7 Apr 2008 12:48:45 -0500
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47FA4AD2.5030206@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es> <200804071448.34769.heikki@sanbi.ac.za>
	<2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>
	<47FA4AD2.5030206@uv.es>
Message-ID: <CA410982-12F9-4289-8B54-87BE33A38085@uiuc.edu>

Note in the example I gave that, during the revision history, the  
DBSOURCE changed at the point of the creation date (the original nuc.  
record was a M. tuberculosis contig sequence, which later changed to  
an updated full M. tuberculosis genome record at the time of the  
'create date').

Couldn't find anything specific in the GenBank docs on this, but it  
appears (at least for a protein record) the creation date reflects the  
date on which the sequence was either originally deposited or  
originally derived from the nucleotide source record present in the  
record.  In other words, it may not reflect the original date of  
deposition (which could have come from a different record, as in this  
case).

chris

On Apr 7, 2008, at 11:24 AM, Miguel Pignatelli wrote:

>
> I've noticed that the ASN.1 version of those records has a "creation- 
> date" tag.
> But this is somehow strange, because the creation date obtained by  
> you and that obtained via ASN.1 format is 2003/11/21, but if you  
> look at the revision history of the record:
>
> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=CAB02640
>
> reports a creation date of "Oct 19 1996 12:28 AM"
>
> I don't know how to get this, because the EMBL version of this gene:
>
> http://www.ebi.ac.uk/cgi-bin/dbfetch?db=emblcds&id=CAB02640&style=raw
>
> doesn't have DT fields at all.
>
> M;
>
>
> Chris Fields wrote:
>> Strangely enough, if you use NCBI's esummary you can get both  
>> dates.  Via Bio::DB::EUtilities in bioperl-live, if you dump out  
>> DocSum data (using a debugging method I added in a while back):
>> ---------------------------------------
>> use Bio::DB::EUtilities;
>> # for multiple IDs use an array ref; also only use GI's (not accessions)
>> my $factory = Bio::DB::EUtilities->new(
>>                        -eutil => 'esummary',
>>                        -db => 'protein',
>>                        -id => 1621261);
>> $factory->print_DocSums;
>> ---------------------------------------
>> One gets the following tag/value pairs:
>> UID: 1621261
>> Caption             :CAB02640
>> Title               :PROBABLE PYRIMIDINE OPERON REGULATORY PROTEIN  
>> PYRR [Mycobacterium tuberculosis
>>             H37Rv]
>> Extra               :gi|1621261|emb|CAB02640.1|[1621261]
>> Gi                  :1621261
>> CreateDate          :2003/11/21
>> UpdateDate          :2006/11/14
>> Flags               :
>> TaxId               :83332
>> Length              :193
>> Status              :live
>> ReplacedBy          :
>> Comment             :
>> I'll add in a method to grab the data element by tag (in this case,  
>> grab the creation date by asking for the 'CreateDate' key).  Might  
>> come in handy for scripts.
>> chris
>> On Apr 7, 2008, at 7:48 AM, Heikki Lehvaslaiho wrote:
>>> Miguel,
>>>
>>> You probably know this but:
>>>
>>> - Your entry example below is a GenPept entry, not a GenBank entry
>>> - The NCBI sequence format "genbank" has only the last modified  
>>> date.
>>>  I do not know about other formats (ASN.1, ...)
>>> - NCBI Entrez is a great tool but it obscures the source database.
>>> - If you really are working on real GenBank entries, you can use the
>>>   accession number to find the corresponding EMBL (and Swiss-Prot)
>>>   flat file formats, which have both creation and last modified dates.
>>>
>>> Post to the list if you have trouble getting the dates from EMBL/ 
>>> Swiss-Prot
>>> formats using bioperl.
>>>
>>> Yours,
>>>
>>>    -Heikki
>>>
>>> On Monday 07 April 2008 12:12:58 Miguel Pignatelli wrote:
>>>> Hi all,
>>>>
>>>> Is there any way to obtain the date of creation of individual  
>>>> GenBank
>>>> entries? I don't mean the "last revision" date that can be found  
>>>> in the
>>>> first line of a GenBank file.
>>>>
>>>> I can access this creation date by looking at the "revision  
>>>> history" of
>>>> any GenBank entry (for example, see
>>>> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
>>>> but I need a systematic (and local=fast) way to access this  
>>>> information.
>>>>
>>>> Any help would be very appreciated,
>>>> Thank you very much in advance,
>>>>
>>>> M;
>>>
>>>
>>>
>>> -- 
>>> ______ _/      _/ 
>>> _____________________________________________________
>>>     _/      _/
>>>    _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>>>   _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>>>  _/  _/  _/  SANBI, South African National Bioinformatics Institute
>>> _/  _/  _/  University of Western Cape, South Africa
>>>    _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
>>> ___ _/_/_/_/_/ 
>>> ________________________________________________________

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Bank.Beszteri at awi.de  Tue Apr  8 03:35:43 2008
From: Bank.Beszteri at awi.de (Bánk Beszteri)
Date: Tue, 08 Apr 2008 09:35:43 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
Message-ID: <47FB204F.90405@awi.de>


>>Can you provide some examples of these warnings (of the taxons that
>>cause them)? If there's anything consistent about them perhaps
>>Bio::Species can be improved to accommodate them properly (instead of
>>just issuing the warning and getting the classification wrong).
>>    
>>
>
>All warnings (and a few errors) for swissprot are here:
>
>   http://bugzilla.open-bio.org/show_bug.cgi?id=2474
>
>as an attached file
>
>I suppose the OP will have encountered similar output - I don't think there is
>much RDBMS-type-dependency involved.
>  
>
Hi Erik & Sendu,

yes, the same kind of thing, probably no DBMS-type dependency; in case 
it could be useful, I uploaded my output as a second attachment to the 
bugzilla report cited above.

Bank


From heikki at sanbi.ac.za  Tue Apr  8 04:32:12 2008
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Tue, 8 Apr 2008 10:32:12 +0200
Subject: [Bioperl-l] Blast database sequence retrieval perl script
In-Reply-To: <6BEABCD5CA640A44A848448A42A03B73079E48C9@ilrikeadx1.ILRI.CGIARAD.ORG>
References: <6BEABCD5CA640A44A848448A42A03B73079E48C9@ilrikeadx1.ILRI.CGIARAD.ORG>
Message-ID: <200804081032.12312.heikki@sanbi.ac.za>


Dear Nelson,

I am cc:ing the bioperl mailing list, where all queries of this kind should 
go. More people can help you that way.


Since you have your own local data set, you need to create an index that 
catalogues your sequences for easy retrieval.

You need to install bioperl-live first. See for example: 	
	http://www.bioperl.org/wiki/Using_Subversion

Then you can follow this HOWTO:
	http://www.bioperl.org/wiki/HOWTO:Flat_databases

The other HOWTOs will help you deal with the BioPerl sequence objects that 
are retrieved: http://www.bioperl.org/wiki/HOWTOs.


Yours,

	-Heikki


On Monday 07 April 2008 14:50:23 Ndegwa, Nelson (IITA-Nairobi) wrote:
> Dear Prof. Heikki,
>
> Hi. We met at the Pathogen Bioinformatics Conference held in Nairobi
> Kenya in May 2007 at ICIPE. I recall you are a developer of Bioperl and
> Perl. I have managed to install a local Blast, having just cowpea Contig
> sequences, about 50,000 in total. This runs fine, as I can perform
> various queries and get results. However, any good match/hit on the
> local Blast database is hard to retrieve and the only option seems to go
> back to that database and search manually for the top hit sequence - an
> exceedingly manual task. Might you perhaps have a Perl script I
> could adapt to my database to help with this task, such that the hits
> have a hyperlink which can be used to retrieve that specific entry? I
> have limited knowledge of Perl. Thank you.
>
> With Kind Regards,
>
> Nelson.


-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________


From David.Messina at sbc.su.se  Tue Apr  8 07:29:12 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 8 Apr 2008 13:29:12 +0200
Subject: [Bioperl-l] How to analysis the relationship of my interesting KEGG
	pathways?
In-Reply-To: <628aabb70804080053g1fd9120ex9d5fd12f65f216f9@mail.gmail.com>
References: <fb5dae380804062357ka7de019kb3451a5e169c0bf4@mail.gmail.com>
	<628aabb70804080053g1fd9120ex9d5fd12f65f216f9@mail.gmail.com>
Message-ID: <628aabb70804080429k2aa17a6eu12197709d4cc1af0@mail.gmail.com>

Hi Jinyan,

You asked a similar question last week and received a couple of suggestions
-- did you take a look at those?

I'm not an expert on this topic, but I believe that since regulatory
information is much harder to obtain experimentally and therefore much less
well known, there isn't a lot of it in pathway databases like KEGG. You may
have to look through the literature and start trying to put together
possible regulatory links on your own.

Dave


From hrh at sanger.ac.uk  Tue Apr  8 08:48:32 2008
From: hrh at sanger.ac.uk (Hans Rudolf Hotz)
Date: Tue, 8 Apr 2008 13:48:32 +0100 (BST)
Subject: [Bioperl-l] Blast database sequence retrieval perl script
In-Reply-To: <200804081032.12312.heikki@sanbi.ac.za>
References: <6BEABCD5CA640A44A848448A42A03B73079E48C9@ilrikeadx1.ILRI.CGIARAD.ORG>
	<200804081032.12312.heikki@sanbi.ac.za>
Message-ID: <Pine.LNX.4.64.0804081340180.7147@deskpro50.dynamic.sanger.ac.uk>

Nelson

or simply use the BLAST indices for the sequence retrieval as well.

All you need to do is add the "-o T" option to the 'formatdb' command for 
the BLAST index creation (this will create some extra files). Then you can 
use 'fastacmd' (which is also part of the NCBI BLAST package) to retrieve 
the sequences.
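
A minimal Perl sketch of these two steps, assuming a nucleotide FASTA
file. The file name cowpea_contigs.fasta and the ID contig00042 are
hypothetical, the exact ID string fastacmd expects depends on how the
FASTA headers were written, and the system calls are left commented out
so that the script only prints the commands:

```perl
use strict;
use warnings;

# Hypothetical file name and hit ID; replace with your own.
my $fasta  = 'cowpea_contigs.fasta';
my $hit_id = 'contig00042';

# formatdb: -i input FASTA, -p F for nucleotide sequences, -o T to parse
# the sequence IDs and build the extra index files used for retrieval.
my @format_cmd = ('formatdb', '-i', $fasta, '-p', 'F', '-o', 'T');

# fastacmd: -d database name, -s ID of the sequence to look up.
my @fetch_cmd = ('fastacmd', '-d', $fasta, '-s', $hit_id);

print join(' ', @format_cmd), "\n";
print join(' ', @fetch_cmd), "\n";

# Uncomment to run them (requires the NCBI BLAST tools on the PATH):
# system(@format_cmd) == 0 or die "formatdb failed: $?";
# system(@fetch_cmd)  == 0 or die "fastacmd failed: $?";
```

Passing the commands to system() as lists avoids shell quoting problems
if an ID contains characters such as '|'.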


Hans

On Tue, 8 Apr 2008, Heikki Lehvaslaiho wrote:

>
> Dear Nelson,
>
> I am cc:ing the bioperl mailing list, where all queries of this kind should
> go. More people can help you that way.
>
>
> Since you have your own local data set, you need to create an index that
> catalogues your sequences for easy retrieval.
>
> You need to install bioperl-live first. See for example:
> 	http://www.bioperl.org/wiki/Using_Subversion
>
> Then you can follow this HOWTO:
> 	http://www.bioperl.org/wiki/HOWTO:Flat_databases
>
> The other HOWTOs will help you deal with the BioPerl sequence objects that
> are retrieved: http://www.bioperl.org/wiki/HOWTOs.
>
>
> Yours,
>
> 	-Heikki
>
>
> On Monday 07 April 2008 14:50:23 Ndegwa, Nelson (IITA-Nairobi) wrote:
>> Dear Prof. Heikki,
>>
>> Hi. We met at the Pathogen Bioinformatics Conference held in Nairobi
>> Kenya in May 2007 at ICIPE. I recall you are a developer of Bioperl and
>> Perl. I have managed to install a local Blast, having just cowpea Contig
>> sequences, about 50,000 in total. This runs fine, as I can perform
>> various queries and get results. However, any good match/hit on the
>> local Blast database is hard to retrieve and the only option seems to go
>> back to that database and search manually for the top hit sequence - an
>> exceedingly manual task. Might you perhaps have a Perl script I
>> could adapt to my database to help with this task, such that the hits
>> have a hyperlink which can be used to retrieve that specific entry? I
>> have limited knowledge of Perl. Thank you.
>>
>> With Kind Regards,
>>
>> Nelson.
>
>
>
> -- 
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
>
>
>


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


From robert.citek at gmail.com  Tue Apr  8 10:09:27 2008
From: robert.citek at gmail.com (Robert Citek)
Date: Tue, 8 Apr 2008 09:09:27 -0500
Subject: [Bioperl-l] module for pubchem queries
In-Reply-To: <15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>
References: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
	<15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>
Message-ID: <4145b6790804080709l20f1e56erf4b7af04b0a52870@mail.gmail.com>

Wrapping bioperl around eutils will work just fine.  Thanks for the pointer.

http://search.cpan.org/~sendu/bioperl-1.5.2_102/Bio/DB/EUtilities.pm

Regards,
- Robert

On Fri, Apr 4, 2008 at 4:25 PM, Chris Fields <cjfields at uiuc.edu> wrote:
> Do you need something to access eutils via BioPerl, or are you looking for a
> specific set of classes?  I wrote an interface to eutils
> (Bio::DB::EUtilities), you could do something like this:
>
>  #!/usr/bin/perl -w
>
>  use strict;
>  use warnings;
>  use Bio::DB::EUtilities;
>
>  my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
>                                      -term => 'dihydroorotate',
>                                      -db => 'pcsubstance',
>                                      -retmax => 1000);
>
>  print join(',',$eutil->get_ids)."\n";
>
>  chris


From cjfields at uiuc.edu  Tue Apr  8 11:10:26 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 8 Apr 2008 10:10:26 -0500
Subject: [Bioperl-l] module for pubchem queries
In-Reply-To: <4145b6790804080709l20f1e56erf4b7af04b0a52870@mail.gmail.com>
References: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
	<15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>
	<4145b6790804080709l20f1e56erf4b7af04b0a52870@mail.gmail.com>
Message-ID: <32D210FC-575E-4D95-95DA-FC6F5BE1FC24@uiuc.edu>

Just to note, the API has changed significantly from the interface
in the 1.5.2 release.  The up-to-date (supported) interface is in
subversion; there are some example recipes here:

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

I'm working on a full HOWTO, just haven't had time to get it up on the  
wiki yet.

chris

On Apr 8, 2008, at 9:09 AM, Robert Citek wrote:

> Wrapping bioperl around eutils will work just fine.  Thanks for the  
> pointer.
>
> http://search.cpan.org/~sendu/bioperl-1.5.2_102/Bio/DB/EUtilities.pm
>
> Regards,
> - Robert
>
> On Fri, Apr 4, 2008 at 4:25 PM, Chris Fields <cjfields at uiuc.edu>  
> wrote:
>> Do you need something to access eutils via BioPerl, or are you  
>> looking for a
>> specific set of classes?  I wrote an interface to eutils
>> (Bio::DB::EUtilities), you could do something like this:
>>
>> #!/usr/bin/perl -w
>>
>> use strict;
>> use warnings;
>> use Bio::DB::EUtilities;
>>
>> my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
>>                                     -term => 'dihydroorotate',
>>                                     -db => 'pcsubstance',
>>                                     -retmax => 1000);
>>
>> print join(',',$eutil->get_ids)."\n";
>>
>> chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cuiw at ncbi.nlm.nih.gov  Tue Apr  8 16:41:58 2008
From: cuiw at ncbi.nlm.nih.gov (Cui, Wenwu (NIH/NLM/NCBI) [C])
Date: Tue, 8 Apr 2008 16:41:58 -0400
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47F9F3AA.2090003@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com><264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
Message-ID: <6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>

Hi, Miguel:

id1_fetch can do it. Detailed instructions can be found at:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id1_fetch.html

Here is an example:

>id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
GI        Loaded      DB    Retrieval No.
--        ------      --    -------------
74311105  12/07/2007  NCBI  19766263
74311105  01/23/2007  NCBI  16325656
74311105  03/30/2006  NCBI  13131204
74311105  03/03/2006  NCBI  12915541
74311105  03/02/2006  NCBI  12885275
74311105  12/03/2005  NCBI  12259793
74311105  09/09/2005  NCBI  11257262
74311105  09/09/2005  NCBI  11242667

Wenwu Cui PhD
NCBI/NLM/NIH

> -----Original Message-----
> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
> Sent: Monday, April 07, 2008 6:13 AM
> Cc: bioperl-l at bioperl.org
> Subject: [Bioperl-l] GenBank entries creation dates
> 
> Hi all,
> 
> Is there any way to obtain the date of creation of individual GenBank
> entries? I don't mean the "last revision" date that can be found in the
> first line of a GenBank file.
> 
> I can access this creation date by looking at the "revision history" of
> any GenBank entry (for example, see
> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
> but I need a systematic (and local=fast) way to access this
> information.
> 
> Any help would be very appreciated,
> Thank you very much in advance,
> 
> M;


From miguel.pignatelli at uv.es  Wed Apr  9 07:32:39 2008
From: miguel.pignatelli at uv.es (Miguel Pignatelli)
Date: Wed, 09 Apr 2008 13:32:39 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com><264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
	<6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>
Message-ID: <47FCA957.5040409@uv.es>

Wow, impressive. Thanks, Wenwu, for the information; I have never used 
this tool before. The problem is that I need to know the full revision 
history (or at least the creation date) for *all* the GIs present in nr 
(or at least a significant portion of them), and this tool queries 
via the web.

The existence of this tool confirms to me that this information is 
available somewhere. Is it possible to download the data that contains 
this information?

Thanks again,

M;


Cui, Wenwu (NIH/NLM/NCBI) [C] wrote:
> Hi, Miguel:
> 
> id1_fetch can do it. Detailed instructions can be found at:
> 
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id1_fetch.html
> 
> Here is an example:
> 
>> id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
> GI        Loaded      DB    Retrieval No.
> --        ------      --    -------------
> 74311105  12/07/2007  NCBI  19766263
> 74311105  01/23/2007  NCBI  16325656
> 74311105  03/30/2006  NCBI  13131204
> 74311105  03/03/2006  NCBI  12915541
> 74311105  03/02/2006  NCBI  12885275
> 74311105  12/03/2005  NCBI  12259793
> 74311105  09/09/2005  NCBI  11257262
> 74311105  09/09/2005  NCBI  11242667
> 
> Wenwu Cui PhD
> NCBI/NLM/NIH
> 
>> -----Original Message-----
>> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
>> Sent: Monday, April 07, 2008 6:13 AM
>> Cc: bioperl-l at bioperl.org
>> Subject: [Bioperl-l] GenBank entries creation dates
>>
>> Hi all,
>>
>> Is there any way to obtain the date of creation of individual GenBank
>> entries? I don't mean the "last revision" date that can be found in the
>> first line of a GenBank file.
>>
>> I can access this creation date by looking at the "revision history" of
>> any GenBank entry (for example, see
>> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
>> but I need a systematic (and local=fast) way to access this
>> information.
>>
>> Any help would be very appreciated,
>> Thank you very much in advance,
>>
>> M;
> 
> 


From cuiw at ncbi.nlm.nih.gov  Wed Apr  9 09:25:16 2008
From: cuiw at ncbi.nlm.nih.gov (Cui, Wenwu (NIH/NLM/NCBI) [C])
Date: Wed, 9 Apr 2008 09:25:16 -0400
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47FCA957.5040409@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com><264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
	<6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>
	<47FCA957.5040409@uv.es>
Message-ID: <6F230E9769AA8D4EB4BC401DF133EDB7180BE1@NIHCESMLBX15.nih.gov>

Hi, Miguel,

I do not know whether the data file is publicly available. However,
you can perform a 'real time' query via id1_fetch:

####step 1: generate GI file #####
id1_fetch -query 'YOUR-GENBANK-QUERY-STRING' -lt none -db Nucleotide -out qfile

####step 2: retrieve revisions for GIs stored in qfile #####

id1_fetch -lt revisions -qf qfile  -fmt fasta -db Nucleotide

Good luck!

Wenwu Cui

> -----Original Message-----
> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
> Sent: Wednesday, April 09, 2008 7:33 AM
> To: Cui, Wenwu (NIH/NLM/NCBI) [C]
> Cc: bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] GenBank entries creation dates
> 
> Wow, impressive. Thanks, Wenwu, for the information; I have never used
> this tool before. The problem is that I need to know the full revision
> history (or at least the creation date) for *all* the GIs present in nr
> (or at least a significant portion of them), and this tool queries
> via the web.
> 
> The existence of this tool confirms to me that this information is
> available somewhere. Is it possible to download the data that contains
> this information?
> 
> Thanks again,
> 
> M;
> 
> 
> Cui, Wenwu (NIH/NLM/NCBI) [C] wrote:
> > Hi, Miguel:
> >
> > id1_fetch can do it. Detailed instructions can be found at:
> >
> > http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id1_fetch.html
> >
> > Here is an example:
> >
> >> id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
> > GI        Loaded      DB    Retrieval No.
> > --        ------      --    -------------
> > 74311105  12/07/2007  NCBI  19766263
> > 74311105  01/23/2007  NCBI  16325656
> > 74311105  03/30/2006  NCBI  13131204
> > 74311105  03/03/2006  NCBI  12915541
> > 74311105  03/02/2006  NCBI  12885275
> > 74311105  12/03/2005  NCBI  12259793
> > 74311105  09/09/2005  NCBI  11257262
> > 74311105  09/09/2005  NCBI  11242667
> >
> > Wenwu Cui PhD
> > NCBI/NLM/NIH
> >
> >> -----Original Message-----
> >> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
> >> Sent: Monday, April 07, 2008 6:13 AM
> >> Cc: bioperl-l at bioperl.org
> >> Subject: [Bioperl-l] GenBank entries creation dates
> >>
> >> Hi all,
> >>
> >> Is there any way to obtain the date of creation of individual GenBank
> >> entries? I don't mean the "last revision" date that can be found in the
> >> first line of a GenBank file.
> >>
> >> I can access this creation date by looking at the "revision history" of
> >> any GenBank entry (for example, see
> >> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
> >> but I need a systematic (and local=fast) way to access this
> >> information.
> >>
> >> Any help would be very appreciated,
> >> Thank you very much in advance,
> >>
> >> M;
> >
> >


From CALLEY_JOHN_N at LILLY.COM  Wed Apr  9 09:45:23 2008
From: CALLEY_JOHN_N at LILLY.COM (John N Calley)
Date: Wed, 9 Apr 2008 09:45:23 -0400
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47FCA957.5040409@uv.es>
Message-ID: <OF73E5AA49.8E1EF918-ON85257426.004AF961-85257426.004B915C@EliLilly.lilly.com>

You might want to keep in mind that the creation date is not always 
reliable. I am aware of one example where the recorded creation date 
precedes the sequencing date by several months (as determined by the trace 
file date). NCBI was not able to explain exactly what happened but (as I 
recall) hypothesized that some dates had been scrambled in a database 
rebuild. If there was interest I could probably pull up more details.

John Calley


Miguel Pignatelli <miguel.pignatelli at uv.es> 
Sent by: bioperl-l-bounces at lists.open-bio.org
04/09/2008 07:32 AM
Please respond to
miguel.pignatelli at uv.es


To
"Cui, Wenwu (NIH/NLM/NCBI) [C]" <cuiw at ncbi.nlm.nih.gov>
cc
bioperl-l at bioperl.org
Subject
Re: [Bioperl-l] GenBank entries creation dates


Wow, impressive. Thanks, Wenwu, for the information; I have never used 
this tool before. The problem is that I need to know the full revision 
history (or at least the creation date) for *all* the GIs present in nr 
(or at least a significant portion of them), and this tool queries 
via the web.

The existence of this tool confirms to me that this information is 
available somewhere. Is it possible to download the data that contains 
this information?

Thanks again,

M;


Cui, Wenwu (NIH/NLM/NCBI) [C] wrote:
> Hi, Miguel:
> 
> id1_fetch can do it. Detailed instructions can be found at:
> 
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id1_fetch.html
> 
> Here is an example:
> 
>> id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
> GI        Loaded      DB    Retrieval No.
> --        ------      --    -------------
> 74311105  12/07/2007  NCBI  19766263
> 74311105  01/23/2007  NCBI  16325656
> 74311105  03/30/2006  NCBI  13131204
> 74311105  03/03/2006  NCBI  12915541
> 74311105  03/02/2006  NCBI  12885275
> 74311105  12/03/2005  NCBI  12259793
> 74311105  09/09/2005  NCBI  11257262
> 74311105  09/09/2005  NCBI  11242667
> 
> Wenwu Cui PhD
> NCBI/NLM/NIH
> 
>> -----Original Message-----
>> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
>> Sent: Monday, April 07, 2008 6:13 AM
>> Cc: bioperl-l at bioperl.org
>> Subject: [Bioperl-l] GenBank entries creation dates
>>
>> Hi all,
>>
>> Is there any way to obtain the date of creation of individual GenBank
>> entries? I don't mean the "last revision" date that can be found in the
>> first line of a GenBank file.
>>
>> I can access this creation date by looking at the "revision history" of
>> any GenBank entry (for example, see
>> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
>> but I need a systematic (and local=fast) way to access this
>> information.
>>
>> Any help would be very appreciated,
>> Thank you very much in advance,
>>
>> M;
> 
> 


From frederic.romagne at gmail.com  Wed Apr  9 16:45:50 2008
From: frederic.romagne at gmail.com (Frédéric Romagné)
Date: Wed, 09 Apr 2008 15:45:50 -0500
Subject: [Bioperl-l] question about clustalw module.
Message-ID: <1207773950.483.13.camel@kiss-laptop>

Hello,

I have a problem when using Bio::Tools::Run::Alignment::Clustalw:

I pass it an array reference (the array contains some FASTA sequences)
along with the appropriate parameters, and I write the result via Bio::SeqIO.

The problem is that my result file contains only the accession number in
the header. An example:

the initial stream is : 

>NM_052854 Homo sapiens cAMP responsive element binding protein 3-like 1 (CREB3L1), mRNA.
AGAAGACGTGCGGAGGGAGACGCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGG
GGGAGCACTTAGCTCCCCCGCCCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTC
AGCCCCAACCCCGGGCTCCCCATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGT
GGAGTCGGCTGAATGCCCACGGTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCG
CTGCCCTAAGGCCCCCGCGCGCCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCG
CCCCTCCCCCGGGGCTTCGCCCCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAG
GAGCTCTGGACTGGGCGCGCCGCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCC
CGGGAGCCGGCTGCGATGGACGCCGTCTTGGAACCCTTCCCGGCCGACAGGCTGTTCCCC
GGATCCAGCTTCCTGGACTTGGGGGATCTGAACGAGTCGGACTTCCTCAACAATGCGCAC

...

the result file is :

>NM_052854
---------------------------------------AGAAGACGTGCGGAGGGAGAC
GCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGGGGGAGCACTTAGCTCCCCCGC
CCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTCAGCCCCAACCCCGGGCTCCCC
ATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGTGGAGTCGGCTGAATGCCCACG
GTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCGCTGCCCTAAGGCCCCCGCGCG
CCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCGCCCCTCCCCCGGGGCTTCGCC
CCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAGGAGCTCTGGACTGGGCGCGCC
GCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCCCGGGAGCCGGCTGCGATGGAC

...

So I lost the other information provided by the header...

Is there any option to keep this information?

Here is a part of my code with my options :


 my $seq_ref=\@seq;
 my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM', 'quiet' => 1,
		'output' => 'FASTA');
 my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
 my $aln = $factory->align($seq_ref);


Thank you.


From jason at bioperl.org  Wed Apr  9 16:55:13 2008
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 9 Apr 2008 13:55:13 -0700
Subject: [Bioperl-l] question about clustalw module.
In-Reply-To: <1207773950.483.13.camel@kiss-laptop>
References: <1207773950.483.13.camel@kiss-laptop>
Message-ID: <C126E560-1A36-461E-ADAD-774446B9DB9E@bioperl.org>

The clustal alignment format does not allow for the description. If
you want to preserve it you'll have to add it back: make a hash
indexed by sequence ID and store the description in it, then when you
get your alignment back you can update the description field of each
sequence before writing it out with AlignIO.
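
The hash idea can be sketched in plain Perl on the FASTA header lines
themselves. This is only an illustration under stated assumptions:
index_descriptions and restore_header are hypothetical helper names, and
the ID is taken to be the first whitespace-delimited word of the header.
With BioPerl objects the same pattern would presumably use display_id
and desc on the input sequences and on the sequences of the alignment:

```perl
use strict;
use warnings;

# Remember the description of each input record, keyed by sequence ID.
# Assumption: the ID is the first whitespace-delimited word of the header.
sub index_descriptions {
    my (@headers) = @_;
    my %desc_for;
    for my $header (@headers) {
        my ($id, $desc) = $header =~ /^>(\S+)\s*(.*)$/ or next;
        $desc_for{$id} = $desc;
    }
    return %desc_for;
}

# Put the stored description back onto an ID-only header line.
# Headers with an unknown ID are returned unchanged.
sub restore_header {
    my ($header, $desc_for) = @_;
    my ($id) = $header =~ /^>(\S+)/ or return $header;
    my $desc = $desc_for->{$id};
    return (defined($desc) && length($desc)) ? ">$id $desc" : $header;
}

my %desc_for = index_descriptions(
    '>NM_052854 Homo sapiens cAMP responsive element binding protein 3-like 1 (CREB3L1), mRNA.',
);
print restore_header('>NM_052854', \%desc_for), "\n";
```

Keying on the ID works only if the IDs are unique and survive the
alignment step unchanged.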

-jason
On Apr 9, 2008, at 1:45 PM, Frédéric Romagné wrote:

> Hello,
>
> I have a problem when using Bio::Tools::Run::Alignment::Clustalw:
>
> I pass it an array reference (the array contains some FASTA sequences)
> along with the appropriate parameters, and I write the result via Bio::SeqIO.
>
> The problem is that my result file contains only the accession number in
> the header. An example:
>
> the initial stream is :
>
> >NM_052854 Homo sapiens cAMP responsive element binding protein 3-like 1 (CREB3L1), mRNA.
> AGAAGACGTGCGGAGGGAGACGCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGG
> GGGAGCACTTAGCTCCCCCGCCCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTC
> AGCCCCAACCCCGGGCTCCCCATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGT
> GGAGTCGGCTGAATGCCCACGGTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCG
> CTGCCCTAAGGCCCCCGCGCGCCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCG
> CCCCTCCCCCGGGGCTTCGCCCCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAG
> GAGCTCTGGACTGGGCGCGCCGCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCC
> CGGGAGCCGGCTGCGATGGACGCCGTCTTGGAACCCTTCCCGGCCGACAGGCTGTTCCCC
> GGATCCAGCTTCCTGGACTTGGGGGATCTGAACGAGTCGGACTTCCTCAACAATGCGCAC
>
> ...
>
> the result file is :
>
> >NM_052854
> ---------------------------------------AGAAGACGTGCGGAGGGAGAC
> GCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGGGGGAGCACTTAGCTCCCCCGC
> CCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTCAGCCCCAACCCCGGGCTCCCC
> ATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGTGGAGTCGGCTGAATGCCCACG
> GTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCGCTGCCCTAAGGCCCCCGCGCG
> CCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCGCCCCTCCCCCGGGGCTTCGCC
> CCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAGGAGCTCTGGACTGGGCGCGCC
> GCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCCCGGGAGCCGGCTGCGATGGAC
>
> ...
>
> So I lost the other information provided by the header...
>
> Is there any option to keep this information?
>
> Here is a part of my code with my options :
>
>
>  my $seq_ref=\@seq;
>  my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM', 'quiet' => 1,
> 		'output' => 'FASTA');
>  my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
>  my $aln = $factory->align($seq_ref);
>
>
> Thank you.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From lamq at usal.es  Thu Apr 10 11:52:24 2008
From: lamq at usal.es (Luis A. M. Quintales)
Date: Thu, 10 Apr 2008 17:52:24 +0200
Subject: [Bioperl-l] xyplot glyph problem with previous aggregation
Message-ID: <47FE37B8.9090404@usal.es>

I am not able to add xyplot glyphs to a panel because I have some
problems with the aggregation.

Using this GFF file:

##sequence-region chr1 1 5578650
chr1  atfreq  atpc    1  50   58.8000   .  .  atpc 1
chr1  atfreq  atpc   51 100   58.4000   .  .  atpc 1
chr1  atfreq  atpc  101 150   57.6000   .  .  atpc 1
chr1  atfreq  atpc  151 200   57.8000   .  .  atpc 1
. . .


And this source code for preparing the aggregated features necessary for
the xyplot glyph:

my $filin  = $ARGV[0];
my $db = Bio::DB::GFF->new( -dsn => $filin,
                            -adaptor => 'memory',
                            -aggregator => 'at{atpc:atfreq}'
                           );
my $segment  = $db->segment('chr1');
my @features1 = $db->features('atpc');
print "$#features1 \n";
my @features2 = $segment->features('atpc');
print "$#features2 \n";
my @features3 = $db->features('at');
print "$#features3 \n";
my @features4 = $segment->features('at');
print "$#features4 \n";

I obtain:

111572
111572
0
0

What am I doing wrong with the aggregator?

Many thanks.


From lincoln.stein at gmail.com  Thu Apr 10 13:55:06 2008
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 10 Apr 2008 13:55:06 -0400
Subject: [Bioperl-l] xyplot glyph problem with previous aggregation
In-Reply-To: <47FE37B8.9090404@usal.es>
References: <47FE37B8.9090404@usal.es>
Message-ID: <6dce9a0b0804101055w65e22abfgaa4f155751fef40f@mail.gmail.com>

Hi Luis,

When you aggregate the atpc 1 features together, you end up with one
feature. Thus @features3 is an array of size 1. The $# operator returns the
index of the last element, which is 0. If @features3 were empty, $#features3
would return -1.
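
A minimal sketch of the difference in plain Perl, with no BioPerl involved.
The arrays hold placeholder strings rather than real feature objects, and
last_index and count are helper names made up for this illustration:

```perl
use strict;
use warnings;

# Stand-in arrays; the strings are placeholders, not real feature objects.
my @aggregated = ('at_feature');   # one aggregate feature
my @empty      = ();               # nothing matched

# $#array is the index of the last element, i.e. one less than the count.
sub last_index { my @a = @_; return $#a; }

# An array evaluated in scalar context yields the number of elements.
sub count { my @a = @_; return scalar @a; }

printf "one element: last index %d, count %d\n",
    last_index(@aggregated), count(@aggregated);
printf "no elements: last index %d, count %d\n",
    last_index(@empty), count(@empty);
```

So to print how many features came back, scalar(@features3) is probably
what you want rather than $#features3.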

Lincoln

On Thu, Apr 10, 2008 at 11:52 AM, Luis A. M. Quintales <lamq at usal.es> wrote:

> I am not able to add xyplot glyphs to one panel because I have some
> problems with the aggregations.
>
> Using that GFF file:
>
> ##sequence-region chr1 1 5578650
> chr1  atfreq  atpc    1  50   58.8000   .  .  atpc 1
> chr1  atfreq  atpc   51 100   58.4000   .  .  atpc 1
> chr1  atfreq  atpc  101 150   57.6000   .  .  atpc 1
> chr1  atfreq  atpc  151 200   57.8000   .  .  atpc 1
> . . .
>
>
> And this source code for preparing the aggregated features necessary for
> the xyplot glyph:
>
> my $filin  = $ARGV[0];
> my $db = Bio::DB::GFF->new( -dsn => $filin,
>                           -adaptor => 'memory',
>                           -aggregator => 'at{atpc:atfreq}'
>                          );
> my $segment  = $db->segment('chr1');
> my @features1 = $db->features('atpc');
> print "$#features1 \n";
> my @features2 = $segment->features('atpc');
> print "$#features2 \n";
> my @features3 = $db->features('at');
> print "$#features3 \n";
> my @features4 = $segment->features('at');
> print "$#features4 \n";
>
> I obtain:
>
> 111572
> 111572
> 0
> 0
>
> What I am doing wrong with the aggregator?
>
> Many thanks.


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From adsj at novozymes.com  Fri Apr 11 04:53:23 2008
From: adsj at novozymes.com (Adam =?iso-8859-1?Q?Sj=F8gren?=)
Date: Fri, 11 Apr 2008 10:53:23 +0200
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
Message-ID: <87d4owixh8.fsf@topper.koldfront.dk>

  Hi.

I am trying to make Bio::SeqIO return objects of my own type (a small
extension of Bio::Seq::RichSeq), by setting -seqfactory. I am having a
little trouble creating the correct object to pass with -seqfactory:

Following the example given in SYNOPSIS of Bio::Factory::SequenceFactoryI,
I get this error:

 $ perl -e '
 >            use Bio::Seq::SeqFactory;
 >            my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq');
 > 
 >            my $seq = $seqbuilder->create(-seq => 'ACTGAT',
 >                                          -display_id => 'exampleseq');
 > 
 >            print "seq is a ", ref($seq), "\n";
 > '

 ------------- EXCEPTION: Bio::Root::Exception -------------
 MSG: Can't locate type.pm in @INC (@INC contains: /z/bio/biotools/bioinfperlmodules/ /z/bio/adm/modules /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl .) at (eval 13) line 3.
 : Unrecognized Sequence type for SeqFactory 'type'
 STACK: Error::throw
 STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:357
 STACK: Bio::Seq::SeqFactory::type /usr/share/perl/5.8/Bio/Seq/SeqFactory.pm:134
 STACK: Bio::Seq::SeqFactory::new /usr/share/perl/5.8/Bio/Seq/SeqFactory.pm:93
 STACK: -e:3
 -----------------------------------------------------------
 $ 

If I instead call Bio::Seq::SeqFactory->new('Bio::PrimarySeq'=>1), for
instance, it seems to work:

 $ perl -e '
 >            use Bio::Seq::SeqFactory;
 >            my $seqbuilder = Bio::Seq::SeqFactory->new('Bio::PrimarySeq'=>1);
 > 
 >            my $seq = $seqbuilder->create(-seq => 'ACTGAT',
 >                                          -display_id => 'exampleseq');
 > 
 >            print "seq is a ", ref($seq), "\n";
 > '
 seq is a Bio::PrimarySeq
 $ 

I was about to write a patch for the pod, when I realized that I'd
better start by asking: Is this a buglet in the pod or the code?

  Best regards,

    Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From hlapp at gmx.net  Fri Apr 11 11:35:54 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 11 Apr 2008 11:35:54 -0400
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
In-Reply-To: <87d4owixh8.fsf@topper.koldfront.dk>
References: <87d4owixh8.fsf@topper.koldfront.dk>
Message-ID: <0037240B-F469-4388-972A-324101B11621@gmx.net>


On Apr 11, 2008, at 4:53 AM, Adam Sjøgren wrote:
>  $ perl -e '
>>            use Bio::Seq::SeqFactory;
>>            my $seqbuilder = Bio::Seq::SeqFactory->new('type' =>  
>> 'Bio::PrimarySeq');


You need to prefix the argument with a dash: '-type', not 'type'.  
Otherwise, it assumes that the class you want instantiated is 'type.pm'.
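
To illustrate the idea, here is a simplified sketch in plain Perl. It is
not BioPerl's actual code: pick_type is a hypothetical helper, and the
rule it applies (treat the arguments as positional unless the first one
starts with a dash) is an assumption made for the example:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a constructor's argument handling.
sub pick_type {
    my @args = @_;
    # No leading dash on the first argument: treat the list as positional,
    # so the first value is taken to be the type.
    return $args[0]
        unless defined $args[0] && substr($args[0], 0, 1) eq '-';
    # Leading dash: treat the list as named parameters.
    my %named = @args;
    return $named{'-type'};
}

print pick_type('type'  => 'Bio::PrimarySeq'), "\n";   # type
print pick_type('-type' => 'Bio::PrimarySeq'), "\n";   # Bio::PrimarySeq
print pick_type('Bio::PrimarySeq' => 1), "\n";         # Bio::PrimarySeq
```

Under that assumption, the third call shows why passing
'Bio::PrimarySeq' => 1 appeared to work: the class name simply happened
to be the first positional value.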

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From 1zoujing at 163.com  Thu Apr 10 01:08:52 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 9 Apr 2008 22:08:52 -0700 (PDT)
Subject: [Bioperl-l]  Bio::ASN1::EntrezGene parse so slowly?
Message-ID: <16602210.post@talk.nabble.com>


  I want to parse the file "gene_info" from NCBI. The Gene data at NCBI is in
ASN.1 format, right? So I used Bio::ASN1::EntrezGene, but it either does not
work properly or is too slow. The file is about 500 MB.
  The code is as follows:
  use Bio::ASN1::EntrezGene;
  my $parser = Bio::ASN1::EntrezGene->new('file' => $ARGV[0]);
  my $i = 0;
  while (my $result = $parser->next_seq) {
      last;  # real processing would go here; "last" is only for testing
  }

  When it reaches the "while" loop, it keeps processing on and on and never
exits, even though I put "last" in the loop body.
   So I wonder whether it is too slow, whether the module is not suited to
this job, or whether I did something wrong?

  Thank you!
-- 
View this message in context: http://www.nabble.com/Bio%3A%3AASN1%3A%3AEntrezGene-parse-so-slowly--tp16602210p16602210.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From 1zoujing at 163.com  Thu Apr 10 02:17:41 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 9 Apr 2008 23:17:41 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl Sus_scrofa.ags"
Message-ID: <16602770.post@talk.nabble.com>


   I am new to BioPerl. When I run
"parse_entrez_gene_example.pl Sus_scrofa.ags", I get this error
message:
     Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
  
   But Sus_scrofa.ags was downloaded from NCBI in ASN.1 format and should be
the same as Homo_sapiens in the example, so there should be no error, since
the code is Mingyi's example.
   I wonder why this happens. Should I change something about the file? 
    
-- 
View this message in context: http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16602770.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From 1zoujing at 163.com  Thu Apr 10 02:56:52 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 9 Apr 2008 23:56:52 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <16602770.post@talk.nabble.com>
References: <16602770.post@talk.nabble.com>
Message-ID: <16603225.post@talk.nabble.com>


I searched the web and have now found the answer; I quote it below:
   The error was thrown by my Bio::ASN1::EntrezGene module because it 
expects a text file, while you fed it with a binary file.  To use a 
gzipped ASN binary file from NCBI, download the NCBI gene2xml 
(ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml), 
then use this syntax to run my parser on the binary files: 

my $parser = Bio::ASN1::EntrezGene->new(
    'file' => "gene2xml -i Homo_sapiens.ags.gz -c -x -b | ");
# Homo_sapiens.ags.gz is the gzipped binary file directly downloaded from NCBI

Same syntax should be used when you're using SeqIO (thus SeqIO::entrezgene). 
Mingyi
 
      
zoujing wrote:
> 
>    I am a geen hand in Bioperl. When I run perl with
> "parse_entrez_gene_example.pl Sus_scrofa.ags", it turned out the error
> information:
>      Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>   
>    But the Sus_scrofa.ags is download from NCBI, with the format of ASN1,
> should be the same as Homo_sapiens in the example. So it should be no
> error as the code is the example from Mingyi.
>    I wonder why this happen, and should I change something about the file? 
>     
> 

-- 
View this message in context: http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16603225.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From 1zoujing at 163.com  Thu Apr 10 03:10:26 2008
From: 1zoujing at 163.com (zoujing)
Date: Thu, 10 Apr 2008 00:10:26 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
Message-ID: <16603225.post@talk.nabble.com>


I searched the web and have now found the answer; I quote it below:
   The error was thrown by my Bio::ASN1::EntrezGene module because it 
expects a text file, while you fed it with a binary file.  To use a 
gzipped ASN binary file from NCBI, download the NCBI gene2xml 
(ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml), 
then use this syntax to run my parser on the binary files: 

my $parser = Bio::ASN1::EntrezGene->new(
    'file' => "gene2xml -i Homo_sapiens.ags.gz -c -x -b | ");
# Homo_sapiens.ags.gz is the gzipped binary file directly downloaded from NCBI

Same syntax should be used when you're using SeqIO (thus SeqIO::entrezgene). 
Mingyi

   But there is still one thing: I want to parse "gene_info.gz" from NCBI Gene
(ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz), and it doesn't work.
   Does that mean "gene_info.gz" (tab-delimited, one line per GeneID, with the
column header as the first line of the file) is not the right format for
Bio::ASN1::EntrezGene?
 
      
zoujing wrote:
> 
>    I am a geen hand in Bioperl. When I run perl with
> "parse_entrez_gene_example.pl Sus_scrofa.ags", it turned out the error
> information:
>      Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>   
>    But the Sus_scrofa.ags is download from NCBI, with the format of ASN1,
> should be the same as Homo_sapiens in the example. So it should be no
> error as the code is the example from Mingyi.
>    I wonder why this happen, and should I change something about the file? 
>     
> 

-- 
View this message in context: http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16603225.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From stefan.kirov at bms.com  Fri Apr 11 15:59:29 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Fri, 11 Apr 2008 15:59:29 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <16602770.post@talk.nabble.com>
References: <16602770.post@talk.nabble.com>
Message-ID: <Pine.WNT.4.64.0804111557210.2384@A161887.one.ads.bms.com>

AGS is a binary ASN.1 format and WILL NOT be parsed! You have to use 
gene2xml (weird, but this is NCBI) with these flags: -c -x -b -i. This 
will spit out text ASN.1, which can be parsed.
Stefan

On Wed, 9 Apr 2008, zoujing wrote:

>
>   I am a geen hand in Bioperl. When I run perl with
> "parse_entrez_gene_example.pl Sus_scrofa.ags", it turned out the error
> information:
>     Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>
>   But the Sus_scrofa.ags is download from NCBI, with the format of ASN1,
> should be the same as Homo_sapiens in the example. So it should be no error
> as the code is the example from Mingyi.
>   I wonder why this happen, and should I change something about the file?


From stefan.kirov at bms.com  Fri Apr 11 16:01:30 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Fri, 11 Apr 2008 16:01:30 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <16603225.post@talk.nabble.com>
References: <16603225.post@talk.nabble.com>
Message-ID: <Pine.WNT.4.64.0804111600310.2384@A161887.one.ads.bms.com>

It is not. If you use this file, why would you need a parser for it 
anyway? Just split on \t or read with OpenOffice or equiv.
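
A minimal sketch of the split-on-tab approach in plain Perl. The sample
lines below are illustrative only (the real file has more columns, and the
column names here are assumptions); parse_tab_lines is a helper name made
up for this example. It takes the column names from the header line rather
than hard-coding them:

```perl
use strict;
use warnings;

# Illustrative sample in the layout described above: a header line,
# then one tab-delimited line per GeneID. Values are placeholders.
my @lines = (
    "#tax_id\tGeneID\tSymbol\tdescription",
    "9823\t1001\tGENE_A\texample gene A",
    "9823\t1002\tGENE_B\texample gene B",
);

sub parse_tab_lines {
    my @in = @_;
    my $header = shift @in;
    chomp $header;
    $header =~ s/^#\s*//;                    # drop a leading comment marker
    my @cols = split /\t/, $header;
    my @records;
    for my $line (@in) {
        chomp $line;
        next unless length $line;            # skip empty lines
        my %rec;
        @rec{@cols} = split /\t/, $line, -1; # -1 keeps trailing empty fields
        push @records, \%rec;
    }
    return @records;
}

for my $rec (parse_tab_lines(@lines)) {
    print "$rec->{GeneID}\t$rec->{Symbol}\n";
}
```

For the real file the lines would come from a filehandle instead of an
array, for example one opened on the output of "gzip -dc gene_info.gz".
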
Stefan

On Thu, 10 Apr 2008, zoujing wrote:

>
> Seached  the web and found the answer now, quote the answer as following:
>   The error was thrown by my Bio::ASN1::EntrezGene module because it
> expects a text file, while you fed it with a binary file.  To use
> gzipped ASN binary file from NCBI, download the NCBI gene2xml
> (ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml),
> then use this syntax to run my parser on the binary files:
>
> my $parser = Bio::ASN1::EntrezGene->new('file' => "gene2xml -i
> Homo_sapiens.ags.gz -c -x -b | "); # Homo_sapiens.ags.gz is the gzipped
> binary file directly downloaded from NCBI
>
> Same syntax should be used when you're using SeqIO (thus SeqIO::entrezgene).
> Mingyi
>
>   But there still one thing, I want to parse "gene_info.gz" in Gene of
> NCBI. It doesn't work.Is that means "gene_info.gz"( tab-delimited,one line
> per GeneID, Column header line is the first line in the file
> ) is not the right format for Bio::ASN1::EntrezGene?
>
>
>
> zoujing wrote:
>>
>>    I am a geen hand in Bioperl. When I run perl with
>> "parse_entrez_gene_example.pl Sus_scrofa.ags", it turned out the error
>> information:
>>      Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>>
>>    But the Sus_scrofa.ags is download from NCBI, with the format of ASN1,
>> should be the same as Homo_sapiens in the example. So it should be no
>> error as the code is the example from Mingyi.
>>    I wonder why this happen, and should I change something about the file?
>>
>>


From asjo at koldfront.dk  Fri Apr 11 15:39:59 2008
From: asjo at koldfront.dk (Adam =?iso-8859-1?Q?Sj=F8gren?=)
Date: Fri, 11 Apr 2008 21:39:59 +0200
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
In-Reply-To: <0037240B-F469-4388-972A-324101B11621@gmx.net> (Hilmar Lapp's
	message of "Fri, 11 Apr 2008 11:35:54 -0400")
References: <87d4owixh8.fsf@topper.koldfront.dk>
	<0037240B-F469-4388-972A-324101B11621@gmx.net>
Message-ID: <877if4i3jk.fsf@topper.koldfront.dk>

On Fri, 11 Apr 2008 11:35:54 -0400, Hilmar wrote:

> On Apr 11, 2008, at 4:53 AM, Adam Sjøgren wrote:

>>> my $seqbuilder = Bio::Seq::SeqFactory->new('type' =>
>>> 'Bio::PrimarySeq');

> You need to prefix the argument with a dash: '-type', not 'type'. 
> Otherwise, it assumes that the class you want instantiated is
> 'type.pm'.

I guess that means I should submit a patch for the SYNOPSIS. Attached.


   Thanks,

    Adam


Index: Bio/Factory/SequenceFactoryI.pm
===================================================================
--- Bio/Factory/SequenceFactoryI.pm	(revision 14654)
+++ Bio/Factory/SequenceFactoryI.pm	(working copy)
@@ -20,7 +20,7 @@
 # get a Bio::Factory::SequenceFactoryI object like
 
     use Bio::Seq::SeqFactory;
-    my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq');
+    my $seqbuilder = Bio::Seq::SeqFactory->new('-type' => 'Bio::PrimarySeq');
 
     my $seq = $seqbuilder->create(-seq => 'ACTGAT',
 				  -display_id => 'exampleseq');

-- 
 "Well, I'm a moon around you"                                Adam Sjøgren
                                                         asjo at koldfront.dk


From bamboowarrior at gmail.com  Fri Apr 11 19:10:35 2008
From: bamboowarrior at gmail.com (Arkady)
Date: Fri, 11 Apr 2008 18:10:35 -0500
Subject: [Bioperl-l] Nucleotide Links in Gene DB (GenBank)
Message-ID: <91656c3f0804111610r24c8fa5es5bcb56b7a59e0208@mail.gmail.com>

Hi everyone, I'm a bioperl n00b. Actually, kind of a genbank n00b,
too, as I'm from a CS background and just started bio things last
June.

I'm trying to set up an analysis pipeline of primate protein CDSs (the
nucleotide seqs). I've written a script which does a pretty decent job
of downloading these from GenBank--but it's inconsistent, because a
lot of sequences in nucleotide are 'predicted' and named LOCthisorthat
instead of by gene name.

So what I was thinking was this (assume ANKRD43 is the gene for this example):

1. Search 'gene' database for ANKRD43 AND (PRI*[ORGN])
On NCBI, there's an option to show all nucleotide links. How do I get
a list of those in bioperl? Can bioperl even search 'gene', or just
'nucleotide'?

2. Search 'nucleotide' for the referenced items from #1, and also for
ANKRD43[TITL] AND (PRI*[ORGN]), save CDSes.

3. BLAST mRNA for one of those CDSes, see if we pick up any other matches.

4. BLAT other primates for CDSes, see if we find anything not in GenBank.


On the other hand, I always get the feeling I'm doing things the hard
way--especially here, with #1 and #2. Is there a much more obvious,
simple way to do this?

Thanks, folks.


Cheers,
John Woods

Institute for Cellular and Molecular Biology
The University of Texas at Austin


From hlapp at gmx.net  Fri Apr 11 19:19:44 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 11 Apr 2008 19:19:44 -0400
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
In-Reply-To: <877if4i3jk.fsf@topper.koldfront.dk>
References: <87d4owixh8.fsf@topper.koldfront.dk>
	<0037240B-F469-4388-972A-324101B11621@gmx.net>
	<877if4i3jk.fsf@topper.koldfront.dk>
Message-ID: <B4B3CAD0-C346-470C-98D7-D6CBFE116109@gmx.net>

Thanks, applied. -hilmar

On Apr 11, 2008, at 3:39 PM, Adam Sjøgren wrote:
> On Fri, 11 Apr 2008 11:35:54 -0400, Hilmar wrote:
>
>> On Apr 11, 2008, at 4:53 AM, Adam Sjøgren wrote:
>
>>>> my $seqbuilder = Bio::Seq::SeqFactory->new('type' =>
>>>> 'Bio::PrimarySeq');
>
>> You need to prefix the argument with a dash: '-type', not 'type'.
>> Otherwise, it assumes that the class you want instantiated is
>> 'type.pm'.
>
> I guess that means I should submit a patch for the SYNOPSIS. Attached.
>
>
>    Thanks,
>
>     Adam
>
>
> Index: Bio/Factory/SequenceFactoryI.pm
> ===================================================================
> --- Bio/Factory/SequenceFactoryI.pm	(revision 14654)
> +++ Bio/Factory/SequenceFactoryI.pm	(working copy)
> @@ -20,7 +20,7 @@
>  # get a Bio::Factory::SequenceFactoryI object like
>
>      use Bio::Seq::SeqFactory;
> -    my $seqbuilder = Bio::Seq::SeqFactory->new('type' =>  
> 'Bio::PrimarySeq');
> +    my $seqbuilder = Bio::Seq::SeqFactory->new('-type' =>  
> 'Bio::PrimarySeq');
>
>      my $seq = $seqbuilder->create(-seq => 'ACTGAT',
>  				  -display_id => 'exampleseq');

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From mmokrejs at ribosome.natur.cuni.cz  Fri Apr 11 21:32:14 2008
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Sat, 12 Apr 2008 03:32:14 +0200
Subject: [Bioperl-l] [BioSQL-l] Loading sequences with novel NCBI
	taxon_id
In-Reply-To: <CE3675B2-2AFD-46AA-A348-16C9FEA51E0E@uiuc.edu>
References: <320fb6e00803130806w46148bacm54c3ead9a50b038f@mail.gmail.com>	<32EB5B0C-4CC8-4C33-9F41-5D4465B6AC48@gmx.net>	<320fb6e00803131613o20eae2b7y325814ef26d2738f@mail.gmail.com>	<CEA4F4E7-A66B-4C62-AE32-511E177BC485@gmx.net>	<93b45ca50803140648s5098a7d0sec621f448ef03040@mail.gmail.com>
	<CE3675B2-2AFD-46AA-A348-16C9FEA51E0E@uiuc.edu>
Message-ID: <4800111E.3030802@ribosome.natur.cuni.cz>

Chris Fields wrote:
> The counter to that perspective (using new sequences with old tax info) 
> would be to regularly update NCBI taxonomy, particularly in 
> circumstances prior to adding new sequences.  Hilmar mentioned that once 
> tax is loaded it doesn't take as long to update, so you could set up a 
> cron job to update regularly.
> 
> I remember someone mentioning weekly or monthly updates on the list 
> quite a while ago, but I'm unsure how often NCBI updates tax information 
> (i.e. with every release, monthly, weekly, etc).  I can see instances 
> popping up where you used the an up-to-date taxonomy but a new sequence 
> contains a tax ID not present.  I think bioperl-db handles these but I'm 
> not sure what other Bio* do.
> 

I spent some time benchmarking this and inspecting the MySQL log files.
The current load_ncbi_taxonomy.pl script, with a minor modification to
show timestamps, behaves as follows on the initial import into MySQL and
then on an update of the database from exactly the same dataset (it still
has to walk through all the data):

$ ./load_ncbi_taxonomy.pl --dbname=biosqldb --driver=mysql --host=127.0.01 \
  --port=3306 --directory=/home/mmokrejs/bioinformatics/databases/ncbitax/dump \
  --chunksize=0 --verbose=2 --mycnf=~/.my.cnf
Sat Apr 12 01:58:43 MEST 2008
Loading NCBI taxon database in /home/mmokrejs/bioinformatics/databases/ncbitax/dump:
       ... retrieving all taxon nodes in the database
Sat Apr 12 01:58:43 MEST 2008
       ... reading in taxon nodes from nodes.dmp
Sat Apr 12 01:58:58 MEST 2008
       ... insert / update / delete taxon nodes
                10000/421098 done (in 5 secs, 2000.0 rows/s)
                20000/421098 done (in 4 secs, 2500.0 rows/s)
...
                420000/421098 done (in 4 secs, 2500.0 rows/s)
Sat Apr 12 02:02:21 MEST 2008
       ... (committing nodes)
Sat Apr 12 02:02:21 MEST 2008
       ... rebuilding nested set left/right values
                10000 done (in 24 secs, 416.7 rows/s)
                20000 done (in 26 secs, 384.6 rows/s)
                30000 done (in 24 secs, 416.7 rows/s)
...
                420004 done (in 23 secs, 434.8 rows/s)
Sat Apr 12 02:19:25 MEST 2008
       ... reading in taxon names from names.dmp
Sat Apr 12 02:19:25 MEST 2008
       ... deleting old taxon names
Sat Apr 12 02:19:25 MEST 2008
       ... inserting new taxon names
                10000 done (in 8 secs, 1250.0 rows/s)
                20000 done (in 8 secs, 1250.0 rows/s)
...
                580000 done (in 5 secs, 2000.0 rows/s)
Sat Apr 12 02:24:48 MEST 2008
       ... cleaning up
Sat Apr 12 02:24:49 MEST 2008
Done.
$


I decided to re-import the same data to mimic future updates, at least
roughly. No record should actually be UPDATEd in this case, apart from
the left and right values being reset to NULL. :((

$ ./load_ncbi_taxonomy.pl --dbname=biosqldb --driver=mysql --host=127.0.01 \
  --port=3306 --directory=/home/mmokrejs/bioinformatics/databases/ncbitax/dump \
  --chunksize=0 --verbose=2 --mycnf=~/.my.cnf
Sat Apr 12 02:35:20 MEST 2008
Loading NCBI taxon database in /home/mmokrejs/bioinformatics/databases/ncbitax/dump:
        ... retrieving all taxon nodes in the database
Sat Apr 12 02:35:26 MEST 2008
       ... reading in taxon nodes from nodes.dmp
Sat Apr 12 02:35:46 MEST 2008
       ... insert / update / delete taxon nodes
                10000/421098 done (in 0 secs, 10000.0 rows/s)
                20000/421098 done (in 0 secs, 10000.0 rows/s)
...
                410000/421098 done (in 0 secs, 10000.0 rows/s)
                420000/421098 done (in 0 secs, 10000.0 rows/s)
Sat Apr 12 02:35:55 MEST 2008
       ... (committing nodes)
Sat Apr 12 02:35:55 MEST 2008
       ... rebuilding nested set left/right values
                10000 done (in 9 secs, 1111.1 rows/s)
                20000 done (in 9 secs, 1111.1 rows/s)
...
                410004 done (in 8 secs, 1250.0 rows/s)
                420004 done (in 9 secs, 1111.1 rows/s)
Sat Apr 12 02:41:54 MEST 2008
       ... reading in taxon names from names.dmp
Sat Apr 12 02:41:54 MEST 2008
       ... deleting old taxon names
Sat Apr 12 02:41:55 MEST 2008
       ... inserting new taxon names
                10000 done (in 5 secs, 2000.0 rows/s)
                20000 done (in 5 secs, 2000.0 rows/s)
...
                570000 done (in 6 secs, 1666.7 rows/s)
                580000 done (in 5 secs, 2000.0 rows/s)
Sat Apr 12 02:47:27 MEST 2008
       ... cleaning up
Sat Apr 12 02:47:27 MEST 2008
Done.
$ ls -la /var/log/mysql/mysql.log 
-rw-rw---- 1 mysql mysql 483443314 Apr 12 03:15 /var/log/mysql/mysql.log
$

Test machine: Pentium 4 M laptop, 1.8 GHz, 1 GB RAM, mysql-5.0.56 with
SQL text logging enabled (this logs every SQL command and is slower
than binary logging). The log was cleared before the tests.
I could provide some excerpts from the log or upload it somewhere
if anybody else would like to dig into the details.


I believe the recalculation step could be made faster. See what
happens:

                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '1' ORDER BY ncbi_taxon_id
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '10239' ORDER BY ncbi_taxon_id
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '12333' ORDER BY ncbi_taxon_id
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '12335' ORDER BY ncbi_taxon_id
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE left_value = '4'
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE right_value = '5'
                     31 Query       UPDATE taxon SET left_value = '4', right_value = '5' WHERE taxon_id = '12335'
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '12340' ORDER BY ncbi_taxon_id
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE left_value = '6'
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE right_value = '7'
                     31 Query       UPDATE taxon SET left_value = '6', right_value = '7' WHERE taxon_id = '12340'
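
For readers unfamiliar with the left/right values in the log above, here
is a minimal sketch (in Python, not the Perl that load_ncbi_taxonomy.pl
actually uses) of how nested-set values can be assigned in a single
in-memory traversal. The tree shape is an assumption inferred from the
order of the queries in the log, and nested_set() is a hypothetical
helper, not part of BioSQL or bioperl-db.

```python
def nested_set(children, root):
    """Assign (left, right) nested-set values by an iterative depth-first walk.

    children maps a node id to a list of its child ids (hypothetical input
    structure); children are visited in sorted order, mirroring the
    ORDER BY in the logged SELECT statements.
    """
    counter = 1
    values = {root: [counter, None]}
    stack = [(root, iter(sorted(children.get(root, []))))]
    while stack:
        node, pending = stack[-1]
        child = next(pending, None)
        counter += 1
        if child is None:
            # All children done: close this node with its right value.
            values[node][1] = counter
            stack.pop()
        else:
            # Open the child with its left value and descend into it.
            values[child] = [counter, None]
            stack.append((child, iter(sorted(children.get(child, [])))))
    return {node: tuple(pair) for node, pair in values.items()}


# Assumed tree shape, guessed from the query order in the log above.
tree = {1: [10239], 10239: [12333], 12333: [12335, 12340]}
result = nested_set(tree, 1)
```

With this assumed tree, the two leaves get (4, 5) and (6, 7), matching
the UPDATE statements in the log. Whether computing the values in memory
and writing each row once would actually be faster on a real database
has not been tested here.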


The columns left_value and right_value are already NULL when
the table is created, so there is no need to write NULL into
them again. Avoiding this would mean writing a wrapper function
that mimics update() but first does a 'SELECT * FROM', compares
the stored values with those to be written, and includes in the
final UPDATE statement only those columns whose values have
changed. We use such a smart wrapper in our Python code.
;-)
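
A minimal sketch of such a wrapper, assuming an in-memory SQLite table
purely for illustration (the thread is about MySQL, and this is not
Martin's actual code). update_changed() is a hypothetical name; table
and column names are interpolated into the SQL, so they must come from
trusted code, never from user input.

```python
import sqlite3


def update_changed(conn, table, key_col, key, new_values):
    """UPDATE only the columns whose stored value differs from new_values.

    Returns the list of column names that were actually written.
    """
    cols = list(new_values)
    row = conn.execute(
        f"SELECT {', '.join(cols)} FROM {table} WHERE {key_col} = ?", (key,)
    ).fetchone()
    if row is None:
        return []  # no such row, nothing to update
    changed = [c for c, old in zip(cols, row) if old != new_values[c]]
    if changed:
        assignments = ", ".join(f"{c} = ?" for c in changed)
        conn.execute(
            f"UPDATE {table} SET {assignments} WHERE {key_col} = ?",
            [new_values[c] for c in changed] + [key],
        )
    return changed


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE taxon (taxon_id INTEGER PRIMARY KEY,"
    " left_value INTEGER, right_value INTEGER)"
)
conn.execute("INSERT INTO taxon (taxon_id) VALUES (12335)")

# Columns are already NULL, so writing NULL again issues no UPDATE at all.
first = update_changed(conn, "taxon", "taxon_id", 12335,
                       {"left_value": None, "right_value": None})
# Both values differ from what is stored, so both columns are written.
second = update_changed(conn, "taxon", "taxon_id", 12335,
                        {"left_value": 4, "right_value": 5})
# Only right_value differs, so only that column is written.
third = update_changed(conn, "taxon", "taxon_id", 12335,
                       {"left_value": 4, "right_value": 6})
```

Note the trade-off: every call costs an extra SELECT, so this only pays
off when skipped writes are much more expensive than the added reads.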

When the left and right columns have to be reset to NULL during an
update of an existing database, I think it would be much faster
to drop the columns and re-create them with NULL values.


I think it would be worth investigating whether the empty taxon and
taxon_name tables could be created as MyISAM tables and converted
into InnoDB tables only after all the imports and updates are done.
One would probably have to think a bit more about the foreign
keys, but it might be that they are not even lost during the conversion
back and forth.

Actually, this is easy to check. Dump your current taxon and taxon_name
tables (maybe even without the row data, using --no-data), run
'ALTER TABLE taxon ... type=MyISAM'
followed by
'ALTER TABLE taxon ... type=InnoDB',
then dump the database structure again and compare it by diff with
the original.

But, time for sleep here.
Martin


From sdavis2 at mail.nih.gov  Fri Apr 11 23:50:44 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 11 Apr 2008 23:50:44 -0400
Subject: [Bioperl-l] Bio::ASN1::EntrezGene parse so slowly?
In-Reply-To: <16602210.post@talk.nabble.com>
References: <16602210.post@talk.nabble.com>
Message-ID: <264855a00804112050gf785c2ei66d9c7463597eccd@mail.gmail.com>

gene_info is a tab-delimited text file, if I recall correctly.  Have
you looked at it?  If it is, you should be able to parse it in a few
seconds with just a couple lines of code.
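
As a rough sketch of what "a couple lines of code" might look like
(shown in Python; a Perl loop using split /\t/ would be equivalent).
It assumes the first three columns are tax_id, GeneID and Symbol and
that header lines start with '#', so check the header of your own file.
read_gene_info() is a hypothetical helper name and the sample rows are
illustrative only.

```python
import csv
import io


def read_gene_info(handle):
    """Yield (tax_id, gene_id, symbol) from a tab-delimited gene_info handle."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        # Skip blank lines and '#'-prefixed header/comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        yield fields[0], fields[1], fields[2]


# Tiny in-memory sample standing in for the real ~500 MB file.
sample = io.StringIO("#tax_id\tGeneID\tSymbol\n9606\t1\tA1BG\n9606\t2\tA2M\n")
genes = list(read_gene_info(sample))

# For the real file, stream it instead of building a list:
#   with open("gene_info", newline="") as fh:
#       for tax_id, gene_id, symbol in read_gene_info(fh):
#           ...
```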

Sean


On Thu, Apr 10, 2008 at 1:08 AM, zoujing <1zoujing at 163.com> wrote:
>
>   I want to parse the file "gene_info" from NCBI. The format of Gene in NCBI is
>  ASN.1, right? So I used Bio::ASN1::EntrezGene, but it either does not work
>  properly or is too slow. The file is about 500 MB.
>   The code is as follows:
>   use Bio::ASN1::EntrezGene;
>   my $parser = Bio::ASN1::EntrezGene->new('file' => $ARGV[0]);
>   my $i = 0;
>   while (my $result = $parser->next_seq) {
>       last;  # real work would go here; "last" is only used for testing
>   }
>
>   When it reaches the "while" loop, it keeps processing on and on and never
>  exits, even though I used "last" inside the loop.
>    So I wonder whether it is just too slow, whether the module is not suited
>  to this job, or whether I did something wrong?
>
>   Thank you!


From david at burt7259.freeserve.co.uk  Sat Apr 12 13:01:57 2008
From: david at burt7259.freeserve.co.uk (David Burt)
Date: Sat, 12 Apr 2008 18:01:57 +0100
Subject: [Bioperl-l] bioperl-db
Message-ID: <BFCB174E-B59E-4249-BDF8-4B0F2E2273C9@burt7259.freeserve.co.uk>

Hi Hilmar,

Hope you can help - I am using bioperl-db to create a biosql database

I have used the scripts load_seqdatabase.pl and load_ontology.pl to
install human swissprot entries, gene ontology, sequence ontology and
now want to load interpro

Here's the command line I have tried

perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser root \
  --dbpass chicken --driver mysql \
  --namespace "InterPro" --format InterPro interpro.xml

But I get this message

Can't call method "identifier" on an undefined value at /cygdrive/c/Bioinformatics/Ensembl/src/bioperl-live/Bio/Ontology/SimpleOntologyEngine.pm line 395

Any ideas?

Dave

PS: here's the top of the interpro.xml file

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE interprodb SYSTEM "interpro.dtd">


<interprodb>
     <release>
       <dbinfo dbname="PANTHER" version="6.1" entry_count="30128" file_date="04-OCT-2006 00:00:00" />
       <dbinfo dbname="PFAM" version="21.0" entry_count="8957" file_date="22-NOV-2006 00:00:00" />
       <dbinfo dbname="PIRSF" version="2.70" entry_count="2877" file_date="12-JUN-2007 00:00:00" />
       <dbinfo dbname="PRINTS" version="38.0" entry_count="1900" file_date="22-SEP-2005 00:00:00" />
       <dbinfo dbname="PRODOM" version="2005.1" entry_count="1522" file_date="23-APR-2004 00:00:00" />
       <dbinfo dbname="PROSITE" version="20.0" entry_count="2006" file_date="14-NOV-2006 00:00:00" />
       <dbinfo dbname="SMART" version="5.1" entry_count="724" file_date="27-JUL-2007 00:00:00" />
       <dbinfo dbname="TIGRFAMs" version="7.0" entry_count="3423" file_date="28-SEP-2007 00:00:00" />
       <dbinfo dbname="GENE3D" version="3.0.0" entry_count="2147" file_date="11-SEP-2006 00:00:00" />
       <dbinfo dbname="SSF" version="1.69" entry_count="1538" file_date="30-NOV-2006 00:00:00" />
       <dbinfo dbname="SWISSPROT" version="55.1" entry_count="359942" file_date="18-MAR-2008 00:00:00" />
       <dbinfo dbname="TREMBL" version="38.1" entry_count="5443281" file_date="18-MAR-2008 00:00:00" />
       <dbinfo dbname="INTERPRO" version="17.0" entry_count="16175" file_date="19-MAR-2008 00:00:00" />
       <dbinfo dbname="GO" version="N/A" entry_count="23937" file_date="27-MAR-2007 00:00:00" />
       <dbinfo dbname="MEROPS" version="7.8" entry_count="2831" file_date="12-JUL-2007 16:56:17" />
     </release>
   <interpro id="IPR000001" type="Domain" short_name="Kringle" protein_count="352">
     <name>Kringle</name>
     <abstract>

  
From hlapp at gmx.net  Sat Apr 12 14:10:44 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 14:10:44 -0400
Subject: [Bioperl-l] personal vs list email
Message-ID: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>

I'm not sure why but I have received several Bioperl or BioSQL- 
related email inquiries directed to me *personally* over the past few  
weeks.

I have been responding as I get to them, but I feel that I am doing  
both the senders and this community a poor service, because sometimes  
someone else on the list could have responded much faster, and when I  
respond, others on the list who happen to be interested in the same  
question don't get to see the answer.

So from now on, as a policy, I will redirect *every* email that is sent
to me personally and asks a question related to one of the projects to
the respective mailing list. If you don't want this, please
conspicuously say so at the top of your email, and in that case, if
you do ask a project-related question, be prepared to wait and
possibly to have to follow up.

As an aside, it's a pretty safe assumption to make that all other  
core developers, and quite possibly *all* developers are following a  
similar policy, whether expressly or not.

Isn't this somewhere in the FAQ too?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Sat Apr 12 14:16:13 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 14:16:13 -0400
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
Message-ID: <C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>

Hi Burt,

can you try format interprosax instead of interpro? That variant is  
also much more graceful regarding required space.

	-hilmar

On Apr 12, 2008, at 1:01 PM, David Burt wrote:

> Hi Hilmar,
>
> Hope you can help - I am using bioperl-db to create a biosql database
>
> I have used the scripts load_seqdatabase.pl and load_ontology.pl to
> install human swissprot entries, gene ontology, sequence ontology
> and now want to load interpro
>
> Here's the command line I have tried
>
> perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser root \
>   --dbpass chicken --driver mysql \
>   --namespace "InterPro" --format InterPro interpro.xml
>
> But I get this message
>
> Can't call method "identifier" on an undefined value at /cygdrive/c/Bioinformatics/Ensembl/src/bioperl-live/Bio/Ontology/SimpleOntologyEngine.pm line 395
>
> Any ideas?
>
> Dave

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Apr 12 16:17:43 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 12 Apr 2008 15:17:43 -0500
Subject: [Bioperl-l] [BioSQL-l] personal vs list email
In-Reply-To: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>
References: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>
Message-ID: <E7962E90-8309-4ADA-B002-950793B61D74@uiuc.edu>


On Apr 12, 2008, at 1:10 PM, Hilmar Lapp wrote:

> I'm not sure why but I have received several Bioperl or BioSQL- 
> related email inquiries directed to me *personally* over the past  
> few weeks.
>
> I have been responding as I get to them, but I feel that I am doing  
> both the senders and this community a poor service, because  
> sometimes someone else on the list could have responded much faster,  
> and when I respond, others on the list who happen to be interested  
> in the same question don't get to see the answer.
>
> So from now on, as a policy, I will redirect *every* email that is sent
> to me personally and asks a question related to one of the projects
> to the respective mailing list. If you don't want this, please
> conspicuously say so at the top of your email, and in that case, if
> you do ask a project-related question, be prepared to wait and
> possibly to have to follow up.
>
> As an aside, it's a pretty safe assumption to make that all other  
> core developers, and quite possibly *all* developers are following a  
> similar policy, whether expressly or not.

I agree; I'm sure several other core devs feel the same way.  I always  
try to forward these to the list if I feel it is more relevant there.

> Isn't this somewhere in the FAQ too?
>
> 	-hilmar

No, but I've added it to the bioperl FAQ; might be worth checking over  
and editing.

chris


From hlapp at gmx.net  Sat Apr 12 18:40:53 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 18:40:53 -0400
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <000001c89ce2$5400a710$0202a8c0@STUDYPC>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce2$5400a710$0202a8c0@STUDYPC>
Message-ID: <3F77F49A-9C9E-4450-AE28-46F00CADBC8B@gmx.net>

Burt - please keep your replies on the list. Others may have input  
too, or benefit from the answer too.

As there is no name() method call on line 914 in the current version,
let's first check that you are running a current version of BioPerl. It
will need to be at least 1.5.2.

However, I do suspect a problem in either the InterPro file itself  
(wouldn't be the first time), or the InterPro parser.

	-hilmar

On Apr 12, 2008, at 5:15 PM, David Burt wrote:

> Hilmar
>
> Many thanks seems to be working
>
> But got this output - any comments/ideas what it means?
>
> Dave
>
>
> perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser  
> root --dbpass chicken --driver mysql \
> > --namespace "InterPro" --format interprosax interpro.xml
>         ...deleting all relationships for InterPro
>         ...parsing and loading InterPro
> Can't call method "name" on an undefined value at load_ontology.pl  
> line 914.
>
> HERE'S the name and definition in the ontology table
>
> Name = InterPro
>
> Definition =
>
> PANTHER version 6.1, 30128 entries, 04-OCT-2006
> PFAM version 21.0, 8957 entries, 22-NOV-2006
> PIRSF version 2.70, 2877 entries, 12-JUN-2007
> PRINTS version 38.0, 1900 entries, 22-SEP-2005
> PRODOM version 2005.1, 1522 entries, 23-APR-2004
> PROSITE version 20.0, 2006 entries, 14-NOV-2006
> SMART version 5.1, 724 entries, 27-JUL-2007
> TIGRFAMs version 7.0, 3423 entries, 28-SEP-2007
> GENE3D version 3.0.0, 2147 entries, 11-SEP-2006
> SSF version 1.69, 1538 entries, 30-NOV-2006
> SWISSPROT version 55.1, 359942 entries, 18-MAR-2008
> TREMBL version 38.1, 5443281 entries, 18-MAR-2008
> INTERPRO version 17.0, 16175 entries, 19-MAR-2008
> GO version N/A, 23937 entries, 27-MAR-2007
> MEROPS version 7.8, 2831 entries, 12-JUL-2007 |

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Sat Apr 12 18:43:25 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 18:43:25 -0400
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
Message-ID: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>

I'm not sure what you mean by 'Check interpro.xml', but you can use  
the --safe command-line option to keep going if an individual term  
fails to load for whatever reason.

Can you post the data for the seemingly offending record? (and please  
cc the list)

	-hilmar

On Apr 12, 2008, at 5:39 PM, David Burt wrote:

> Hi Hilmar
>
> Just checked mysql database and only have 39 entries under interpro  
> and loaded up to IPR000035
>
> Check interpro.xml looks OK from IPR000036 and onwards
>
> So seems to have crashed at IPR000035 ?
>
> dave

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From Russell.Smithies at agresearch.co.nz  Sun Apr 13 22:51:41 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 14 Apr 2008 14:51:41 +1200
Subject: [Bioperl-l] Tandem Repeats Finder?
In-Reply-To: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC><C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net><000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
	<FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06BEA87E@imail.agresearch.co.nz>

Has anyone tried TRF? 
I notice UCSC is using it for all their simple repeat annotations and thought it might be better than what we're currently using (Sputnik)

And is there a BioPerl parser for its output or am I going to have to write my own?

Thanx,


Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E  russell.smithies at agresearch.co.nz 

Invermay Research Centre 
Puddle Alley, 
Mosgiel, 
New Zealand 
T  +64 3 489 3809 
F  +64 3 489 9174 
www.agresearch.co.nz 


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From Russell.Smithies at agresearch.co.nz  Sun Apr 13 22:53:46 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 14 Apr 2008 14:53:46 +1200
Subject: [Bioperl-l] Tandem Repeats Finder?
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C03B09DE9@imail.agresearch.co.nz>
References: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
	<D5DBA313349A4B458528BE63B387F36C03B09DE9@imail.agresearch.co.nz>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06BEA881@imail.agresearch.co.nz>

Scratch the need for a parser.
I turned off html output and it's all nice white-space separated text  :-)

Russell

> -----Original Message-----
> From: Smithies, Russell
> Sent: Monday, 14 April 2008 2:52 p.m.
> To: 'Bioperl BioPerl'
> Subject: Tandem Repeats Finder?
> 
> Has anyone tried TRF?
> I notice UCSC is using it for all their simple repeat annotations and thought it might
> be better than what we're currently using (Sputnik)
> 
> And is there a BioPerl parser for its output or am I going to have to write my own?
> 
> Thanx,


From csaba.ortutay at gmail.com  Mon Apr 14 00:15:22 2008
From: csaba.ortutay at gmail.com (Ortutay Csaba Péter)
Date: Mon, 14 Apr 2008 07:15:22 +0300
Subject: [Bioperl-l] Tandem Repeats Finder?
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06BEA87E@imail.agresearch.co.nz>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
	<D5DBA313349A4B458528BE63B387F36C06BEA87E@imail.agresearch.co.nz>
Message-ID: <200804140715.22702.csaba.ortutay@gmail.com>

Hello, I have used TRF in my earlier projects. It is a nice and quick tool.

There were no ready-made parsers at that time (5-6 years ago), so we
wrote our own.

Csaba

> Has anyone tried TRF?
> I notice UCSC is using it for all their simple repeat annotations and
> thought it might be better than what we're currently using (Sputnik)
>
> And is there a BioPerl parser for its output or am I going to have to
> write my own ?
>
> Thanx,


-- 
Csaba Ortutay PhD
IMT Bioinformatics
University of Tampere
Finland


From avilella at gmail.com  Mon Apr 14 07:13:26 2008
From: avilella at gmail.com (Albert Vilella)
Date: Mon, 14 Apr 2008 12:13:26 +0100
Subject: [Bioperl-l] how can I print a Bio::Tree newick sortby given list?
Message-ID: <358f4d650804140413x4271f18bx40af1b9054306df8@mail.gmail.com>

Hi,

I have a newick file that I want to sort by a given order and print again as
newick.
For example, if I have

(((ENSPTRG00000013811:0.0011,ENSG00000142192:0.0021):0.0033,ENSPPYG00000003902:0.0326):0.0000,ENSMMUG00000014384:0.0366):0.3638;

I want to sort it by "ENSG:ENSPTRG:ENSPPYG:ENSMMUG".

Any suggestions on how to do this in bioperl?

Cheers,

    Albert.


From lamq at usal.es  Mon Apr 14 11:01:51 2008
From: lamq at usal.es (Luis A. M. Quintales)
Date: Mon, 14 Apr 2008 17:01:51 +0200
Subject: [Bioperl-l] xyplot glyph: scale problems
Message-ID: <480371DF.7040900@usal.es>

I have a problem with the xyplot scale numbers calculated by the glyph.

The shape of the graph looks fine, but the scale number 10 and its
position in the output are not correct.

I am sending the source code, a simplified input file and the PNG output.

Thank you


Source code

ex1.pl  (also in http://avellano.usal.es/~luis/bioperl-l/ex1.pl)
============================
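#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::GFF;
use Bio::Graphics::Panel;

# GFF input file (e.g. in1.gff), loaded into an in-memory database
my $filin = $ARGV[0] or die "Usage: $0 in1.gff\n";
my $db = Bio::DB::GFF->new( -dsn => $filin, -adaptor => 'memory',
                            -aggregator => 'at{atpc:atfreq}' );
my $segment  = $db->segment('chr1');
my @features = $segment->features('at');
my $panel = Bio::Graphics::Panel->new(
       -offset    => 0,   -grid      => 100,
       -length    => 500, -width     => 800,
       -pad_left  => 50,  -pad_right => 50 );
# Track 1: the segment itself; track 2: the AT percentage as an xyplot
$panel->add_track($segment, -glyph   => 'generic',
                            -bgcolor => 'blue', -label => 1);
$panel->add_track(\@features,
                    -glyph      => 'xyplot',
                    -graph_type => 'boxes',
                    -scale      => 'left',
                    -height     => 200,
 );
# Write the rendered panel to sal.png (the posted listing stopped at open)
open(my $fh, '>', 'sal.png') or die "Cannot write sal.png: $!";
binmode $fh;
print {$fh} $panel->png;
close $fh;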
============================

in1.gff file (also in http://avellano.usal.es/~luis/bioperl-l/in1.gff)
============================
##sequence-region chr1 1 5578650
chr1    atfreq    atpc    1    10       64.0000    .    .    atpc 1
chr1    atfreq    atpc    11    20       63.0000    .    .    atpc 1
chr1    atfreq    atpc    21    30       62.0000    .    .    atpc 1
chr1    atfreq    atpc    31    40       59.0000    .    .    atpc 1
chr1    atfreq    atpc    41    50       59.0000    .    .    atpc 1
chr1    atfreq    atpc    51    60       59.0000    .    .    atpc 1
chr1    atfreq    atpc    61    70       59.0000    .    .    atpc 1
chr1    atfreq    atpc    71    80       59.0000    .    .    atpc 1
chr1    atfreq    atpc    81    90       61.0000    .    .    atpc 1
chr1    atfreq    atpc    91    100       60.0000    .    .    atpc 1
chr1    atfreq    atpc    101    110       60.0000    .    .    atpc 1
chr1    atfreq    atpc    111    120       64.0000    .    .    atpc 1
chr1    atfreq    atpc    121    130       64.0000    .    .    atpc 1
chr1    atfreq    atpc    131    140       60.0000    .    .    atpc 1
chr1    atfreq    atpc    141    150       60.0000    .    .    atpc 1
chr1    atfreq    atpc    151    160       63.0000    .    .    atpc 1
chr1    atfreq    atpc    161    170       62.0000    .    .    atpc 1
chr1    atfreq    atpc    171    180       59.0000    .    .    atpc 1
chr1    atfreq    atpc    181    190       54.0000    .    .    atpc 1
chr1    atfreq    atpc    191    200       53.0000    .    .    atpc 1
chr1    atfreq    atpc    201    210       54.0000    .    .    atpc 1
chr1    atfreq    atpc    211    220       50.0000    .    .    atpc 1
chr1    atfreq    atpc    221    230       51.0000    .    .    atpc 1
chr1    atfreq    atpc    231    240       56.0000    .    .    atpc 1
chr1    atfreq    atpc    241    250       58.0000    .    .    atpc 1
chr1    atfreq    atpc    251    260       55.0000    .    .    atpc 1
chr1    atfreq    atpc    261    270       54.0000    .    .    atpc 1
chr1    atfreq    atpc    271    280       56.0000    .    .    atpc 1
chr1    atfreq    atpc    281    290       59.0000    .    .    atpc 1
chr1    atfreq    atpc    291    300       58.0000    .    .    atpc 1
chr1    atfreq    atpc    301    310       60.0000    .    .    atpc 1
chr1    atfreq    atpc    311    320       59.0000    .    .    atpc 1
chr1    atfreq    atpc    321    330       59.0000    .    .    atpc 1
chr1    atfreq    atpc    331    340       57.0000    .    .    atpc 1
chr1    atfreq    atpc    341    350       56.0000    .    .    atpc 1
chr1    atfreq    atpc    351    360       57.0000    .    .    atpc 1
chr1    atfreq    atpc    361    370       57.0000    .    .    atpc 1
chr1    atfreq    atpc    371    380       58.0000    .    .    atpc 1
chr1    atfreq    atpc    381    390       56.0000    .    .    atpc 1
chr1    atfreq    atpc    391    400       58.0000    .    .    atpc 1
chr1    atfreq    atpc    401    410       56.0000    .    .    atpc 1
chr1    atfreq    atpc    411    420       59.0000    .    .    atpc 1
chr1    atfreq    atpc    421    430       58.0000    .    .    atpc 1
chr1    atfreq    atpc    431    440       59.0000    .    .    atpc 1
chr1    atfreq    atpc    441    450       58.0000    .    .    atpc 1
chr1    atfreq    atpc    451    460       58.0000    .    .    atpc 1
chr1    atfreq    atpc    461    470       56.0000    .    .    atpc 1
chr1    atfreq    atpc    471    480       57.0000    .    .    atpc 1
chr1    atfreq    atpc    481    490       59.0000    .    .    atpc 1
============================


The sal.png :
http://avellano.usal.es/~luis/bioperl-l/sal.png

Thank you.


-- 
==================================================
 Luis Antonio Miguel Quintales
 Departamento de Informática y Automática
 Facultad de Ciencias
 Universidad de Salamanca
 Plaza de la Merced s/n
 37008-SALAMANCA
 SPAIN
==================================================
 Tel.: +34-923-294400(ext.1513)
 Fax.: +34-923-294584
 E-mail: lamq at usal.es
==================================================


From aaron.j.mackey at gsk.com  Mon Apr 14 09:00:52 2008
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Mon, 14 Apr 2008 09:00:52 -0400
Subject: [Bioperl-l] personal vs list email
In-Reply-To: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>
Message-ID: <OF3ED0BD19.1CBA005A-ON8525742B.00473A95-8525742B.00477DEC@gsk.com>

I try to take it even one step further: I require the person to re-ask 
their question on the mailing list (and then try to answer it there). This 
has the added benefit of causing the person to pause a moment to reflect 
on their question, and (sometimes) to spend a bit more time preparing the 
question for broader public consumption.

-Aaron


From sutripa at vbi.vt.edu  Mon Apr 14 12:54:47 2008
From: sutripa at vbi.vt.edu (Sucheta Tripathy)
Date: Mon, 14 Apr 2008 12:54:47 -0400 (EDT)
Subject: [Bioperl-l] Error installing XML::Parser
Message-ID: <1285.99.152.150.87.1208192087.squirrel@webmail.vbi.vt.edu>


Hello List,

I have recently installed bioperl using the following command. The
installation was successful. Now I am trying to install XML::Parser but it
returns with  error messages. Any clue what I may be doing wrong?

Thanks

Sucheta

Following is the last part of the error message:

### Error Message #######

Expat.c: In function 'XS_XML__Parser__Expat_SkipUntil':
Expat.c:2664: error: 'XML_Parser' undeclared (first use in this function)
Expat.c:2664: error: expected ';' before 'parser'
Expat.c:2665: warning: ISO C90 forbids mixed declarations and code
Expat.xs:2179: error: 'parser' undeclared (first use in this function)
Expat.xs:2179: warning: cast to pointer from integer of different size
Expat.xs:2180: error: 'CallbackVector' has no member named 'st_serial'
Expat.xs:2182: error: 'CallbackVector' has no member named 'skip_until'
Expat.c: In function 'XS_XML__Parser__Expat_Do_External_Parse':
Expat.c:2687: error: 'XML_Parser' undeclared (first use in this function)
Expat.c:2687: error: expected ';' before 'parser'
Expat.c:2688: warning: ISO C90 forbids mixed declarations and code
Expat.xs:2194: error: 'parser' undeclared (first use in this function)
Expat.xs:2194: warning: cast to pointer from integer of different size
Expat.xs:2205: warning: unused variable 'pret'
Expat.xs:2194: warning: unused variable 'cbv'
Expat.xs:2192: warning: unused variable 'type'
make[1]: *** [Expat.o] Error 1
make[1]: Leaving directory `/root/.cpan/build/XML-Parser-2.36/Expat'
make: *** [subdirs] Error 2
  /usr/bin/make  -- NOT OK
Running make test
  Can't test without successful make
Running make install
  make had returned bad status, install seems impossible

#####

-- 
Sucheta Tripathy, Ph.D.
Virginia Bioinformatics Institute Phase-I
Washington street.
Virginia Tech.
Blacksburg,VA 24061-0447
phone:(540)231-8138
Fax:  (540) 231-2606


From mmokrejs at ribosome.natur.cuni.cz  Tue Apr 15 06:45:48 2008
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Tue, 15 Apr 2008 12:45:48 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <CA410982-12F9-4289-8B54-87BE33A38085@uiuc.edu>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>	<47F9F3AA.2090003@uv.es>
	<200804071448.34769.heikki@sanbi.ac.za>	<2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>	<47FA4AD2.5030206@uv.es>
	<CA410982-12F9-4289-8B54-87BE33A38085@uiuc.edu>
Message-ID: <4804875C.80506@ribosome.natur.cuni.cz>

Chris Fields wrote:
> Note in the example I gave that, during the revision history, the 
> DBSOURCE changed at the point of the creation date (the original nuc.
>  record was a M. tuberculosis contig sequence, which later changed to
> an updated full M. tuberculosis genome record at the time of the
> 'create date').
> 
> Couldn't find anything specific in the GenBank docs on this, but it 
> appears (at least for a protein record) the creation date reflects
> the date in which the sequence was either originally deposited or
> originally derived from the nucleotide source record present in the
> record.  In other words, it may not reflect the original date of
> deposition (which could have come from a different record, as in this
> case).
> 
> chris

Hi,
I have a few answers from NCBI staff to similar questions I asked in the
past about DATE issues and VERSION numbers not being increased upon
"changes" in a record.
Below I have tried to put my former correspondence into a more readable form.
I hope this helps everybody understand what happens in the black box. ;)
Martin


Date: Thu, 17 Jan 2002 15:40:07 -0500 (EST)
From: David Wheeler
Subject: Brucella_melitensis on ftp site

> Hi, I'd like to point you to the fact, that the descriptions of 
> Brucella_melitensis differ in 
> ftp.ncbi.nih.nlm.gov/genomes/Bacteria/Brucella_melitensis and 
> ftp.ncbi.nih.nlm.gov/genbank/genomes/Bacteria/Brucella_melitensis
> 
> Namely, the description of the strain is retained in *.gbk files
> under /genomes/Bacteria/Brucella_melitensis only under the strain
> description field, but not in the DEFINITION line, where it is
> present in *.gbk files under
> /genbank/genomes/Bacteria/Brucella_melitensis.
> 
> LOCUS       NC_003318 1177787 bp    DNA   circular  BCT       13-NOV-2001
> DEFINITION  Brucella melitensis chromosome II, complete sequence.
> ACCESSION   NC_003318
> VERSION     NC_003318.1  GI:17988344
> 
> compared to
> 
> LOCUS       AE008918  1177787 bp    DNA   circular  BCT       27-DEC-2001
> DEFINITION  Brucella melitensis strain 16M chromosome II, complete sequence.
> ACCESSION   AE008918
> VERSION     AE008918
> 
> This makes me worried about the data. Why is the release date of the
> NON-curated file (AE008918) newer than the release date of the CURATED
> one (NC_003318)? Is this expected? Could someone explain the
> difference between them (i.e. CURATED vs. NON-CURATED)?

The curated record is initially a copy of the non-curated record with certain 
changes in documentation made in order to comply with the NCBI standard for 
reference genomes. One change which you have noticed is the difference in 
Definition line format.  Curated genomic records are created in order to 
standardize annotation for genomes in the Entrez Genomes database while leaving 
editorial control for the parent GenBank records in the hands of the original 
submitters.

Regardless of the date you see on the record, the curated version is derived from 
the non-curated one.  In this case, it appears that the processing of the 
non-curated version lagged a little bit relative to that of the curated version. 
Normally, however, the non-curated version will have the earlier date.


Date: Sun, 27 Jan 2002 00:16:55 -0500 (EST)
From: David Wheeler
Subject: Re: CONSULT: Brucella_melitensis on ftp site

> Are the raw sequence data always same in non-curated and curated 
> flatfiles?
> 
> Is the annotation of orf's/proteins different between them?
> 
> Are there any new or withdrawn orf's or proteins in the curated
> flatfiles compared to non-curated ones?
> 
> My feeling is that no-one except original submitters can modify
> submitted data, so you cannot modify non-curated files, i.e. cannot
> modify them and increase the version number.
> 
> Because of that, you've introduced curated versions, which are just
> copies of original but public data so you are free to modify it. So
> once again, are the differences between non-curated and curated
> flatfiles only in structure of the file? I don't think so. Examples
> would be Listeria genomes or the 2 Agrobacterium's, if I remember
> right.

Initially, there should be no or very few differences, however, as time
goes by, differences in the annotation will materialize.  There may also
be differences in the sequence, if errors in the original sequence come to
light, but these differences should be very rare.

So, practically speaking, you will probably find few differences but,
since the purpose of the Refseq is to curate, there may well be some
differences.


Date: Mon, 17 Dec 2001 11:57:06 -0500 (EST)
From: Dawn Lipshultz
Subject: Re: Buggy date in Staphylococcus aureus N315

>>>> Hi, I've found that Staphylococcus aureus N315 was released
>>>> on 01-JAN-1900, which is nonsense. I guess you had a Y2K bug.
>>>> 
>>>> 
>>>> Please see
>>>> 
>> ftp://ncbi.nlm.nih.gov/genbank/genomes/Bacteria/Staphylococcus_aureus_N315/BA000018.gbk
>> 
>>>> 
>>>> Can you please tell me the real release date?
>>>> 
>>>> Also, which is newer: the NC_xxxx for Staphylococcus aureus N315 under
>>>>  
>>>> ftp://ncbi.nlm.nih.gov/genomes/Bacteria/Staphylococcus_aureus_N315/
>>>>  or this BA000018 non-curated version?
>>>> 
>>>> 
>>>> LOCUS       BA000018  2814816 bp    DNA   circular  BCT       01-JAN-1900
>>>> DEFINITION  Staphylococcus aureus strain N315, complete genome.

>> I cannot find the record to which you refer in your message. When I
>>  did a search for accession number BA000018, I received results for
>>  accession numbers AP003129-AP003138. They are all dated June 2001.
>> 
>> 
>> The date for the record in the ftp file is April 2001. The record
>> in GenBank (NC_002745) is dated October 2001. This version is
>> apparently more updated than the one on the ftp site. Therefore,
>> you may want to download the sequence from GenBank rather than the
>> ftp site. Regards, Dawn S. Lipshultz

> 
> Hmm, but I do get: 
> http://www.ncbi.nlm.nih.gov:80/cgi-bin/Entrez/framik?db=genome&gi=179
> 
> 
> look at the "GenBank: NC_002745" text in left upper part of the
> window, it points to that OLD ftp file. The "RefSeq: NC_002745"
> points to the April 2001 version. So what is the right way to get the
> October 2001 release?
> 
> Where can I find the difference between NC_002745 from April compared
>  to NC_002745 from October?
> 
> What do you mean with "you may want to download the sequence from 
> GenBank rather than the ftp site."?
> 
> BOTH ftp directories at ftp://ncbi.nlm.nih.gov are outdated. I mean 
> the genomes/Bacteria/Staphylococcus_aureus_N315/NC_002745.* version 
> and also the 
> genbank/genomes/Bacteria/Staphylococcus_aureus_N315/BA000018.* 
> version.
> 
> The web links from www.ncbi.nlm.nih.gov:80/cgi-bin/Entrez/ point 
> anyway to the ftp site. Do you want to say that the ftp version
> aren't updated anymore?

The genome was originally released into the database on 4/20/2001
as 10 pieces with secondary accession number BA000018.  You can 
find these pieces in Entrez nucleotides by querying with BA000018.

The Genomes group here will fix the date on the record that is available
from Entrez genomes.

Regards,
Dawn


Date: Fri, 16 Nov 2001 16:09:59 -0500 (EST)
From: Susan Dombrowski
Subject: Re: Agrobacterium tumefaciens C58

> Dear colleague, I've noticed that the genomic flatfiles of
> Agrobacterium tumefaciens C58 at
> ftp://ncbi.nlm.nih.gov/genbank/genomes/Bacteria/Agrobacterium_tumefaciens/
> were somehow updated on Oct 17.
> However, AE007869.gbs, for example, does NOT explain what has
> been changed, and the VERSION number has not been increased. Would you
> please explain what the change is, and where I can find such information
> on the web next time?
> 
> I've used the published sequence from your ftp site from 2001-08-29
> with the same ID and would like to know what differs.
> 
> LOCUS       AE007869  2841581 bp    DNA   circular  CON       17-OCT-2001
> DEFINITION  Agrobacterium tumefaciens strain C58 circular chromosome,
>             complete sequence.
> ACCESSION   AE007869
> VERSION     AE007869

Dear Colleague,
The version number of a sequence will *only* change if the content of the actual 
sequence has changed in any way since it was first made available. Although the 
date has changed, this date refers to the last time the actual record was 
manipulated by an NCBI staff member. Even if there is something simple, like 
adding a reference, changing a spelling mistake, etc., this will cause a change 
in the date field of the record. 

Thus, since the version has not changed, there are no differences to report.
Best Regards,
Susan


Date: Wed, 26 Jun 2002 11:04:48 -0400 (EDT)
From: Eric Sayers
Subject: Re: Mesorhizobium_loti flatfiles

>>>>> Hi,
>>>>>   I've found that some time ago you again silently changed flatfiles
>>>>> on your ftp site without changing the revision number. Please forgive me,
>>>>> but this really causes trouble for other people working in so-called
>>>>> bioinformatics. :(
>>>>> 
>>>>> A week ago there was:
>>>>> 
>>>>> LOCUS       NC_002678            7036074 bp    DNA     circular BCT 10-SEP-2001
>>>>> DEFINITION  Mesorhizobium loti, complete genome.
>>>>> ACCESSION   NC_002678
>>>>> VERSION     NC_002678.1  GI:13470324
>>>>> 
>>>>> 
>>>>> and two other plasmid sequences. This yields 7275 proteins.
>>>>> 
>>>>> But, last autumn there was:
>>>>> 
>>>>> LOCUS       NC_002678 7036074 bp    DNA   circular  BCT       28-MAR-2001
>>>>> DEFINITION  Mesorhizobium loti, complete genome.
>>>>> ACCESSION   NC_002678
>>>>> VERSION     NC_002678.1  GI:13470324
>>>>> 
>>>>> 
>>>>> That version had 7281 proteins in total.
>>>>> I have a simple question: why was the VERSION number NOT changed?
>>>>>
>>>>> Am I wrong in thinking that it should be updated whenever a single
>>>>> character in the file contents is changed?
>>> 
>>>> The version number of a sequence only changes if the sequence itself is
>>>> modified. If anything else in the flat file is changed (ie spelling, authors,
>>>> annotations, etc) the version will not change. However, the modification date in
>>> 
>>> Sorry, by annotation do you also mean the number of predicted genes,
>>> their coordinates (positions) etc.?
>>> 
>>>> the top line of the flat file will change for any of these modifications. (Note
>>>> that the dates are different in the file you display: Mar 28, 2001 vs Sept 10,
>>>> 2001.) I would track the modification date rather than or as well as the version
>>>> number to catch all changes in the files.
>>>> Regards,
>>>> Eric W. Sayers, Ph.D.
>>> 
>>> OK, but unless some of our programs were buggy before or are buggy now
>>> (and in either case failed to extract genes from the flatfiles), I do
>>> not have an explanation for the differences in the number of
>>> predicted/annotated genes.
>>> 
>>> I no longer have the old flatfiles from Mar 28, but it
>>> seems to me that these were newly introduced in the Sept. 10 version:
>>> gi_15600768, gi_15600770, gi_15600769, gi_15600766, gi_15600767
>> 
>> Dear Colleague,
>> Again, the only reason the version number will change is if the sequence itself 
>> changes. The number of annotated/predicted genes is merely an annotation on the 
>> sequence, and does not change the sequence itself. Therefore, the version will 
>> not change when the number of annotations changes. The modification date on the 
>> flat file will (and did) change, of course.
>> 
>> Regards,
>> Eric W. Sayers, Ph.D.
> 
> Finally I've heard that from someone, thanks!
> Now just tell me, how can I figure out what changed between those
> different "date" releases? Is there a changelog available?
> I consider annotation changes very important.

We do not provide the details of flat file changes on our public websites, 
except for changes in the version number (ie actual sequence changes). In that 
particular case, all of the previous versions are linked to the current one. My 
advice to you if you want to chronicle non-sequence changes would be to check 
the flat files of interest periodically (by a script, for example) and look for 
changes in the modification dates. You could then simply compare the before and 
after flat files.

Regards,
Eric W. Sayers, Ph.D.
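
The advice above (track the modification date as well as the version
number) can be sketched as a small Perl script. This is only an
illustration, not an NCBI tool: the function names genbank_stamp and
compare_stamps are made up here, and the sketch assumes that the LOCUS
line ends with the DD-MON-YYYY date and that a VERSION line is present,
as in the records quoted in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pull the modification date (last field of the LOCUS line) and the
# accession.version from the text of a GenBank flat file header.
sub genbank_stamp {
    my ($text) = @_;
    my ($date)    = $text =~ /^LOCUS\s.*?(\d{2}-[A-Z]{3}-\d{4})\s*$/m;
    my ($version) = $text =~ /^VERSION\s+(\S+)/m;
    return { date => $date, version => $version };
}

# Classify the difference between an older and a newer copy of a record:
# a new version means the sequence changed; a new date alone means that
# something else in the record (annotation, references, ...) was edited.
sub compare_stamps {
    my ($old, $new) = @_;
    return 'sequence changed'
        if ($old->{version} // '') ne ($new->{version} // '');
    return 'record modified, sequence unchanged'
        if ($old->{date} // '') ne ($new->{date} // '');
    return 'unchanged';
}

# The two Mesorhizobium loti headers quoted earlier in this thread.
my $old = genbank_stamp(<<'END');
LOCUS       NC_002678 7036074 bp    DNA   circular  BCT       28-MAR-2001
DEFINITION  Mesorhizobium loti, complete genome.
ACCESSION   NC_002678
VERSION     NC_002678.1  GI:13470324
END

my $new = genbank_stamp(<<'END');
LOCUS       NC_002678            7036074 bp    DNA     circular BCT 10-SEP-2001
DEFINITION  Mesorhizobium loti, complete genome.
ACCESSION   NC_002678
VERSION     NC_002678.1  GI:13470324
END

print compare_stamps($old, $new), "\n";
```

For those two headers the script reports a modified record with an
unchanged sequence. Finding out what actually changed would still require
keeping the older flat file and comparing it with the newer one.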


> Hi, Miguel:
> 
> id1_fetch can do it. Detailed instruction can be found at:  
> 
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id1_fetch.html
> 
> Here is an example:
> 
>> >id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
> GI        Loaded      DB    Retrieval No.
> --        ------      --    -------------
> 74311105  12/07/2007  NCBI  19766263
> 74311105  01/23/2007  NCBI  16325656
> 74311105  03/30/2006  NCBI  13131204
> 74311105  03/03/2006  NCBI  12915541
> 74311105  03/02/2006  NCBI  12885275
> 74311105  12/03/2005  NCBI  12259793
> 74311105  09/09/2005  NCBI  11257262
> 74311105  09/09/2005  NCBI  11242667
> 
> Wenwu Cui PhD


From david at burt7259.freeserve.co.uk  Sun Apr 13 10:32:31 2008
From: david at burt7259.freeserve.co.uk (David Burt)
Date: Sun, 13 Apr 2008 15:32:31 +0100
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <3F77F49A-9C9E-4450-AE28-46F00CADBC8B@gmx.net>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce2$5400a710$0202a8c0@STUDYPC>
	<3F77F49A-9C9E-4450-AE28-46F00CADBC8B@gmx.net>
Message-ID: <000001c89d73$3b49eec0$0202a8c0@STUDYPC>

Hi Hilmar

Many thanks for the info. I tried a few things:

1. First I tried the --safe flag:

perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser root \
 --dbpass chicken --driver mysql --safe \
 --namespace "InterPro" --format interprosax interpro.xml

I still got the same output as before:

        ...deleting all relationships for InterPro
        ...parsing and loading InterPro

Can't call method "name" on an undefined value at load_ontology.pl line 914

Only 35 InterPro entries were entered into the database.

2. I am using bioperl 1.5.2

3. I downloaded Release 17.0 (20 March 2008) of the interpro.xml file from
ftp://ftp.ebi.ac.uk/pub/databases/interpro/

I did not send this file, since it was ~10 MB gzipped.

Dave

 
From david at burt7259.freeserve.co.uk  Sun Apr 13 10:53:43 2008
From: david at burt7259.freeserve.co.uk (David Burt)
Date: Sun, 13 Apr 2008 15:53:43 +0100
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
	<FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
Message-ID: <000001c89d76$319be060$0202a8c0@STUDYPC>

Hilmar

I also updated my copy of bioperl - see output below

root at STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src
$ perl -MBio::Perl -le 'print Bio::Perl->VERSION;'
1.005002101

root at STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src
$ cvs -d :pserver:cvs at cvs.bioperl.org:/home/repository/bioperl login
Logging in to :pserver:cvs at cvs.bioperl.org:2401/home/repository/bioperl
CVS password:

root at STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src
$ cd bioperl-live

root at STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src/bioperl-live
$ cvs -q update -d -P -r bioperl-release-1-5-2
P Build.PL
P ModuleBuildBioperl.pm
P Bio/Root/Version.pm
cvs update: warning: t/data/taxdump/names.dmp was lost
U t/data/taxdump/names.dmp
cvs update: warning: t/data/taxdump/nodes.dmp was lost
U t/data/taxdump/nodes.dmp

root at STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src/bioperl-live
$ perl -MBio::Perl -le 'print Bio::Perl->VERSION;'
1.0050021

Why is the VERSION 1.0050021 rather than 1.5.2 ?

 
Dave
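
As a side note on the question above: Perl's core version module can
convert such decimal version numbers into dotted form, which makes the
relationship to "1.5.2" easier to see. The sketch below only uses the
documented version->parse and normal methods; the helper name
dotted_form is made up for this example. The idea that the extra
trailing component corresponds to BioPerl's developer-release suffix
(as in 1.5.2_100) is an assumption, not something stated in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use version;

# Convert a decimal version string (three digits per component after the
# decimal point) into normalized dotted form, e.g. 1.002003 -> v1.2.3.
sub dotted_form {
    my ($decimal) = @_;
    return version->parse($decimal)->normal;
}

# The two values printed by Bio::Perl->VERSION in the session above.
for my $decimal ('1.005002101', '1.0050021') {
    printf "%-12s => %s\n", $decimal, dotted_form($decimal);
}
```

Both values normalize to v1.5.2 plus a fourth component (101 and 100
respectively), so the number reported is the decimal spelling of a 1.5.2
release rather than a different version.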


From heikki at sanbi.ac.za  Wed Apr 16 07:36:16 2008
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed, 16 Apr 2008 13:36:16 +0200
Subject: [Bioperl-l] bioperl-microarray: status?
In-Reply-To: <AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
Message-ID: <200804161336.16879.heikki@sanbi.ac.za>

FYI,

Christopher Jones has just published 
[http://bioinformatics.oxfordjournals.org/cgi/content/short/24/8/1102 an 
article in Bioinformatics] about his 
[http://search.cpan.org/perldoc?Microarray Microarray perl module] in CPAN.

(The text added into BioPerl wiki.)

	-Heikki


On Friday 26 January 2007 16:05:01 Chris Fields wrote:
> Don't know if it's worth it, but could the microarray package be
> modified so that it deals with data generated from or interacts
> directly with Bioconductor (i.e. maybe including some specialized
> bioperl-run set of classes to run Bioconductor tasks, return
> lightweight bioperl microarray classes)?  Allen pointed out in a
> previous post that Bioconductor is the best pick for certain tasks,
> while Perl excels at others:
>
> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>
> Might be nice if we could merge both strengths together in some way.
>
> chris
>
> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
> > On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
> >> Eh, there is some discussion activity on the list, but not much.  You
> >> are really better off moving to Bioconductor.
> >
> > Ok, thanks. I added that to the wiki page:
> >
> >     http://www.bioperl.org/wiki/Microarray_package
> >
> > j
> > seqlab.net
> > http://www.bioperl.org/wiki/User:Jhannah
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>


-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________


From pan.mueller at yahoo.de  Wed Apr 16 08:34:51 2008
From: pan.mueller at yahoo.de (=?iso-8859-1?Q?Peter_M=FCller?=)
Date: Wed, 16 Apr 2008 12:34:51 +0000 (GMT)
Subject: [Bioperl-l] load_seqdatabase.pl --pipeline
Message-ID: <297809.47580.qm@web28203.mail.ukl.yahoo.com>

Dear list,

I want to add gene symbols to UniGene clusters that are stored in a BioSQL database and lack this information.

One way is to write a post-update script:
my $adp = $db->get_object_adaptor('Bio::ClusterI');
my $pseq = $adp->find_by_primary_key(n);
$adp->remove($pseq);
$pseq->gene('symbol');
$adp->store($pseq);
$adp->commit();

O.k., this works (though I wonder why the cluster has to be removed first: bug or feature?)

Second way, perhaps:
Using the --pipeline option, but it looks like it is usable only for sequence objects (Bio::Factory::SequenceProcessorI), right?

regards
pan




From cjfields at uiuc.edu  Wed Apr 16 11:00:51 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 16 Apr 2008 10:00:51 -0500
Subject: [Bioperl-l] bioperl-microarray: status?
In-Reply-To: <200804161336.16879.heikki@sanbi.ac.za>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
Message-ID: <479BD5A4-9C9A-4733-889D-65942F24A7F3@uiuc.edu>

That would be worth looking into at some point, if anyone's interested  
(though it may be best to build a 'bridging' module).  Wonder if it  
uses BioConductor and, if not, how performance is vs BioConductor?

chris

On Apr 16, 2008, at 6:36 AM, Heikki Lehvaslaiho wrote:

> FYI,
>
> Christopher Jones has just published
> [http://bioinformatics.oxfordjournals.org/cgi/content/short/ 
> 24/8/1102 an
> article in Bioinformatics] about his
> [http://search.cpan.org/perldoc?Microarray Microarray perl module]  
> in CPAN.
>
> (The text added into BioPerl wiki.)
>
> 	-Heikki
>
>
> On Friday 26 January 2007 16:05:01 Chris Fields wrote:
>> Don't know if it's worth it, but could the microarray package be
>> modified so that it deals with data generated from or interacts
>> directly with Bioconductor (i.e. maybe including some specialized
>> bioperl-run set of classes to run Bioconductor tasks, return
>> lightweight bioperl microarray classes)?  Allen pointed out in a
>> previous post that Bioconductor is the best pick for certain tasks,
>> while Perl excels at others:
>>
>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>>
>> Might be nice if we could merge both strengths together in some way.
>>
>> chris
>>
>> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
>>> On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
>>>> Eh, there is some discussion activity on the list, but not much.   
>>>> You
>>>> are really better off moving to Bioconductor.
>>>
>>> Ok, thanks. I added that to the wiki page:
>>>
>>>    http://www.bioperl.org/wiki/Microarray_package
>>>
>>> j
>>> seqlab.net
>>> http://www.bioperl.org/wiki/User:Jhannah
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> -- 
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From j-keller2 at md.northwestern.edu  Wed Apr 16 12:12:27 2008
From: j-keller2 at md.northwestern.edu (Jacob Keller)
Date: Wed, 16 Apr 2008 11:12:27 -0500
Subject: [Bioperl-l] Finding seqs of given domain architecture
In-Reply-To: <200804161336.16879.heikki@sanbi.ac.za>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net><D6030075-C999-464B-A998-3C69346C7FB0@jays.net><AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
Message-ID: <B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>

Hello All,

I am new to this list and not sure this is the right forum, so please 
forgive me if this is not the right place to ask the following question: 
I am seeking to get all sequences that have a given domain architecture, or 
at least that contain two given domains. I have thought of a few ways to do 
this.

1. Blast/Psi-blast for each domain, then compare the results for common 
sequences between the two lists, and fetch those. I would need to write a 
(simple) script to do this, but would prefer not to re-invent the wheel.

2. Search with a paradigm sequence of desired architecture/domain 
composition, somehow tweaking the psiblast parameters to find only matches 
over the whole search sequence, thereby finding both desired domains. I am 
not sure how to tweak blast to do this, though.

3. Pfam has this capability, i.e. to show all domains with a given 
architecture, but it is difficult to get at the actual sequences or even a 
list of accession numbers.
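
The comparison step in approach 1 could be sketched roughly as follows. 
This is only a minimal illustration in plain Perl: the sub name 
common_accessions and the two hit lists are made up for the example, and 
it assumes the accessions have already been extracted from the two search 
reports by some other means.

```perl
use strict;
use warnings;

# Return the accessions present in both lists, in the order of the
# first list and without duplicates.
sub common_accessions {
    my ($first, $second) = @_;
    my %in_second = map { $_ => 1 } @$second;
    my %seen;
    return grep { $in_second{$_} && !$seen{$_}++ } @$first;
}

# Hypothetical hit lists, e.g. subject accessions taken from two
# separate search reports (one per domain).
my @hits_domain_a = qw(P12345 Q67890 O11111 P12345);
my @hits_domain_b = qw(Q67890 P99999 P12345);

my @both = common_accessions(\@hits_domain_a, \@hits_domain_b);
print join(',', @both), "\n";    # prints "P12345,Q67890"
```

The resulting list only says that both domains hit the same sequence; it 
says nothing about their order or copy number.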

Does anybody have any suggestions as to how optimally to get these seq's?

Thanks for your consideration,

Jacob

*******************************************
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
Dallos Laboratory
F. Searle 1-240
2240 Campus Drive
Evanston IL 60208
lab: 847.491.2438
cel: 773.608.9185
email: j-keller2 at northwestern.edu
*******************************************

----- Original Message ----- 
From: "Heikki Lehvaslaiho" <heikki at sanbi.ac.za>
To: <bioperl-l at lists.open-bio.org>
Cc: <allenday at ucla.edu>; "Chris Fields" <cjfields at uiuc.edu>; "Jay Hannah" 
<jay at jays.net>; <bioperl-l at bioperl.org>
Sent: Wednesday, April 16, 2008 6:36 AM
Subject: Re: [Bioperl-l] bioperl-microarray: status?


> FYI,
>
> Christopher Jones has just published
> [http://bioinformatics.oxfordjournals.org/cgi/content/short/24/8/1102 an
> article in Bioinformatics] about his
> [http://search.cpan.org/perldoc?Microarray Microarray perl module] in 
> CPAN.
>
> (The text added into BioPerl wiki.)
>
> -Heikki
>
>
> On Friday 26 January 2007 16:05:01 Chris Fields wrote:
>> Don't know if it's worth it, but could the microarray package be
>> modified so that it deals with data generated from or interacts
>> directly with Bioconductor (i.e. maybe including some specialized
>> bioperl-run set of classes to run Bioconductor tasks, return
>> lightweight bioperl microarray classes)?  Allen pointed out in a
>> previous post that Bioconductor is the best pick for certain tasks,
>> while Perl excels at others:
>>
>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>>
>> Might be nice if we could merge both strengths together in some way.
>>
>> chris
>>
>> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
>> > On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
>> >> Eh, there is some discussion activity on the list, but not much.  You
>> >> are really better off moving to Bioconductor.
>> >
>> > Ok, thanks. I added that to the wiki page:
>> >
>> >     http://www.bioperl.org/wiki/Microarray_package
>> >
>> > j
>> > seqlab.net
>> > http://www.bioperl.org/wiki/User:Jhannah
>> >
>> > _______________________________________________
>> > Bioperl-l mailing list
>> > Bioperl-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> -- 
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From frederic.romagne at gmail.com  Wed Apr 16 13:25:18 2008
From: frederic.romagne at gmail.com (=?ISO-8859-1?Q?Fr=E9d=E9ric_Romagn=E9?=)
Date: Wed, 16 Apr 2008 12:25:18 -0500
Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
Message-ID: <1208366718.19084.15.camel@kiss-laptop>

Hello,
I wrote a program that uses Bio::Index::GenBank and tested it under
UNIX, where it worked well.

But I have to run it under Windows, and there it does not seem to work.

Here is the problem : 

my $dbobj = Bio::Index::Abstract->new("Data/$db");
my $seq = $dbobj->get_Seq_by_acc($id);
print $seq->display_id."\n";

does not print the same identifier as $id! So I am not working on the
expected sequence...

I use the SVN sources on unix and the Perl package manager for
windows...

Thanks.


From cjfields at uiuc.edu  Wed Apr 16 13:52:59 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 16 Apr 2008 12:52:59 -0500
Subject: [Bioperl-l] Finding seqs of given domain architecture
In-Reply-To: <B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net><D6030075-C999-464B-A998-3C69346C7FB0@jays.net><AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
	<B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
Message-ID: <BAA878A0-94B4-481F-B01C-A12086FD41E3@uiuc.edu>

You can try CDART:

http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi?cmd=rps

There are probably other tools out there as well.

If you want to roll your own, you can use bioperl wrappers for all of  
these (Bio::Tools::Run::StandAloneBlast is in bioperl-live,  
Bio::Tools::Run::Hmmer in bioperl-run), tweaking the parameters as you  
see fit, and either parse while running them or store the file for  
parsing later using Bio::SearchIO.  Personally, I wouldn't go with (2)  
unless you are absolutely sure the domains are found only once per  
sequence, are spatially conserved, and don't overlap.  For instance,  
with many proteins you could have a domain structure like dom1-dom2,  
dom2-dom1, dom1-dom1-dom2, etc.
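
To illustrate the point about domain order and copy number, here is a 
minimal sketch in plain Perl. It assumes the domain hits per sequence 
have already been parsed out of the search reports; the hash layout 
(name, start), the sub name architecture and the sequence IDs are all 
hypothetical and not part of any BioPerl API.

```perl
use strict;
use warnings;

# Build an architecture string such as "dom1-dom1-dom2" from a list of
# domain hits, ordering the hits by their start position.
sub architecture {
    my (@hits) = @_;
    return join '-',
        map  { $_->{name} }
        sort { $a->{start} <=> $b->{start} } @hits;
}

# Hypothetical parsed hits: sequence ID => list of domain hits.
my %hits_by_seq = (
    seqA => [ { name => 'dom1', start => 10 },  { name => 'dom2', start => 150 } ],
    seqB => [ { name => 'dom2', start => 5 },   { name => 'dom1', start => 200 } ],
    seqC => [ { name => 'dom1', start => 120 }, { name => 'dom1', start => 8 },
              { name => 'dom2', start => 300 } ],
);

for my $id (sort keys %hits_by_seq) {
    print "$id\t", architecture(@{ $hits_by_seq{$id} }), "\n";
}
```

Selecting only the sequences whose string equals, say, 'dom1-dom2' then 
separates that architecture from dom2-dom1 or dom1-dom1-dom2. Overlapping 
hits are not handled in this sketch.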

If you just want accessions from Pfam's Stockholm format (which are  
UniProt, I believe) you can get at accessions using  
Bio::AlignIO::stockholm (using perl 5.10):

use Bio::AlignIO;
use feature 'say';

my $file = shift || die "Must pass file as argument\n";

my $in = Bio::AlignIO->new(-format => 'stockholm',
                            -file => $file);

while (my $aln = $in->next_aln) {
     my @accs;
     for my $seq ($aln->each_seq) {
         push @accs, $seq->accession_number;
     }
     say join(',', @accs);
}

chris

On Apr 16, 2008, at 11:12 AM, Jacob Keller wrote:

> Hello All,
>
> I am new to this list, so am not totally sure this is the right  
> forum, so please forgive if this is not the right place to ask the  
> following question: I am seeking to get all sequences that have a  
> given domain architecture, or at least that contain two given  
> domains. I have thought of a few ways to do this.
>
> 1. Blast/Psi-blast for each domain, then compare the results for  
> common sequences between the two lists, and fetch those. I would  
> need to write a (simple) script to do this, but would prefer not to  
> re-invent the wheel.
>
> 2. Search with a paradigm sequence of desired architecture/domain  
> composition, somehow tweaking the psiblast parameters to find only  
> matches over the whole search sequence, thereby finding both desired  
> domains. I am not sure how to tweak blast to do this, though.
>
> 3. Pfam has this capability, i.e. to show all domains with a given  
> architecture, but it is difficult to get at the actual sequences or  
> even a list of accession numbers.
>
> Does anybody have any suggestions as to how optimally to get these  
> seq's?
>
> Thanks for your consideration,
>
> Jacob
>
> *******************************************
> Jacob Pearson Keller
> Northwestern University
> Medical Scientist Training Program
> Dallos Laboratory
> F. Searle 1-240
> 2240 Campus Drive
> Evanston IL 60208
> lab: 847.491.2438
> cel: 773.608.9185
> email: j-keller2 at northwestern.edu
> *******************************************
>
> ----- Original Message ----- From: "Heikki Lehvaslaiho" <heikki at sanbi.ac.za 
> >
> To: <bioperl-l at lists.open-bio.org>
> Cc: <allenday at ucla.edu>; "Chris Fields" <cjfields at uiuc.edu>; "Jay  
> Hannah" <jay at jays.net>; <bioperl-l at bioperl.org>
> Sent: Wednesday, April 16, 2008 6:36 AM
> Subject: Re: [Bioperl-l] bioperl-microarray: status?
>
>
>> FYI,
>>
>> Christopher Jones has just published
>> [http://bioinformatics.oxfordjournals.org/cgi/content/short/ 
>> 24/8/1102 an
>> article in Bioinformatics] about his
>> [http://search.cpan.org/perldoc?Microarray Microarray perl module]  
>> in CPAN.
>>
>> (The text added into BioPerl wiki.)
>>
>> -Heikki
>>
>>
>> On Friday 26 January 2007 16:05:01 Chris Fields wrote:
>>> Don't know if it's worth it, but could the microarray package be
>>> modified so that it deals with data generated from or interacts
>>> directly with Bioconductor (i.e. maybe including some specialized
>>> bioperl-run set of classes to run Bioconductor tasks, return
>>> lightweight bioperl microarray classes)?  Allen pointed out in a
>>> previous post that Bioconductor is the best pick for certain tasks,
>>> while Perl excels at others:
>>>
>>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>>>
>>> Might be nice if we could merge both strengths together in some way.
>>>
>>> chris
>>>
>>> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
>>> > On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
>>> >> Eh, there is some discussion activity on the list, but not  
>>> much.  You
>>> >> are really better off moving to Bioconductor.
>>> >
>>> > Ok, thanks. I added that to the wiki page:
>>> >
>>> >     http://www.bioperl.org/wiki/Microarray_package
>>> >
>>> > j
>>> > seqlab.net
>>> > http://www.bioperl.org/wiki/User:Jhannah
>>> >
>>> > _______________________________________________
>>> > Bioperl-l mailing list
>>> > Bioperl-l at lists.open-bio.org
>>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>> Christopher Fields
>>> Postdoctoral Researcher
>>> Lab of Dr. Robert Switzer
>>> Dept of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>
>> -- 
>> ______ _/      _/ 
>> _____________________________________________________
>>     _/      _/
>>    _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>>   _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>>  _/  _/  _/  SANBI, South African National Bioinformatics Institute
>> _/  _/  _/  University of Western Cape, South Africa
>>    _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
>> ___ _/_/_/_/_/ 
>> ________________________________________________________
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From David.Messina at sbc.su.se  Wed Apr 16 14:23:27 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 16 Apr 2008 20:23:27 +0200
Subject: [Bioperl-l] Finding seqs of given domain architecture
In-Reply-To: <B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
	<B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
Message-ID: <628aabb70804161123s453bd96bqd2213b938dfdb3a2@mail.gmail.com>

Hey Jacob,

This forum is mostly geared toward the BioPerl software package rather than
general bioinformatics assistance.

That being said, I would recommend using Pfam's Sequence Search to determine
the domain content of your sequences and then simply looking at those which
have the same two domains of interest.

If there are more sequences matching this criterion than can be examined
manually, you could write up something (potentially using BioPerl) to then
look at the relative order and number of those domains in your sequences.

However, if these sequences have UniProt IDs, you can start with the domains
and Pfam will hand you a list of all the UniProt seqs having those domains.
On the Pfam website's main page, click on "Help" (right side of menu at the
top of the page) and then "Tools and Services" (left side menu).


Dave


From Russell.Smithies at agresearch.co.nz  Wed Apr 16 16:49:49 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 17 Apr 2008 08:49:49 +1200
Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
In-Reply-To: <1208366718.19084.15.camel@kiss-laptop>
References: <1208366718.19084.15.camel@kiss-laptop>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>

Did you check the format of your input file?
i.e. DOS or UNIX line endings?
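
To show why line endings can matter for an ID read from a file, here is a 
small sketch in plain Perl. Whether this is the cause of the problem 
reported is not known; the accession string and the sub name clean_id are 
made up for the example.

```perl
use strict;
use warnings;

# Strip a trailing newline whether it is LF (UNIX) or CRLF (DOS).
sub clean_id {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

# Hypothetical accession line as it would look in a DOS-format file.
my $dos_line = "AB000001\r\n";

my $chomped = $dos_line;
chomp $chomped;    # with the default $/ this removes "\n" only

printf "chomp only: %d characters\n", length $chomped;              # 9, "\r" is still there
printf "clean_id:   %d characters\n", length clean_id($dos_line);   # 8
```

An ID that still carries a trailing "\r" is a different string from the 
one stored in the index, so a lookup with it may fail or behave 
unexpectedly.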

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org]
> On Behalf Of Frédéric Romagné
> Sent: Thursday, 17 April 2008 5:25 a.m.
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
> 
> Hello,
> i made a program which use Bio::Index::GenBank and i tested it under
> unix, that worked well.
> 
> But i have to launch it under windows and it seems not to work on.
> 
> Here is the problem :
> 
> my $dbobj = Bio::Index::Abstract->new("Data/$db");
> my $seq = $dbobj->get_Seq_by_acc($id);
> print $seq->display_id."\n";
> 
> did not print the same number than $id !!! So i don't work on the
> sequence expected...
> 
> I use the SVN sources on unix and the Perl package manager for
> windows...
> 
> Thanks.
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From frederic.romagne at gmail.com  Wed Apr 16 17:39:07 2008
From: frederic.romagne at gmail.com (=?ISO-8859-1?Q?Fr=E9d=E9ric_Romagn=E9?=)
Date: Wed, 16 Apr 2008 16:39:07 -0500
Subject: [Bioperl-l] index::abstract on win and unix
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
References: <1208366718.19084.15.camel@kiss-laptop>
	<D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
Message-ID: <1208381947.16620.6.camel@kiss-laptop>

Well, if by input file you mean the database used, it was created
with Bio::Index::GenBank from a GenBank file from the NCBI FTP site.

$id is an accession number read from a file, but I chomp the line...

I am trying to install the svn version of bioperl under windows to see
if there is an improvement.

On Thursday 17 April 2008 at 08:49 +1200, Smithies, Russell wrote:
> Did you check the format of your input file?
> i.e. DOS or UNIX line endings?
> 
> > -----Original Message-----
> > From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org]
> > On Behalf Of Frédéric Romagné
> > Sent: Thursday, 17 April 2008 5:25 a.m.
> > To: bioperl-l at lists.open-bio.org
> > Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
> > 
> > Hello,
> > i made a program which use Bio::Index::GenBank and i tested it under
> > unix, that worked well.
> > 
> > But i have to launch it under windows and it seems not to work on.
> > 
> > Here is the problem :
> > 
> > my $dbobj = Bio::Index::Abstract->new("Data/$db");
> > my $seq = $dbobj->get_Seq_by_acc($id);
> > print $seq->display_id."\n";
> > 
> > did not print the same number than $id !!! So i don't work on the
> > sequence expected...
> > 
> > I use the SVN sources on unix and the Perl package manager for
> > windows...
> > 
> > Thanks.
> > 
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From hubert.gaynor at yahoo.com  Thu Apr 17 02:19:11 2008
From: hubert.gaynor at yahoo.com (Hubert Gaynor)
Date: Wed, 16 Apr 2008 23:19:11 -0700 (PDT)
Subject: [Bioperl-l] Can I use BLAST against a database like MySQL
Message-ID: <657734.41592.qm@web46008.mail.sp1.yahoo.com>

Hi,

As far as I know, the first thing to do before running a BLAST alignment is to run formatdb to construct a database. But I was wondering whether it is possible to construct the database with MySQL instead, which would probably make the BLAST search faster and the database management much easier?

Thanks!

Hubert.




From sdavis2 at mail.nih.gov  Thu Apr 17 06:36:32 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 17 Apr 2008 06:36:32 -0400
Subject: [Bioperl-l] Can I use BLAST against a database like MySQL
In-Reply-To: <657734.41592.qm@web46008.mail.sp1.yahoo.com>
References: <657734.41592.qm@web46008.mail.sp1.yahoo.com>
Message-ID: <264855a00804170336o2a2bcff9xfcb05a33bac4c8dc@mail.gmail.com>

On Thu, Apr 17, 2008 at 2:19 AM, Hubert Gaynor <hubert.gaynor at yahoo.com> wrote:
> Hi,
>
>  As far as I know, before using BLAST to do the alignment the first thing should be done is typing formatdb to construct a database. But I was wondering whether it is possible to construct a database with MySQL which probably will grant the BLAST search a higher speed and make the database management much easier?
>

formatdb is used to make a representation that can be used efficiently
by blast.  That representation already makes blast faster.  MySQL
can't be used for such things.  As for speeding up blast, if you have a
multiprocessor machine, you can take advantage of it by increasing the
number of processors blast uses.  Also, while blast is a very
versatile program, it is not the only alignment program available.
Depending on your needs, you could look at other programs such as blat
or gmap that can be 2-3 orders of magnitude faster than blast.

Sean


From stefan.kirov at bms.com  Thu Apr 17 09:40:29 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 09:40:29 -0400
Subject: [Bioperl-l] bioperl-db woes
Message-ID: <4807534D.80105@bms.com>

I'm having problems passing all the tests for bioperl-db. There are 2
distinct errors, first one:
Can't locate Bio/DB/BioSQL/RichSeqAdaptor.pm
   ***Which, by the way, is embedded deep inside several layers of eval, so I
am getting the actual error from the test:
    ***t/04swiss.........ok 3/52Can't locate object method "get_dbxrefs"
via package "Bio::Ontology::Term" at    
       
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
line 552, <GEN0> line 78.
       or
       ------------- EXCEPTION: Bio::Root::Exception -------------

    MSG: Annotation of class Bio::Annotation::Collection not
    type-mapped. Internal error?
    STACK: Error::throw
    STACK: Bio::Root::Root::throw
    /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
    STACK:
    Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
    STACK: Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
    STACK: Bio::DB::Persistent::PersistentObject::store
    Bio/DB/Persistent/PersistentObject.pm:271
    STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
    Bio/DB/BioSQL/SeqAdaptor.pm:224
    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
    STACK: Bio::DB::Persistent::PersistentObject::create
    Bio/DB/Persistent/PersistentObject.pm:244
    STACK: t/04swiss.t:36
    -----------------------------------------------------------

It turns out the adaptor is really not there???
My bioperl-db is from
dev.open-bio.org/home/svn-repositories/bioperl/bioperl-db/trunk
bioperl-db (revision 14661)
Is this module being deprecated (I am sure it is not), or is my download
incomplete?
The other problem was:
DBD::Oracle::st execute failed: ORA-02292: integrity constraint
(BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
ParamValues: :p1=9606] at
/home/kirovs/bioperl-db/blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
line 320.
not ok 76
# Test 76 got: <UNDEF> (t/02species.t at line 71)
I have not tried to debug this one....
Thanks!
Stefan


From stefan.kirov at bms.com  Thu Apr 17 10:18:30 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 10:18:30 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
Message-ID: <Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>


On Thu, 17 Apr 2008, Chris Fields wrote:

> The 'get_dbxrefs' problem looks related to recent changes I made when rolling 
> back the significant feature/annotation changes introduced just prior to the 
> 1.5 release, none of which were fully implemented.  I can check that one out. 
> Odd though; these passed for me, but I'm using MySQL, not Oracle.
get_dbxrefs is not the problem; I think the error message is misleading:
kirovs at horta:~/bioperl-db> grep get_dbxrefs 
/home/kirovs/bioperl-live/Bio/Ontology/Term.pm
            get_dbxrefs() instead, which handles both strings and DBLink
                       "Use get_dbxrefs() instead");
     $self->get_dbxrefs($context);
=head2 get_dbxrefs
  Title   : get_dbxrefs()
  Usage   : @ds = $term->get_dbxrefs();
sub get_dbxrefs {
} # get_dbxrefs
     my @old = $self->get_dbxrefs($context);
sub each_dblink {shift->throw("use of each_dblink() is deprecated; use 
get_dbxrefs() instead")}

So it is there.
In any case, I debugged this and tracked it down to the RichSeq adaptor
module being missing. It is not in the distro I downloaded, so I think this
is my problem. Why it is missing is a different question...
I looked at different repos (SVN, CVS, trunk, different tags) and I did 
not see RichSeq.pm. I am not sure what is going on. Perhaps Hilmar will be 
able to help when he is around.
Thanks for the help Chris.... 
Stefan

>
> You may want to make sure you are using bioperl-live and that there isn't an 
> older bioperl installation getting into the mix.
>
> chris
>
> On Apr 17, 2008, at 8:40 AM, Stefan Kirov wrote:
>
>> I'm having problems passing all the tests for bioperl-db. There are 2
>> distinct errors, first one:
>> Can't locate Bio/DB/BioSQL/RichSeqAdaptor.pm
>>  ***Which by the way is embed deep into several layers of eval, so I
>> am getting the actual error from the test:
>>   ***t/04swiss.........ok 3/52Can't locate object method "get_dbxrefs"
>> via package "Bio::Ontology::Term" at
>> 
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 78.
>>      or
>>      ------------- EXCEPTION: Bio::Root::Exception -------------
>>
>>   MSG: Annotation of class Bio::Annotation::Collection not
>>   type-mapped. Internal error?
>>   STACK: Error::throw
>>   STACK: Bio::Root::Root::throw
>>   /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>>   STACK:
>>   Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>>   Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>>   STACK: Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>>   Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>>   STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>   Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>   STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>>   Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>>   STACK: Bio::DB::Persistent::PersistentObject::store
>>   Bio/DB/Persistent/PersistentObject.pm:271
>>   STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>>   Bio/DB/BioSQL/SeqAdaptor.pm:224
>>   STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>   Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>   STACK: Bio::DB::Persistent::PersistentObject::create
>>   Bio/DB/Persistent/PersistentObject.pm:244
>>   STACK: t/04swiss.t:36
>>   -----------------------------------------------------------
>> 
>> It turns out the adaptor is really not there???
>> My bioperl-db is from
>> dev.open-bio.org/home/svn-repositories/bioperl/bioperl-db/trunk
>> bioperl-db (revision 14661)
>> Is this module being deprecated (I am sure it is not) my download
>> incomplete....?
>> The other problem was:
>> DBD::Oracle::st execute failed: ORA-02292: integrity constraint
>> (BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
>> OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
>> ParamValues: :p1=9606] at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
>> line 320.
>> not ok 76
>> # Test 76 got: <UNDEF> (t/02species.t at line 71)
>> I have not tried to debug this one....
>> Thanks!
>> Stefan
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>


From cjfields at uiuc.edu  Thu Apr 17 09:59:57 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 17 Apr 2008 08:59:57 -0500
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <4807534D.80105@bms.com>
References: <4807534D.80105@bms.com>
Message-ID: <82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>

The 'get_dbxrefs' problem looks related to recent changes I made when
rolling back the significant feature/annotation changes introduced
just prior to the 1.5 release, none of which were fully implemented.  I
can check that one out.  Odd though; these passed for me, but I'm
using MySQL, not Oracle.

You may want to make sure you are using bioperl-live and that there  
isn't an older bioperl installation getting into the mix.

chris

On Apr 17, 2008, at 8:40 AM, Stefan Kirov wrote:

> I'm having problems passing all the tests for bioperl-db. There are 2
> distinct errors, first one:
> Can't locate Bio/DB/BioSQL/RichSeqAdaptor.pm
>   ***Which by the way is embed deep into several layers of eval, so I
> am getting the actual error from the test:
>    ***t/04swiss.........ok 3/52Can't locate object method  
> "get_dbxrefs"
> via package "Bio::Ontology::Term" at
>
> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
> line 552, <GEN0> line 78.
>       or
>       ------------- EXCEPTION: Bio::Root::Exception -------------
>
>    MSG: Annotation of class Bio::Annotation::Collection not
>    type-mapped. Internal error?
>    STACK: Error::throw
>    STACK: Bio::Root::Root::throw
>    /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>    STACK:
>    Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>    STACK: Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>    STACK: Bio::DB::Persistent::PersistentObject::store
>    Bio/DB/Persistent/PersistentObject.pm:271
>    STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>    Bio/DB/BioSQL/SeqAdaptor.pm:224
>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>    STACK: Bio::DB::Persistent::PersistentObject::create
>    Bio/DB/Persistent/PersistentObject.pm:244
>    STACK: t/04swiss.t:36
>    -----------------------------------------------------------
>
> It turns out the adaptor is really not there???
> My bioperl-db is from
> dev.open-bio.org/home/svn-repositories/bioperl/bioperl-db/trunk
> bioperl-db (revision 14661)
> Is this module being deprecated (I am sure it is not) my download
> incomplete....?
> The other problem was:
> DBD::Oracle::st execute failed: ORA-02292: integrity constraint
> (BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
> OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
> ParamValues: :p1=9606] at
> /home/kirovs/bioperl-db/blib/lib/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm
> line 320.
> not ok 76
> # Test 76 got: <UNDEF> (t/02species.t at line 71)
> I have not tried to debug this one....
> Thanks!
> Stefan

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From stefan.kirov at bms.com  Thu Apr 17 10:52:32 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 10:52:32 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <9ECDEB39-95F3-4A94-9AF7-FFEBBDEFF0FA@gmx.net>
References: <4807534D.80105@bms.com>
	<9ECDEB39-95F3-4A94-9AF7-FFEBBDEFF0FA@gmx.net>
Message-ID: <Pine.WNT.4.64.0804171052070.2732@A161887.one.ads.bms.com>

That is correct and I assumed I should not be concerned with this error.
Thanks
Stefan

On Thu, 17 Apr 2008, Hilmar Lapp wrote:

>
> On Apr 17, 2008, at 9:40 AM, Stefan Kirov wrote:
>> The other problem was:
>> DBD::Oracle::st execute failed: ORA-02292: integrity constraint
>> (BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
>> OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
>> ParamValues: :p1=9606] at
>
>
> This sounds like you are running the tests against a non-empty database?
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>


From hlapp at gmx.net  Thu Apr 17 10:47:58 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 17 Apr 2008 10:47:58 -0400
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
Message-ID: <2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>


On Apr 17, 2008, at 10:18 AM, Stefan Kirov wrote:
> In any case I debugged and tracked that down to the RichSeq adaptor  
> module missing.


That almost can't be the problem. Every Bio::Seq::RichSeq is-a  
Bio::Seq and a SeqAdaptor is present.

I'm afraid it gets stuck somewhere else and frankly I didn't see the  
RichSeqAdaptor failing to load in your stack trace:

>        ------------- EXCEPTION: Bio::Root::Exception -------------
>
>     MSG: Annotation of class Bio::Annotation::Collection not
>     type-mapped. Internal error?
>     STACK: Error::throw
>     STACK: Bio::Root::Root::throw
>     /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>     STACK:
>     Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>     Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>     STACK:  
> Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>     Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>     STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>     Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>     STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>     Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>     STACK: Bio::DB::Persistent::PersistentObject::store
>     Bio/DB/Persistent/PersistentObject.pm:271
>     STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>     Bio/DB/BioSQL/SeqAdaptor.pm:224
>     STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>     Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>     STACK: Bio::DB::Persistent::PersistentObject::create
>     Bio/DB/Persistent/PersistentObject.pm:244
>     STACK: t/04swiss.t:36
>     -----------------------------------------------------------

What that tells me is that when bioperl-db tries to store the  
annotation bundle of the (SwissProt) sequence, one of the annotations  
that it encounters is of type Bio::Annotation::Collection. At present  
bioperl-db doesn't know what to do with it; i.e., bioperl-db can't  
yet handle hierarchical annotation collections (collections within  
collections).

I believe this is due to recent changes in how the GN line is parsed
in BioPerl - Chris, does this ring the right bell? I thought, though,
that you had built in a method that would allow flattening out?

It's worth noting that BioSQL itself can't really represent nested  
annotation collections other than by using ontology terms and their  
hierarchy, which at present I think isn't really appropriate, but I  
have to think through the issue more. In other words, in BioSQL you  
can't directly tie together a bunch of qualifier value pairs into a  
"bag" and then nest this bag within another. The way to make this  
work with the current schema is to flatten out the nesting.
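
To illustrate what "flattening out the nesting" could look like, here is a
minimal sketch. It uses plain Perl hashes rather than real
Bio::Annotation::Collection objects, and the function name flatten_pairs,
the dotted tag names and the sample data are all assumptions made up for
the example; it is not how bioperl-db or BioSQL actually does it.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a nested hash of annotations into a flat list
# of [tag, value] pairs, joining the nesting levels into one dotted tag.
sub flatten_pairs {
    my ($node, $prefix) = @_;
    my @pairs;
    for my $tag (sort keys %$node) {
        my $name  = defined $prefix ? "$prefix.$tag" : $tag;
        my $value = $node->{$tag};
        if (ref $value eq 'HASH') {
            # A "bag" nested inside another bag: recurse and keep the path.
            push @pairs, flatten_pairs($value, $name);
        }
        else {
            push @pairs, [ $name, $value ];
        }
    }
    return @pairs;
}

# Made-up sample: a gene-name bag nested inside the top-level annotations.
my %annotation = (
    gene_name => { Name => 'ABC1', Synonyms => 'ABC-1' },
    keyword   => 'kinase',
);

print join("\t", @$_), "\n" for flatten_pairs(\%annotation);
```

Each resulting pair is a simple qualifier/value row, which is the kind of
thing a flat schema can store without needing to represent a bag within a
bag.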

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Thu Apr 17 10:48:52 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 17 Apr 2008 10:48:52 -0400
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <4807534D.80105@bms.com>
References: <4807534D.80105@bms.com>
Message-ID: <9ECDEB39-95F3-4A94-9AF7-FFEBBDEFF0FA@gmx.net>


On Apr 17, 2008, at 9:40 AM, Stefan Kirov wrote:
> The other problem was:
> DBD::Oracle::st execute failed: ORA-02292: integrity constraint
> (BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
> OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
> ParamValues: :p1=9606] at


This sounds like you are running the tests against a non-empty database?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From stefan.kirov at bms.com  Thu Apr 17 11:28:42 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 11:28:42 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
Message-ID: <Pine.WNT.4.64.0804171052430.2732@A161887.one.ads.bms.com>

Hilmar,
I think I saw what happens with this adaptor:
In Bio::DB::BioSQL::DBAdaptor::_load_object_adaptor (called from
create_persistent) there is a request to load this module:
Bio/DB/BioSQL/RichSeqAdaptor.pm
There is no such module... This always fails, but since the load is wrapped
in an eval, no actual error is reported. Perhaps this is a leftover...?
This got me fooled...
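
The general pattern can be sketched as follows. This is not the real
_load_object_adaptor code; try_load_class is a hypothetical helper written
only to show how a require inside an eval hides a "Can't locate ..."
failure unless the caller looks at $@.

```perl
use strict;
use warnings;

# Hypothetical helper: try to load a class by name and return
# (success flag, error text). The error is captured, not thrown.
sub try_load_class {
    my ($class) = @_;
    (my $file = "$class.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    my $ok = eval { require $file; 1 };
    return $ok ? (1, '') : (0, $@);
}

for my $class (qw(Bio::DB::BioSQL::RichSeqAdaptor Bio::DB::BioSQL::SeqAdaptor)) {
    my ($ok, $err) = try_load_class($class);
    if ($ok) {
        print "loaded ", $class, "\n";
    }
    else {
        # Without this branch the failed load would pass silently.
        my ($first_line) = split /\n/, $err;
        print "could not load ", $class, ": ", $first_line, "\n";
    }
}
```

If the loader tries a class-specific adaptor first and then falls back to
one for a parent class (an assumption here, not something checked against
the bioperl-db source), a swallowed failure for RichSeqAdaptor would be
expected and harmless.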

I guess Chris could be right-
  Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key is 
being passed Bio::Annotation::Collection as a value for $obj->obj(). Or 
recursing too far?
Anyway, I am just guessing here- I do not know the architecture of 
bioperl-db...
Thanks again for the help...
Stefan

  On Thu, 17 Apr 2008, Hilmar Lapp wrote:

>
> On Apr 17, 2008, at 10:18 AM, Stefan Kirov wrote:
>> In any case I debugged and tracked that down to the RichSeq adaptor module 
>> missing.
>
>
> That almost can't be the problem. Every Bio::Seq::RichSeq is-a Bio::Seq and a 
> SeqAdaptor is present.
>
> I'm afraid it gets stuck somewhere else and frankly I didn't see the 
> RichSeqAdaptor failing to load in your stack trace:
>
>>       ------------- EXCEPTION: Bio::Root::Exception -------------
>>
>>    MSG: Annotation of class Bio::Annotation::Collection not
>>    type-mapped. Internal error?
>>    STACK: Error::throw
>>    STACK: Bio::Root::Root::throw
>>    /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>>    STACK:
>>    Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>>    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>>    STACK: Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>>    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>>    STACK: Bio::DB::Persistent::PersistentObject::store
>>    Bio/DB/Persistent/PersistentObject.pm:271
>>    STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>>    Bio/DB/BioSQL/SeqAdaptor.pm:224
>>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>    STACK: Bio::DB::Persistent::PersistentObject::create
>>    Bio/DB/Persistent/PersistentObject.pm:244
>>    STACK: t/04swiss.t:36
>>    -----------------------------------------------------------
>
> What that tells me is that when bioperl-db tries to store the annotation 
> bundle of the (SwissProt) sequence, one of the annotations that it encounters 
> is of type Bio::Annotation::Collection. At present bioperl-db doesn't know 
> what to do with it; i.e., bioperl-db can't yet handle hierarchical annotation 
> collections (collections within collections).
>
> I believe this is due to recent changes in how the GN line is parsed in 
> BioPerl - Chris does this ring the right bell? I thought though you had built 
> in a method would allow flattening out?
>
> It's worth noting that BioSQL itself can't really represent nested annotation 
> collections other than by using ontology terms and their hierarchy, which at 
> present I think isn't really appropriate, but I have to think through the 
> issue more. In other words, in BioSQL you can't directly tie together a bunch 
> of qualifier value pairs into a "bag" and then nest this bag within another. 
> The way to make this work with the current schema is to flatten out the 
> nesting.
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>


From cjfields at uiuc.edu  Thu Apr 17 12:26:41 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 17 Apr 2008 11:26:41 -0500
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
Message-ID: <AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>


On Apr 17, 2008, at 9:47 AM, Hilmar Lapp wrote:

>
> On Apr 17, 2008, at 10:18 AM, Stefan Kirov wrote:
>> In any case I debugged and tracked that down to the RichSeq adaptor  
>> module missing.
>
>
> That almost can't be the problem. Every Bio::Seq::RichSeq is-a  
> Bio::Seq and a SeqAdaptor is present.
>
> I'm afraid it gets stuck somewhere else and frankly I didn't see the  
> RichSeqAdaptor failing to load in your stack trace:
>
>>       ------------- EXCEPTION: Bio::Root::Exception -------------
>>
>>    MSG: Annotation of class Bio::Annotation::Collection not
>>    type-mapped. Internal error?
>>    STACK: Error::throw
>>    STACK: Bio::Root::Root::throw
>>    /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>>    STACK:
>>    Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>>    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>>    STACK:  
>> Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>>    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>>    STACK: Bio::DB::Persistent::PersistentObject::store
>>    Bio/DB/Persistent/PersistentObject.pm:271
>>    STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>>    Bio/DB/BioSQL/SeqAdaptor.pm:224
>>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>    STACK: Bio::DB::Persistent::PersistentObject::create
>>    Bio/DB/Persistent/PersistentObject.pm:244
>>    STACK: t/04swiss.t:36
>>    -----------------------------------------------------------
>
> What that tells me is that when bioperl-db tries to store the  
> annotation bundle of the (SwissProt) sequence, one of the  
> annotations that it encounters is of type  
> Bio::Annotation::Collection. At present bioperl-db doesn't know what  
> to do with it; i.e., bioperl-db can't yet handle hierarchical  
> annotation collections (collections within collections).
>
> I believe this is due to recent changes in how the GN line is parsed  
> in BioPerl - Chris does this ring the right bell? I thought though  
> you had built in a method would allow flattening out

This appears to be using an older bioperl-live checkout, one where  
Heikki changed GN parsing to use a nested Annotation::Collection.  I  
reverted that back in a later commit to svn specifically b/c of  
bioperl-db problems.  bioperl-live's swiss.pm now uses a new subclass  
of Bio::Annotation::SimpleValue (Bio::Annotation::TagTree) that  
represents nested values via Data::Stag's itext output (we can change  
that to alternatives if needed).

Here are the last few relevant revisions in bioperl-live's main trunk  
(mine is the latest):

------------------------------------------------------------------------
r14562 | cjfields | 2008-02-28 08:30:05 -0600 (Thu, 28 Feb 2008) | 1  
line

bug 1825: updating swiss.pm/tests to try out TagTree (passes all  
tests).  Need to update Handler.t and related modules still...
------------------------------------------------------------------------
r14541 | heikki | 2008-02-25 00:10:48 -0600 (Mon, 25 Feb 2008) | 1 line

documentation for the GN line parsing and management
------------------------------------------------------------------------
r14538 | heikki | 2008-02-23 08:48:23 -0600 (Sat, 23 Feb 2008) | 1 line

GN (Gene Name) line parsing rewrite. Breaks backward compatibility.  
Can now deal with >1 gene per entry and four categories of names per  
gene. Parses old style syntax (...OR ... OR ... ) into one gene name  
and synonyms for each gene. Docs to follow.

....

I just updated all code from dev and reran bioperl-db tests w/o  
problems.  Maybe someone else could do the same to see what happens?

> It's worth noting that BioSQL itself can't really represent nested  
> annotation collections other than by using ontology terms and their  
> hierarchy, which at present I think isn't really appropriate, but I  
> have to think through the issue more. In other words, in BioSQL you  
> can't directly tie together a bunch of qualifier value pairs into a  
> "bag" and then nest this bag within another. The way to make this  
> work with the current schema is to flatten out the nesting.
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Might be worth looking into for a future BioSQL release, but we have a  
decent workaround in place for now, as long as it works cross-platform  
and cross-RDB.

chris


From stefan.kirov at bms.com  Thu Apr 17 12:40:14 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 12:40:14 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
	<AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
Message-ID: <Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>

Hilmar,
sorry, I missed the part after the stack trace... In any case this is
still a problem for me after I updated bioperl-live.
I see this with a number of other tests:
t/04swiss.........ok 3/52Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 78.
t/04swiss.........dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 6-52
         Failed 47/52 tests, 9.62% okay
t/05seqfeature....ok 4/48Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 72.
t/05seqfeature....FAILED tests 9-48
         Failed 40/48 tests, 16.67% okay
t/06comment.......ok
t/07dblink........ok
t/08genbank.......ok
t/09fuzzy2........ok
t/10ensembl.......ok 1/15Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 1420.
t/10ensembl.......dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 3-15
         Failed 13/15 tests, 13.33% okay
t/11locuslink.....ok 4/110Can't locate object method "get_dbxrefs" via 
package "Bio::Annotation::OntologyTerm" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 1.
t/11locuslink.....dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 5-110
         Failed 106/110 tests, 3.64% okay
t/12ontology......ok 1/738Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::GOterm" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 98.
t/12ontology......dubious
         Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 5-738
         Failed 734/738 tests, 0.54% okay
t/13remove........ok 2/59Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 145.
t/13remove........FAILED tests 11-59
         Failed 49/59 tests, 16.95% okay
t/14query.........ok
t/15cluster.......ok 3/160Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 1.
t/15cluster.......dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 6-160
         Failed 155/160 tests, 3.12% okay
t/16obda..........ok

On Thu, 17 Apr 2008, Chris Fields wrote:

>
> On Apr 17, 2008, at 9:47 AM, Hilmar Lapp wrote:
>
>> 
>> On Apr 17, 2008, at 10:18 AM, Stefan Kirov wrote:
>>> In any case I debugged and tracked that down to the RichSeq adaptor module 
>>> missing.
>> 
>> 
>> That almost can't be the problem. Every Bio::Seq::RichSeq is-a Bio::Seq and 
>> a SeqAdaptor is present.
>> 
>> I'm afraid it gets stuck somewhere else and frankly I didn't see the 
>> RichSeqAdaptor failing to load in your stack trace:
>>
>>>      ------------- EXCEPTION: Bio::Root::Exception -------------
>>>
>>>   MSG: Annotation of class Bio::Annotation::Collection not
>>>   type-mapped. Internal error?
>>>   STACK: Error::throw
>>>   STACK: Bio::Root::Root::throw
>>>   /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>>>   STACK:
>>>   Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>>>   Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>>>   STACK: Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>>>   Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>>>   STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>>   Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>>   STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>>>   Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>>>   STACK: Bio::DB::Persistent::PersistentObject::store
>>>   Bio/DB/Persistent/PersistentObject.pm:271
>>>   STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>>>   Bio/DB/BioSQL/SeqAdaptor.pm:224
>>>   STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>>   Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>>   STACK: Bio::DB::Persistent::PersistentObject::create
>>>   Bio/DB/Persistent/PersistentObject.pm:244
>>>   STACK: t/04swiss.t:36
>>>   -----------------------------------------------------------
>> 
>> What that tells me is that when bioperl-db tries to store the annotation 
>> bundle of the (SwissProt) sequence, one of the annotations that it 
>> encounters is of type Bio::Annotation::Collection. At present bioperl-db 
>> doesn't know what to do with it; i.e., bioperl-db can't yet handle 
>> hierarchical annotation collections (collections within collections).
>> 
>> I believe this is due to recent changes in how the GN line is parsed in 
>> BioPerl - Chris does this ring the right bell? I thought though you had 
>> built in a method would allow flattening out
>
> This appears to be using an older bioperl-live checkout, one where Heikki 
> changed GN parsing to use a nested Annotation::Collection.  I reverted that 
> back in a later commit to svn specifically b/c of bioperl-db problems. 
> bioperl-live's swiss.pm now uses a new subclass of 
> Bio::Annotation::SimpleValue (Bio::Annotation::TagTree) that represents 
> nested values via Data::Stag's itext output (we can change that to 
> alternatives if needed).
>
> Here are the last few relevant revisions in bioperl-live's main trunk (mine 
> is the latest):
>
> ------------------------------------------------------------------------
> r14562 | cjfields | 2008-02-28 08:30:05 -0600 (Thu, 28 Feb 2008) | 1 line
>
> bug 1825: updating swiss.pm/tests to try out TagTree (passes all tests). 
> Need to update Handler.t and related modules still...
> ------------------------------------------------------------------------
> r14541 | heikki | 2008-02-25 00:10:48 -0600 (Mon, 25 Feb 2008) | 1 line
>
> documentation for the GN line parsing and management
> ------------------------------------------------------------------------
> r14538 | heikki | 2008-02-23 08:48:23 -0600 (Sat, 23 Feb 2008) | 1 line
>
> GN (Gene Name) line parsing rewrite. Breaks backward compatibility. Can now 
> deal with >1 gene per entry and four categories of names per gene. Parses old 
> style syntax (...OR ... OR ... ) into one gene name and synonyms for each 
> gene. Docs to follow.
>
> ....
>
> I just updated all code from dev and reran bioperl-db tests w/o problems. 
> Maybe someone else could do the same to see what happens?
>
>> It's worth noting that BioSQL itself can't really represent nested 
>> annotation collections other than by using ontology terms and their 
>> hierarchy, which at present I think isn't really appropriate, but I have to 
>> think through the issue more. In other words, in BioSQL you can't directly 
>> tie together a bunch of qualifier value pairs into a "bag" and then nest 
>> this bag within another. The way to make this work with the current schema 
>> is to flatten out the nesting.
>>
>> 	-hilmar
>> -- 
>> ===========================================================
>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>> ===========================================================
>
> Might be worth looking into for a future BioSQL release, but we have a decent 
> workaround in place for now, as long as it works cross-platform and 
> cross-RDB.
>
> chris
>


From cjfields at uiuc.edu  Thu Apr 17 13:06:39 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 17 Apr 2008 12:06:39 -0500
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
	<AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
	<Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>
Message-ID: <C7A53063-2126-40E2-8A79-BED49D7FE98A@uiuc.edu>

Stefan,

'get_dbxrefs' was introduced in bioperl-live a while back during the  
feature/annotation rollback detailed here:

http://www.bioperl.org/wiki/Feature_Annotation_rollback

I still think this is an interfering old bioperl (and maybe bioperl-db)
installation causing the problems; I had similar issues at one point and
had to find and remove the old installation.  It might be worth (1)
checking 'perldoc -l Bio::Root::Root', which will give the location of
the Bio::Root::Root in the lib path being used, and (2) using
'./Build install uninst=1' to remove any old bioperl/bioperl-db
installations.
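
The same check can be done from inside Perl, which also shows whether the
copy of Bio::Ontology::Term that actually gets loaded has the method. This
is only a sketch; module_location is a hypothetical helper name, and the
script should be run with the same PERL5LIB / -I settings as the tests.

```perl
use strict;
use warnings;

# Hypothetical helper: return the file a module was loaded from,
# or undef if it cannot be loaded at all.
sub module_location {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? $INC{$file} : undef;
}

for my $module (qw(Bio::Root::Root Bio::Ontology::Term)) {
    my $path = module_location($module);
    printf "%-22s %s\n", $module, defined $path ? $path : '(not found in @INC)';
}

# Does the Bio::Ontology::Term that perl picks up provide get_dbxrefs?
if (module_location('Bio::Ontology::Term')) {
    print "get_dbxrefs: ",
        (Bio::Ontology::Term->can('get_dbxrefs') ? "present" : "MISSING"), "\n";
}
```

If the printed path points somewhere other than the bioperl-live checkout,
an older installation is probably shadowing it.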

chris

On Apr 17, 2008, at 11:40 AM, Stefan Kirov wrote:

> Hilmar,
> sorry, I missed the part after the stack trace... In any case this  
> is still problem for me after I updated bioperl-live.
> I see this with a number of other tests:
> t/04swiss.........ok 3/52Can't locate object method "get_dbxrefs"  
> via package "Bio::Ontology::Term" at /home/kirovs/bioperl-db/blib/ 
> lib/Bio/DB/Persistent/PersistentObject.pm line 552, <GEN0> line 78.
> t/04swiss.........dubious
>        Test returned status 2 (wstat 512, 0x200)
> DIED. FAILED tests 6-52
>        Failed 47/52 tests, 9.62% okay
> t/05seqfeature....ok 4/48Can't locate object method "get_dbxrefs"  
> via package "Bio::Ontology::Term" at /home/kirovs/bioperl-db/blib/ 
> lib/Bio/DB/Persistent/PersistentObject.pm line 552, <GEN0> line 72.
> t/05seqfeature....FAILED tests 9-48
>        Failed 40/48 tests, 16.67% okay
> t/06comment.......ok
> t/07dblink........ok
> t/08genbank.......ok
> t/09fuzzy2........ok
> t/10ensembl.......ok 1/15Can't locate object method "get_dbxrefs"  
> via package "Bio::Ontology::Term" at /home/kirovs/bioperl-db/blib/ 
> lib/Bio/DB/Persistent/PersistentObject.pm line 552, <GEN0> line 1420.
> t/10ensembl.......dubious
>        Test returned status 2 (wstat 512, 0x200)
> DIED. FAILED tests 3-15
>        Failed 13/15 tests, 13.33% okay
> t/11locuslink.....ok 4/110Can't locate object method "get_dbxrefs"  
> via package "Bio::Annotation::OntologyTerm" at /home/kirovs/bioperl- 
> db/blib/lib/Bio/DB/Persistent/PersistentObject.pm line 552, <GEN0>  
> line 1.
> t/11locuslink.....dubious
>        Test returned status 2 (wstat 512, 0x200)
> DIED. FAILED tests 5-110
>        Failed 106/110 tests, 3.64% okay
> t/12ontology......ok 1/738Can't locate object method "get_dbxrefs"  
> via package "Bio::Ontology::GOterm" at /home/kirovs/bioperl-db/blib/ 
> lib/Bio/DB/Persistent/PersistentObject.pm line 552, <GEN0> line 98.
> t/12ontology......dubious
>        Test returned status 255 (wstat 65280, 0xff00)
> DIED. FAILED tests 5-738
>        Failed 734/738 tests, 0.54% okay
> t/13remove........ok 2/59Can't locate object method "get_dbxrefs"  
> via package "Bio::Ontology::Term" at /home/kirovs/bioperl-db/blib/ 
> lib/Bio/DB/Persistent/PersistentObject.pm line 552, <GEN0> line 145.
> t/13remove........FAILED tests 11-59
>        Failed 49/59 tests, 16.95% okay
> t/14query.........ok
> t/15cluster.......ok 3/160Can't locate object method "get_dbxrefs"  
> via package "Bio::Ontology::Term" at /home/kirovs/bioperl-db/blib/ 
> lib/Bio/DB/Persistent/PersistentObject.pm line 552, <GEN0> line 1.
> t/15cluster.......dubious
>        Test returned status 2 (wstat 512, 0x200)
> DIED. FAILED tests 6-160
>        Failed 155/160 tests, 3.12% okay
> t/16obda..........ok
>
> On Thu, 17 Apr 2008, Chris Fields wrote:
>
>>
>> On Apr 17, 2008, at 9:47 AM, Hilmar Lapp wrote:
>>
>>> On Apr 17, 2008, at 10:18 AM, Stefan Kirov wrote:
>>>> In any case I debugged and tracked that down to the RichSeq  
>>>> adaptor module missing.
>>> That almost can't be the problem. Every Bio::Seq::RichSeq is-a  
>>> Bio::Seq and a SeqAdaptor is present.
>>> I'm afraid it gets stuck somewhere else and frankly I didn't see  
>>> the RichSeqAdaptor failing to load in your stack trace:
>>>
>>>>     ------------- EXCEPTION: Bio::Root::Exception -------------
>>>>
>>>>  MSG: Annotation of class Bio::Annotation::Collection not
>>>>  type-mapped. Internal error?
>>>>  STACK: Error::throw
>>>>  STACK: Bio::Root::Root::throw
>>>>  /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>>>>  STACK:
>>>>  Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>>>>  Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>>>>  STACK:  
>>>> Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>>>>  Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>>>>  STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>>>  Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>>>  STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>>>>  Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>>>>  STACK: Bio::DB::Persistent::PersistentObject::store
>>>>  Bio/DB/Persistent/PersistentObject.pm:271
>>>>  STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>>>>  Bio/DB/BioSQL/SeqAdaptor.pm:224
>>>>  STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>>>  Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>>>  STACK: Bio::DB::Persistent::PersistentObject::create
>>>>  Bio/DB/Persistent/PersistentObject.pm:244
>>>>  STACK: t/04swiss.t:36
>>>>  -----------------------------------------------------------
>>> What that tells me is that when bioperl-db tries to store the  
>>> annotation bundle of the (SwissProt) sequence, one of the  
>>> annotations that it encounters is of type  
>>> Bio::Annotation::Collection. At present bioperl-db doesn't know  
>>> what to do with it; i.e., bioperl-db can't yet handle hierarchical  
>>> annotation collections (collections within collections).
>>> I believe this is due to recent changes in how the GN line is  
>>> parsed in BioPerl - Chris does this ring the right bell? I thought  
>>> though you had built in a method would allow flattening out
>>
>> This appears to be using an older bioperl-live checkout, one where  
>> Heikki changed GN parsing to use a nested Annotation::Collection.   
>> I reverted that back in a later commit to svn specifically b/c of  
>> bioperl-db problems. bioperl-live's swiss.pm now uses a new  
>> subclass of Bio::Annotation::SimpleValue (Bio::Annotation::TagTree)  
>> that represents nested values via Data::Stag's itext output (we can  
>> change that to alternatives if needed).
>>
>> Here are the last few relevant revisions in bioperl-live's main  
>> trunk (mine is the latest):
>>
>> ------------------------------------------------------------------------
>> r14562 | cjfields | 2008-02-28 08:30:05 -0600 (Thu, 28 Feb 2008) |  
>> 1 line
>>
>> bug 1825: updating swiss.pm/tests to try out TagTree (passes all  
>> tests). Need to update Handler.t and related modules still...
>> ------------------------------------------------------------------------
>> r14541 | heikki | 2008-02-25 00:10:48 -0600 (Mon, 25 Feb 2008) | 1  
>> line
>>
>> documentation for the GN line parsing and management
>> ------------------------------------------------------------------------
>> r14538 | heikki | 2008-02-23 08:48:23 -0600 (Sat, 23 Feb 2008) | 1  
>> line
>>
>> GN (Gene Name) line parsing rewrite. Breaks backward compatibility.  
>> Can now deal with >1 gene per entry and four categories of names  
>> per gene. Parses old style syntax (...OR ... OR ... ) into one gene  
>> name and synonyms for each gene. Docs to follow.
>>
>> ....
>>
>> I just updated all code from dev and reran bioperl-db tests w/o  
>> problems. Maybe someone else could do the same to see what happens?
>>
>>> It's worth noting that BioSQL itself can't really represent nested  
>>> annotation collections other than by using ontology terms and  
>>> their hierarchy, which at present I think isn't really  
>>> appropriate, but I have to think through the issue more. In other  
>>> words, in BioSQL you can't directly tie together a bunch of  
>>> qualifier value pairs into a "bag" and then nest this bag within  
>>> another. The way to make this work with the current schema is to  
>>> flatten out the nesting.
>>>
>>> 	-hilmar
>>> -- 
>>> ===========================================================
>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>> ===========================================================
>>
>> Might be worth looking into for a future BioSQL release, but we  
>> have a decent workaround in place for now, as long as it works  
>> cross-platform and cross-RDB.
>>
>> chris
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From stefan.kirov at bms.com  Thu Apr 17 13:52:22 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 13:52:22 -0400
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <C7A53063-2126-40E2-8A79-BED49D7FE98A@uiuc.edu>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
	<AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
	<Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>
	<C7A53063-2126-40E2-8A79-BED49D7FE98A@uiuc.edu>
Message-ID: <48078E56.9000404@bms.com>

Chris Fields wrote:
> Stefan,
>
> 'get_dbxrefs' was introduced in bioperl-live a while back during the
> feature/annotation rollback detailed here:
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback
>
Chris was right!
> I still think this is an interfering old bioperl (and maybe
> bioperl-db) installation causing the problems; I had similar issues at
> one point and had to find and remove the old installation.  It might
> be worth (1) checking 'perldoc -l Bio::Root::Root',
This is the first thing I did and it seemed fine from the command line.
So I checked out a fresh copy (vs. updating) and set PERL5LIB to the
minimum necessary (Build changes @INC), which is
/home/kirovs/bioperl-db/bplive:/stf/sysdev/perl/newlib/perl/lib/5.8/ia64-linux-multi/
(/home/kirovs/bioperl-db/bplive being the fresh copy and the other
having Module::Build, etc., but definitely no bioperl).
This fixed the problem. I still do not see where the old module came
from, but that was a really good guess.
Thanks
Stefan
> which will give the location of the Bio::Root::Root in lib path being
> used, and (2) using './Build install uninst=1' to remove any old
> bioperl/bioperl-db installations.
Unfortunately this is not an option for me.
>
> chris
>
> On Apr 17, 2008, at 11:40 AM, Stefan Kirov wrote:
>
>> Hilmar,
>> sorry, I missed the part after the stack trace... In any case this is
>> still problem for me after I updated bioperl-live.
>> I see this with a number of other tests:
>> t/04swiss.........ok 3/52Can't locate object method "get_dbxrefs" via
>> package "Bio::Ontology::Term" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 78.
>> t/04swiss.........dubious
>>        Test returned status 2 (wstat 512, 0x200)
>> DIED. FAILED tests 6-52
>>        Failed 47/52 tests, 9.62% okay
>> t/05seqfeature....ok 4/48Can't locate object method "get_dbxrefs" via
>> package "Bio::Ontology::Term" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 72.
>> t/05seqfeature....FAILED tests 9-48
>>        Failed 40/48 tests, 16.67% okay
>> t/06comment.......ok
>> t/07dblink........ok
>> t/08genbank.......ok
>> t/09fuzzy2........ok
>> t/10ensembl.......ok 1/15Can't locate object method "get_dbxrefs" via
>> package "Bio::Ontology::Term" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 1420.
>> t/10ensembl.......dubious
>>        Test returned status 2 (wstat 512, 0x200)
>> DIED. FAILED tests 3-15
>>        Failed 13/15 tests, 13.33% okay
>> t/11locuslink.....ok 4/110Can't locate object method "get_dbxrefs"
>> via package "Bio::Annotation::OntologyTerm" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 1.
>> t/11locuslink.....dubious
>>        Test returned status 2 (wstat 512, 0x200)
>> DIED. FAILED tests 5-110
>>        Failed 106/110 tests, 3.64% okay
>> t/12ontology......ok 1/738Can't locate object method "get_dbxrefs"
>> via package "Bio::Ontology::GOterm" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 98.
>> t/12ontology......dubious
>>        Test returned status 255 (wstat 65280, 0xff00)
>> DIED. FAILED tests 5-738
>>        Failed 734/738 tests, 0.54% okay
>> t/13remove........ok 2/59Can't locate object method "get_dbxrefs" via
>> package "Bio::Ontology::Term" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 145.
>> t/13remove........FAILED tests 11-59
>>        Failed 49/59 tests, 16.95% okay
>> t/14query.........ok
>> t/15cluster.......ok 3/160Can't locate object method "get_dbxrefs"
>> via package "Bio::Ontology::Term" at
>> /home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
>> line 552, <GEN0> line 1.
>> t/15cluster.......dubious
>>        Test returned status 2 (wstat 512, 0x200)
>> DIED. FAILED tests 6-160
>>        Failed 155/160 tests, 3.12% okay
>> t/16obda..........ok
>>
>> On Thu, 17 Apr 2008, Chris Fields wrote:
>>
>>>
>>> On Apr 17, 2008, at 9:47 AM, Hilmar Lapp wrote:
>>>
>>>> On Apr 17, 2008, at 10:18 AM, Stefan Kirov wrote:
>>>>> In any case I debugged and tracked that down to the RichSeq
>>>>> adaptor module missing.
>>>> That almost can't be the problem. Every Bio::Seq::RichSeq is-a
>>>> Bio::Seq and a SeqAdaptor is present.
>>>> I'm afraid it gets stuck somewhere else and frankly I didn't see
>>>> the RichSeqAdaptor failing to load in your stack trace:
>>>>
>>>>>     ------------- EXCEPTION: Bio::Root::Exception -------------
>>>>>
>>>>>  MSG: Annotation of class Bio::Annotation::Collection not
>>>>>  type-mapped. Internal error?
>>>>>  STACK: Error::throw
>>>>>  STACK: Bio::Root::Root::throw
>>>>>  /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>>>>>  STACK:
>>>>>  Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>>>>>  Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>>>>>  STACK: Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>>>>>  Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>>>>>  STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>>>>  Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>>>>  STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>>>>>  Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>>>>>  STACK: Bio::DB::Persistent::PersistentObject::store
>>>>>  Bio/DB/Persistent/PersistentObject.pm:271
>>>>>  STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>>>>>  Bio/DB/BioSQL/SeqAdaptor.pm:224
>>>>>  STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>>>>  Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>>>>  STACK: Bio::DB::Persistent::PersistentObject::create
>>>>>  Bio/DB/Persistent/PersistentObject.pm:244
>>>>>  STACK: t/04swiss.t:36
>>>>>  -----------------------------------------------------------
>>>> What that tells me is that when bioperl-db tries to store the
>>>> annotation bundle of the (SwissProt) sequence, one of the
>>>> annotations that it encounters is of type
>>>> Bio::Annotation::Collection. At present bioperl-db doesn't know
>>>> what to do with it; i.e., bioperl-db can't yet handle hierarchical
>>>> annotation collections (collections within collections).
>>>> I believe this is due to recent changes in how the GN line is
>>>> parsed in BioPerl - Chris does this ring the right bell? I thought
>>>> though you had built in a method would allow flattening out
>>>
>>> This appears to be using an older bioperl-live checkout, one where
>>> Heikki changed GN parsing to use a nested Annotation::Collection.  I
>>> reverted that back in a later commit to svn specifically b/c of
>>> bioperl-db problems. bioperl-live's swiss.pm now uses a new subclass
>>> of Bio::Annotation::SimpleValue (Bio::Annotation::TagTree) that
>>> represents nested values via Data::Stag's itext output (we can
>>> change that to alternatives if needed).
>>>
>>> Here are the last few relevant revisions in bioperl-live's main
>>> trunk (mine is the latest):
>>>
>>> ------------------------------------------------------------------------
>>>
>>> r14562 | cjfields | 2008-02-28 08:30:05 -0600 (Thu, 28 Feb 2008) | 1
>>> line
>>>
>>> bug 1825: updating swiss.pm/tests to try out TagTree (passes all
>>> tests). Need to update Handler.t and related modules still...
>>> ------------------------------------------------------------------------
>>>
>>> r14541 | heikki | 2008-02-25 00:10:48 -0600 (Mon, 25 Feb 2008) | 1 line
>>>
>>> documentation for the GN line parsing and management
>>> ------------------------------------------------------------------------
>>>
>>> r14538 | heikki | 2008-02-23 08:48:23 -0600 (Sat, 23 Feb 2008) | 1 line
>>>
>>> GN (Gene Name) line parsing rewrite. Breaks backward compatibility.
>>> Can now deal with >1 gene per entry and four categories of names per
>>> gene. Parses old style syntax (...OR ... OR ... ) into one gene name
>>> and synonyms for each gene. Docs to follow.
>>>
>>> ....
>>>
>>> I just updated all code from dev and reran bioperl-db tests w/o
>>> problems. Maybe someone else could do the same to see what happens?
>>>
>>>> It's worth noting that BioSQL itself can't really represent nested
>>>> annotation collections other than by using ontology terms and their
>>>> hierarchy, which at present I think isn't really appropriate, but I
>>>> have to think through the issue more. In other words, in BioSQL you
>>>> can't directly tie together a bunch of qualifier value pairs into a
>>>> "bag" and then nest this bag within another. The way to make this
>>>> work with the current schema is to flatten out the nesting.
>>>>
>>>>     -hilmar
>>>> --===========================================================
>>>> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
>>>> ===========================================================
>>>
>>> Might be worth looking into for a future BioSQL release, but we have
>>> a decent workaround in place for now, as long as it works
>>> cross-platform and cross-RDB.
>>>
>>> chris
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
>


From hubert.gaynor at yahoo.com  Thu Apr 17 20:53:16 2008
From: hubert.gaynor at yahoo.com (Hubert Gaynor)
Date: Thu, 17 Apr 2008 17:53:16 -0700 (PDT)
Subject: [Bioperl-l] Can I use BLAST against a database like MySQL
Message-ID: <130971.67684.qm@web46007.mail.sp1.yahoo.com>

Hi Sean,

I got it. Thank you so much!

Hubert

----- Original Message ----
From: Sean Davis <sdavis2 at mail.nih.gov>
To: Hubert Gaynor <hubert.gaynor at yahoo.com>
Sent: Thursday, April 17, 2008 6:36:02 PM
Subject: Re: [Bioperl-l] Can I use BLAST against a database like MySQL

On Thu, Apr 17, 2008 at 2:19 AM, Hubert Gaynor <hubert.gaynor at yahoo.com> wrote:
> Hi,
>
>  As far as I know, before using BLAST to do the alignment the first thing should be done is typing formatdb to construct a database. But I was wondering whether it is possible to construct a database with MySQL which probably will grant the BLAST search a higher speed and make the database management much easier?
>

formatdb is used to make a representation that can be used efficiently
by blast.  That representation already makes blast faster.  MySQL
can't be used for such things.  As for speeding blast, if you have a
multiprocessor machine, you can take advantage of those using blast
and increasing the number of processors.  Also, while blast is a very
versatile program, it is not the only alignment program available.
Depending on your needs, you could look at other programs such as blat
or gmap that can be 2-3 orders of magnitude faster than blast.

Sean




From Russell.Smithies at agresearch.co.nz  Thu Apr 17 21:39:23 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Fri, 18 Apr 2008 13:39:23 +1200
Subject: [Bioperl-l] accessing params for custom glyphs?
In-Reply-To: <130971.67684.qm@web46007.mail.sp1.yahoo.com>
References: <130971.67684.qm@web46007.mail.sp1.yahoo.com>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06C75E14@imail.agresearch.co.nz>

This is probably more of a Perl OO problem I'm having, but can anyone
tell me how to access a parameter when I create a custom glyph?

I've created a panel in the usual way then I add a feature with
'my_glyph' and want to pass the value of -new_parameter to the glyph
drawing code.

    $panel->add_track( $feature,
                       -font          => gdSmallFont,
                       -glyph         => 'my_glyph',
                       -height        => 10,
                       -label         => 1,
                       -strand        => "forward",
                       -new_parameter => "test",
                     );


In my_glyph.pm, I have the usual draw_component sub:

sub draw_component {
  my $self = shift;
  my $gd = shift;
  my ($x1,$y1,$x2,$y2) = $self->bounds(@_);
  my $fg = $self->fgcolor;
  my $params = $self->??????????   <<--- how do I access the value of
"new_parameter" set in the panel drawing code?

  $gd->line($x1,$y1,$x2,$y2,$fg);
  $gd->line($x1,$y2,$x2,$y1,$fg);

}

Any ideas?

Thanx,

Russell	Smithies			
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From David.Messina at sbc.su.se  Fri Apr 18 05:31:59 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 18 Apr 2008 11:31:59 +0200
Subject: [Bioperl-l]  Finding seqs of given domain architecture
In-Reply-To: <628aabb70804170155n4e5dfd81r7020c3e9e11094ff@mail.gmail.com>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
	<B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
	<628aabb70804161112o6610ee1fkfb4b08e74730237d@mail.gmail.com>
	<1208420674.23342.15.camel@razor.sbc.su.se>
	<628aabb70804170155n4e5dfd81r7020c3e9e11094ff@mail.gmail.com>
Message-ID: <628aabb70804180231p2b9cef9dwd5441e85c31531fd@mail.gmail.com>

Jacob,

I talked about your question with a colleague of mine who has been working
in this area. Below is his reply.

[I'm reposting this *without* the attachment mentioned since the mailing
list wouldn't accept it otherwise. If anyone wants a copy of the code, just
email me.]

Dave

-------

> 3. Pfam has this capability, i.e. to show all domains with a given
> architecture, but it is difficult to get at the actual sequences or
> even a list of accession numbers.

First, this should be available right away in PfamAlyzer:

http://pfamalyzer.sbc.su.se/pfamalyzer/index.html

although you might need to upgrade your browser to Java 1.6 to get it to
work.

If this does not work as suggested (an upgraded version is coming
eventually), have a look at the file:

ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/swisspfam.gz

which contains the Pfam architectures for all UniProt sequences. You can
parse that to get a file of <accession number>-<list of domain>
correspondences and just filter that to get the accession numbers.
(Please find attached a Perl script to do just that.)

Under UNIX, you can then just grep this for the domain IDs,

(like grep PF00008 domainArchitectureFile.txt | grep PF00456 >
resultFile.txt)

but I am sure there are solutions under other operating systems as well.
You could then write a script to parse out the corresponding sequences
from the UniProt fasta flatfile, if you wanted, or (again under UNIX) a
script to wget them off the webpage.

In case your sequences are not in UniProt, consider using HMMER and the
Pfam HMM files to assign domains to all sequences in your dataset. I
would then parse the HMMER output into the same format as the above, and
use the same approach following that.
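
For anyone not on UNIX, the filtering step can be sketched in a few lines of Perl. This is only an illustration under an assumed file layout: one line per sequence, the accession first, then its Pfam domain IDs separated by whitespace. That layout, the function name accessions_with_domains, and the accessions in the example data are all made up here; the real output of the parsing script may look different.

```perl
use strict;
use warnings;

# Return the accessions whose domain list contains every required Pfam ID.
# Assumed (hypothetical) line format: "<accession> <PFxxxxx> <PFxxxxx> ..."
sub accessions_with_domains {
    my ($lines, @required) = @_;
    my @hits;
    for my $line (@$lines) {
        my ($acc, @domains) = split ' ', $line;
        next unless defined $acc;              # skip blank lines
        my %seen = map { $_ => 1 } @domains;
        # keep the accession only if no required ID is missing
        push @hits, $acc unless grep { !$seen{$_} } @required;
    }
    return @hits;
}

# Invented example data, for illustration only
my @example = (
    'P00001 PF00008 PF00456',
    'P00002 PF00008',
    'P00003 PF00456 PF00069 PF00008',
);
print "$_\n" for accessions_with_domains(\@example, 'PF00008', 'PF00456');
```

Unlike the chained grep, this compares whole domain IDs rather than substrings, and it ignores domain order, so it selects sequences that contain all the listed domains somewhere in their architecture.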

Hope this helps,

Yours sincerely,

Kristoffer Forslund
krifo at sbc.su.se


From lincoln.stein at gmail.com  Fri Apr 18 15:16:19 2008
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Fri, 18 Apr 2008 15:16:19 -0400
Subject: [Bioperl-l] [Gmod-gbrowse] accessing params for custom glyphs?
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06C75E14@imail.agresearch.co.nz>
References: <130971.67684.qm@web46007.mail.sp1.yahoo.com>
	<D5DBA313349A4B458528BE63B387F36C06C75E14@imail.agresearch.co.nz>
Message-ID: <6dce9a0b0804181216q6564e580u8a805ae96c78df2e@mail.gmail.com>

Hi Russell,

It's very simple:

   my $params = $self->option('new_parameter');

Lincoln

On Thu, Apr 17, 2008 at 9:39 PM, Smithies, Russell <
Russell.Smithies at agresearch.co.nz> wrote:

> This is probably more of a Perl OO problem I'm having, but can anyone
> tell me how to access a parameter when I create a custom glyph?
>
> I've created a panel in the usual way then I add a feature with
> 'my_glyph' and want to pass the value of -new_parameter to the glyph
> drawing code.
>
>    $panel->add_track( $feature,
>                        -font => gdSmallFont,
>                        -glyph => 'my_glyph' ,
>                        -height => 10,
>                                -label  => 1,
>                                -strand => "forward",
>                                -new_parameter => "test",
>
>
> In my_glyph.pm, I have the usual draw_component sub:
>
> sub draw_component {
>  my $self = shift;
>  my $gd = shift;
>  my ($x1,$y1,$x2,$y2) = $self->bounds(@_);
>  my $fg = $self->fgcolor;
>  my $params = $self->??????????   <<--- how do I access the value of
> "new_parameter" set in the panel drawing code?
>
>  $gd->line($x1,$y1,$x2,$y2,$fg);
>  $gd->line($x1,$y2,$x2,$y1,$fg);
>
> }
>
> Any ideas?
>
> Thanx,
>
> Russell Smithies
> _______________________________________________
> Gmod-gbrowse mailing list
> Gmod-gbrowse at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From jason at bioperl.org  Fri Apr 18 22:35:10 2008
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 18 Apr 2008 19:35:10 -0700
Subject: [Bioperl-l] index::abstract on win and unix
In-Reply-To: <1208381947.16620.6.camel@kiss-laptop>
References: <1208366718.19084.15.camel@kiss-laptop>
	<D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
	<1208381947.16620.6.camel@kiss-laptop>
Message-ID: <A30B8E06-131C-445F-B692-92CAB845B13B@bioperl.org>

Do you want the LOCUS or the ACCESSION?
Do you mean the result is the completely wrong record or just the
wrong field?
The accession number is available from the seq's accession_number() method.
-jason
On Apr 16, 2008, at 2:39 PM, Frédéric Romagné wrote:

> Well, if with input file you mean the database used, it's created
> with Bio::Index::GenBank from a ncbi FTP's genbank file.
>
> $id is an accession number read from a file but i chomp the line...
>
> I am trying to install the svn version of bioperl under windows to see
> if there is an improvement.
>
> On Thursday 17 April 2008 at 08:49 +1200, Smithies, Russell wrote:
>> Did you check the format of your input file?
>> i.e. DOS or UNIX line endings?
>>
>>> -----Original Message-----
>>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- 
>>> bounces at lists.open-
>>> bio.org] On Behalf Of Frédéric Romagné
>>> Sent: Thursday, 17 April 2008 5:25 a.m.
>>> To: bioperl-l at lists.open-bio.org
>>> Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
>>>
>>> Hello,
>>> i made a program which use Bio::Index::GenBank and i tested it under
>>> unix, that worked well.
>>>
>>> But i have to launch it under windows and it seems not to work on.
>>>
>>> Here is the problem :
>>>
>>> my $dbobj = Bio::Index::Abstract->new("Data/$db");
>>> my $seq = $dbobj->get_Seq_by_acc($id);
>>> print $seq->display_id."\n";
>>>
>>> did not print the same number than $id !!! So i don't work on the
>>> sequence expected...
>>>
>>> I use the SVN sources on unix and the Perl package manager for
>>> windows...
>>>
>>> Thanks.
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From bioperlanand at yahoo.com  Mon Apr 21 03:44:00 2008
From: bioperlanand at yahoo.com (Anand Venkatraman)
Date: Mon, 21 Apr 2008 00:44:00 -0700 (PDT)
Subject: [Bioperl-l] a question on obtaining HTML formatted Blast output
	along with the Blast hits image
Message-ID: <372845.37134.qm@web36808.mail.mud.yahoo.com>


 Hi everybody,

I would like to obtain a HTML formatted blast report output along with a picture of the blast hits as shown on Slide 60 in this pdf: http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf

I have gotten the HTML output working using "Bio::SearchIO::Writer::HTMLResultWriter".

My question: How do I integrate it with Bio::Graphics to render the blast hits image at the correct position in my BioPerl-reformatted HTML file?

I ultimately want to be able to display my blast output files on a browser. 

Here is my code so far:
----------------------------------------------------------------
#!/usr/bin/perl -w
# usage: $0 <blast_report>
use strict;
use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;

my $infile = shift or die "usage: $0 <blast_report>\n";

my $searchio   = Bio::SearchIO->new(-format => 'blast', -file => $infile);
my $writerhtml = Bio::SearchIO::Writer::HTMLResultWriter->new();
my $outhtml    = Bio::SearchIO->new(-writer => $writerhtml,
                                    -file   => ">${infile}.html");

# Write the first result in the report as HTML
$outhtml->write_result($searchio->next_result);
----------------------------------------------------------------

Thanks in advance,

Anand



From cjfields at uiuc.edu  Mon Apr 21 11:07:17 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 21 Apr 2008 10:07:17 -0500
Subject: [Bioperl-l] [Proposed change] HSP::frame()
Message-ID: <ACE26E05-7C02-46E3-B973-E0529C0A0DEA@uiuc.edu>

I have noticed (in relation to bug 2485, http://bugzilla.open-bio.org/show_bug.cgi?id=2485)
that the Bio::Search::HSP::GenericHSP frame() method is implemented
very differently from strand(), start(), end(), and most other HSP
methods.  The current behavior depends on both the arguments and the
calling context: in list context it returns an array of two values
(query and hit frame); in scalar context it returns the query frame
if one value is passed and the subject frame if no value is passed.
The latter behavior is unfortunately what leads to the bug mentioned
above.  The method is also implied to be a getter/setter, but the
implementation doesn't allow that; it always sets to the instantiated
values (in fact, repeatedly so).

In order to fix that and make the interface more consistent I am  
changing frame() to behave like strand(), etc., in that the first  
argument is 'query/subject/hit/list' (default = 'query' if no arg  
specified) and the rest optional values for setting, in query/subject  
order.
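
A minimal sketch of the calling convention described above, using a
hypothetical toy class (ToyHSP) rather than the real GenericHSP code.
The constructor arguments, the 'sbjct' alias, and the rule that the
optional values are always taken in query, hit order are assumptions
made for illustration only:

```perl
package ToyHSP;    # hypothetical stand-in, not Bio::Search::HSP::GenericHSP
use strict;
use warnings;

sub new {
    my ($class, %args) = @_;
    return bless {
        query_frame => $args{-query_frame},
        hit_frame   => $args{-hit_frame},
    }, $class;
}

# frame($type, $query_frame, $hit_frame)
#   $type is 'query' (default), 'hit'/'subject', or 'list'.
#   The optional values act as setters, in query, hit order.
sub frame {
    my ($self, $type, $qframe, $hframe) = @_;
    $type = defined $type ? lc $type : 'query';
    $type = 'hit' if $type eq 'subject' || $type eq 'sbjct';

    $self->{query_frame} = $qframe if defined $qframe;
    $self->{hit_frame}   = $hframe if defined $hframe;

    return ($self->{query_frame}, $self->{hit_frame}) if $type eq 'list';
    return $self->{hit_frame} if $type eq 'hit';
    return $self->{query_frame};    # 'query' and anything unrecognised
}

package main;

my $hsp = ToyHSP->new(-query_frame => 1, -hit_frame => 2);
print "query frame: ", $hsp->frame,        "\n";
print "hit frame:   ", $hsp->frame('hit'), "\n";
my ($q, $h) = $hsp->frame('list');
print "both:        $q, $h\n";
```

The point of the sketch is that the return value depends only on the
first argument, never on the calling context.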

One issue: I can catch and imitate most of the older behavior with a  
few additional checks, the one exception being the old frame() default  
return value which is now 'query' (not context-dependent).  If needed  
we can change the default to 'hit', but I believe method consistency  
is probably the better route, and I can always add a warning under old  
API circumstances indicating the change.

I am also modifying HSPTableWriter to print frame_hit and frame_query  
(previously it was only printing 'frame', which implied the hit  
frame).  I can see this being an issue with anyone expecting 'frame'  
instead of 'frame_hit';  I could hack in a fix for that if needed.

If there aren't any objections or suggestions, I'll commit this in the  
next day or two.

chris


From cjfields at uiuc.edu  Mon Apr 21 11:32:59 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 21 Apr 2008 10:32:59 -0500
Subject: [Bioperl-l] Assembly.t test fails
Message-ID: <ABC6AB22-0AFD-4977-97DD-E2AE507E0330@uiuc.edu>

I'm getting some significant test failures in bioperl-live for  
Bio::Assembly:

t/Assembly......
1..35
ok 1 - use Bio::Assembly::IO;
ok 2 - The object isa Bio::Assembly::IO
ok 3 - The object isa Bio::Assembly::Scaffold
ok 4
not ok 5
ok 6 - The object isa Bio::AnnotationCollectionI
ok 7 - no annotations in Annotation collection?
ok 8

#   Failed test at t/Assembly.t line 35.
#          got: 'NoName'
#     expected: 'test'
Can't locate object method "get_contig_seq_ids" via package  
"Bio::Assembly::Contig" at /Users/cjfields/bioperl/bioperl-live/blib/ 
lib/Bio/Assembly/Scaffold.pm line 189, <GEN0> line 733.
# Looks like you planned 35 tests but only ran 8.
# Looks like you failed 1 test of 8 run.
# Looks like your test died just after 8.
  Dubious, test returned 255 (wstat 65280, 0xff00)
  Failed 28/35 subtests

Test Summary Report
-------------------
t/Assembly.t (Wstat: 65280 Tests: 8 Failed: 1)
   Failed test:  5
   Non-zero exit status: 255
   Parse errors: Bad plan.  You planned 35 tests but ran 8.
Files=1, Tests=8,  0 wallclock secs ( 0.01 usr  0.00 sys +  0.22 cusr   
0.04 csys =  0.27 CPU)
Result: FAIL
Failed 1/1 test programs. 1/8 subtests failed.


chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Apr 21 11:44:21 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 21 Apr 2008 10:44:21 -0500
Subject: [Bioperl-l] Assembly.t test fails
In-Reply-To: <ABC6AB22-0AFD-4977-97DD-E2AE507E0330@uiuc.edu>
References: <ABC6AB22-0AFD-4977-97DD-E2AE507E0330@uiuc.edu>
Message-ID: <2F199628-717E-4F88-85D7-408BD7BBE16D@uiuc.edu>

Scratch that, figured it out (easy fix).

chris

On Apr 21, 2008, at 10:32 AM, Chris Fields wrote:

> I'm getting some significant test failures in bioperl-live for  
> Bio::Assembly:
>
> t/Assembly......
> 1..35
> ok 1 - use Bio::Assembly::IO;
> ok 2 - The object isa Bio::Assembly::IO
> ok 3 - The object isa Bio::Assembly::Scaffold
> ok 4
> not ok 5
> ok 6 - The object isa Bio::AnnotationCollectionI
> ok 7 - no annotations in Annotation collection?
> ok 8
>
> #   Failed test at t/Assembly.t line 35.
> #          got: 'NoName'
> #     expected: 'test'
> Can't locate object method "get_contig_seq_ids" via package  
> "Bio::Assembly::Contig" at /Users/cjfields/bioperl/bioperl-live/blib/ 
> lib/Bio/Assembly/Scaffold.pm line 189, <GEN0> line 733.
> # Looks like you planned 35 tests but only ran 8.
> # Looks like you failed 1 test of 8 run.
> # Looks like your test died just after 8.
> Dubious, test returned 255 (wstat 65280, 0xff00)
> Failed 28/35 subtests
>
> Test Summary Report
> -------------------
> t/Assembly.t (Wstat: 65280 Tests: 8 Failed: 1)
>  Failed test:  5
>  Non-zero exit status: 255
>  Parse errors: Bad plan.  You planned 35 tests but ran 8.
> Files=1, Tests=8,  0 wallclock secs ( 0.01 usr  0.00 sys +  0.22  
> cusr  0.04 csys =  0.27 CPU)
> Result: FAIL
> Failed 1/1 test programs. 1/8 subtests failed.
>
>
> chris
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From frederic.romagne at gmail.com  Mon Apr 21 11:53:11 2008
From: frederic.romagne at gmail.com (=?ISO-8859-1?Q?Fr=E9d=E9ric_Romagn=E9?=)
Date: Mon, 21 Apr 2008 10:53:11 -0500
Subject: [Bioperl-l] index::abstract on win and unix
In-Reply-To: <A30B8E06-131C-445F-B692-92CAB845B13B@bioperl.org>
References: <1208366718.19084.15.camel@kiss-laptop>
	<D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
	<1208381947.16620.6.camel@kiss-laptop>
	<A30B8E06-131C-445F-B692-92CAB845B13B@bioperl.org>
Message-ID: <1208793191.25906.9.camel@kiss-laptop>

In fact, I want the whole Bio::Seq object, but I verified that the
ACCESSION and the LOCUS are the same in my GenBank files.
I saw that the program sometimes reports that it cannot find the entry:

 if( !defined $seq ) {
	warn("Sequence $id in Database $db is not present\n");
    }

I suspect that it is the make_index function, rather than the
get_Seq_by_acc function, that does not work properly on Windows...
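
The question of DOS versus UNIX line endings, raised in the quoted
messages below, can be checked with plain Perl before looking at the
index itself. This is only a sketch: count_line_endings is a
hypothetical helper, not part of BioPerl, and it says nothing about
whether line endings are the actual cause here.

```perl
use strict;
use warnings;

# Hypothetical helper: count CRLF-terminated and LF-only lines in a file.
sub count_line_endings {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;    # keep Perl on Windows from translating CRLF to LF
    my ($crlf, $lf) = (0, 0);
    while (my $line = <$fh>) {
        if    ($line =~ /\r\n\z/) { $crlf++ }
        elsif ($line =~ /\n\z/)   { $lf++ }
    }
    close $fh;
    return ($crlf, $lf);
}

# Usage: perl check_endings.pl some_file.gb
if (@ARGV) {
    my ($crlf, $lf) = count_line_endings($ARGV[0]);
    print "$ARGV[0]: $crlf CRLF lines, $lf LF-only lines\n";
}
```

Running it on the same GenBank file on both machines would show
whether the file was converted on the way to Windows.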

On Friday, 18 April 2008 at 19:35 -0700, Jason Stajich wrote:
> do you want the LOCUS or the ACCESSION?
> Do you mean the result is the completely wrong record or just the  
> wrong field?
> accession number is available from the seq's accession_number() method.
> -jason
> On Apr 16, 2008, at 2:39 PM, Frédéric Romagné wrote:
> 
> > Well, if with input file you mean the database used, it's created
> > with Bio::Index::GenBank from a ncbi FTP's genbank file.
> >
> > $id is an accession number read from a file but i chomp the line...
> >
> > I am trying to install the svn version of bioperl under windows to see
> > if there is an improvement.
> >
> > On Thursday, 17 April 2008 at 08:49 +1200, Smithies, Russell wrote:
> >> Did you check the format of your input file?
> >> i.e. DOS or UNIX line endings?
> >>
> >>> -----Original Message-----
> >>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l- 
> >>> bounces at lists.open-
> >>> bio.org] On Behalf Of Frédéric Romagné
> >>> Sent: Thursday, 17 April 2008 5:25 a.m.
> >>> To: bioperl-l at lists.open-bio.org
> >>> Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
> >>>
> >>> Hello,
> >>> i made a program which use Bio::Index::GenBank and i tested it under
> >>> unix, that worked well.
> >>>
> >>> But i have to launch it under windows and it seems not to work on.
> >>>
> >>> Here is the problem :
> >>>
> >>> my $dbobj = Bio::Index::Abstract->new("Data/$db");
> >>> my $seq = $dbobj->get_Seq_by_acc($id);
> >>> print $seq->display_id."\n";
> >>>
> >>> did not print the same number than $id !!! So i don't work on the
> >>> sequence expected...
> >>>
> >>> I use the SVN sources on unix and the Perl package manager for
> >>> windows...
> >>>
> >>> Thanks.
> >>>
> >>> _______________________________________________
> >>> Bioperl-l mailing list
> >>> Bioperl-l at lists.open-bio.org
> >>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >>
> >> ===================================================================== 
> >> ==
> >> Attention: The information contained in this message and/or  
> >> attachments
> >> from AgResearch Limited is intended only for the persons or entities
> >> to which it is addressed and may contain confidential and/or  
> >> privileged
> >> material. Any review, retransmission, dissemination or other use  
> >> of, or
> >> taking of any action in reliance upon, this information by persons or
> >> entities other than the intended recipients is prohibited by  
> >> AgResearch
> >> Limited. If you have received this message in error, please notify  
> >> the
> >> sender immediately.
> >> ===================================================================== 
> >> ==
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From ewijaya at gmail.com  Tue Apr 22 10:03:07 2008
From: ewijaya at gmail.com (Edward Wijaya)
Date: Tue, 22 Apr 2008 22:03:07 +0800
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
Message-ID: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>

Hi,

Is there any module that can parse the following BLAT
output? It is taken from the UCSC browser.

The idea is to parse it and then extract the conserved block
of aligned sequences.


__DATA__
Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
B D   D. melanogaster
tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
B D       D. simulans
tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
B D      D. sechellia
tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
B D         D. yakuba
tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
            D. erecta
tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
         D. ananassae
taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
     D. pseudoobscura
tata----ccagtacac-cttatatg------------tttttaaata--------------------
B D     D. persimilis
tata----ccagtacac-attatatg------------tttttaaata--------------------
        D. willistoni
aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
           D. virilis
-------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
        D. mojavensis
-------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
         D. grimshawi
====================================================================
         T. castaneum
====================================================================

Inserts between block 3 and 4 in window
    D. pseudoobscura 2008bp
B D    D. persimilis 1421bp
          D. virilis 5bp
       D. mojavensis 4640bp

Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
B D   D. melanogaster
----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
B D       D. simulans
----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
B D      D. sechellia
----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
B D         D. yakuba
----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
            D. erecta
----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
     D. pseudoobscura
====================================================================
B D     D. persimilis
====================================================================
        D. willistoni
----aggattacgaagttcctttat-------------------aaag--------------------
           D. virilis
gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
        D. mojavensis
====================================================================
         D. grimshawi
====================================================================
         T. castaneum
====================================================================

__ END__


From cjfields at uiuc.edu  Tue Apr 22 10:22:45 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 22 Apr 2008 09:22:45 -0500
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
In-Reply-To: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
References: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
Message-ID: <766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>

A quick grep of bioperl-live gets me Bio::SearchIO::blast,  
Bio::SearchIO::axt, Bio::SearchIO::psl, Bio::Tools::Blat, and  
Bio::Tools::WebBlat.  Haven't looked at the docs but it's a start!

chris

On Apr 22, 2008, at 9:03 AM, Edward Wijaya wrote:

> Hi,
>
> Is there any module that can parse the following output
> of BLAT. This is taken from UCSC browser.
>
> The idea is to parse it and then extract the conserved block
> of aligned sequences.
>
>
> __DATA__
> Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
> B D   D. melanogaster
> tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
> B D       D. simulans
> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
> B D      D. sechellia
> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
> B D         D. yakuba
> tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
>            D. erecta
> tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
>         D. ananassae
> taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
>     D. pseudoobscura
> tata----ccagtacac-cttatatg------------tttttaaata--------------------
> B D     D. persimilis
> tata----ccagtacac-attatatg------------tttttaaata--------------------
>        D. willistoni
> aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
>           D. virilis
> -------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
>        D. mojavensis
> -------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
>         D. grimshawi
> ====================================================================
>         T. castaneum
> ====================================================================
>
> Inserts between block 3 and 4 in window
>    D. pseudoobscura 2008bp
> B D    D. persimilis 1421bp
>          D. virilis 5bp
>       D. mojavensis 4640bp
>
> Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
> B D   D. melanogaster
> ----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
> B D       D. simulans
> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
> B D      D. sechellia
> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
> B D         D. yakuba
> ----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
>            D. erecta
> ----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
>     D. pseudoobscura
> ====================================================================
> B D     D. persimilis
> ====================================================================
>        D. willistoni
> ----aggattacgaagttcctttat-------------------aaag--------------------
>           D. virilis
> gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
>        D. mojavensis
> ====================================================================
>         D. grimshawi
> ====================================================================
>         T. castaneum
> ====================================================================
>
> __ END__
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Tue Apr 22 14:49:32 2008
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Apr 2008 11:49:32 -0700
Subject: [Bioperl-l] Fwd: [blast-announce] New BLAST URL available at the
	NCBI
References: <EEEED756EF6626469B10653F745014389BAEAD@NIHCESMLBX15.nih.gov>
Message-ID: <F63EB743-F1FF-4612-B7D6-0EA1F73F487C@bioperl.org>

Does anyone want to take a look at how to use these URLs in the  
RemoteBlast module, if the interface is the same?

-jason

Begin forwarded message:

> From: "Mcginnis, Scott (NIH/NLM/NCBI) [E]" <mcginnis at ncbi.nlm.nih.gov>
> Date: April 22, 2008 11:35:04 AM PDT
> To: <blast-announce at ncbi.nlm.nih.gov>
> Subject: [blast-announce] New BLAST URL available at the NCBI
>
> New BLAST URL available at the NCBI
>
>
>
> The NCBI has activated a new URL for BLAST searches at the NCBI:
> http://blast.ncbi.nlm.nih.gov.
>
>
>
> Searches sent to this URL can take advantage of a larger number of
> machines for searches and the system has a better overall fault
> tolerance.
>
>
>
> We recommend migration of all BLAST links and bookmarks (e.g.,
> http://www.ncbi.nlm.nih.gov/BLAST/ and
> http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) to the new URL.
>
>
>
> Links on the NCBI and BLAST home pages will start to change in the
> coming weeks.
>
>
>
> At this point in time the plans are to also maintain the current BLAST
> URL.
>
>
>
>
>


From jason at bioperl.org  Tue Apr 22 14:51:08 2008
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Apr 2008 11:51:08 -0700
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
In-Reply-To: <766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>
References: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
	<766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>
Message-ID: <6C812413-B375-427B-9AF8-5A0AA6167CC8@bioperl.org>

If you get it as axt it should parse fine in SearchIO, but that is
pairwise. As for these alignment blocks, I can't remember what
format this is from UCSC.
MSAs are going to be better handled through Bio::AlignIO, though, so it
might be better to build a parser on that.
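
As a starting point for such a parser, here is a hand-rolled sketch in
plain Perl that splits the posted text into blocks. It is not a
Bio::AlignIO module and uses no BioPerl calls. The function name
parse_alignment_blocks and the returned data layout are hypothetical,
and dropping rows made up only of '=' is an assumption about what is
wanted:

```perl
use strict;
use warnings;

# Hypothetical helper: turn "Alignment block ..." text into a list of
# { index, start, end, rows => [ [species, sequence], ... ] } hashes.
sub parse_alignment_blocks {
    my ($text) = @_;
    my (@blocks, $block, $species);
    for my $line (split /\r?\n/, $text) {
        $line =~ s/^\s+|\s+$//g;
        next unless length $line;
        if ($line =~ /^Alignment block (\d+) of \d+ in window, (\d+) - (\d+), \d+ bps$/) {
            $block = { index => $1, start => $2, end => $3, rows => [] };
            push @blocks, $block;
            undef $species;
        }
        elsif ($line =~ /^Inserts between block/) {
            undef $block;    # insert summaries are ignored in this sketch
        }
        elsif (!$block) {
            next;            # anything outside a block is skipped
        }
        elsif ($line =~ /^[ACGTNacgtn=-]+$/) {
            # Sequence row: keep it unless it consists only of '='.
            push @{ $block->{rows} }, [ $species, $line ]
                if defined $species && $line =~ /[^=]/;
            undef $species;
        }
        else {
            # Species label, with the optional leading "B D" flags removed.
            ($species = $line) =~ s/^(?:[BD]\s+)+//;
        }
    }
    return @blocks;
}

my $sample = <<'END';
Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
B D   D. melanogaster
----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
     D. pseudoobscura
====================================================================
END

for my $blk (parse_alignment_blocks($sample)) {
    print "block $blk->{index}: $blk->{start} to $blk->{end}\n";
    printf "  %-18s %s\n", @$_ for @{ $blk->{rows} };
}
```

Each block's rows could then be handed to whatever alignment object is
preferred; that step is left out because it depends on the BioPerl
version in use.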

On Apr 22, 2008, at 7:22 AM, Chris Fields wrote:

> A quick grep of bioperl-live gets me Bio::SearchIO::blast,  
> Bio::SearchIO::axt, Bio::SearchIO::psl, Bio::Tools::Blat, and  
> Bio::Tools::WebBlat.  Haven't looked at the docs but it's a start!
>
> chris
>
> On Apr 22, 2008, at 9:03 AM, Edward Wijaya wrote:
>
>> Hi,
>>
>> Is there any module that can parse the following output
>> of BLAT. This is taken from UCSC browser.
>>
>> The idea is to parse it and then extract the conserved block
>> of aligned sequences.
>>
>>
>> __DATA__
>> Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
>> B D   D. melanogaster
>> tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
>> B D       D. simulans
>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
>> B D      D. sechellia
>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
>> B D         D. yakuba
>> tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
>>            D. erecta
>> tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
>>         D. ananassae
>> taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
>>     D. pseudoobscura
>> tata----ccagtacac-cttatatg------------tttttaaata--------------------
>> B D     D. persimilis
>> tata----ccagtacac-attatatg------------tttttaaata--------------------
>>        D. willistoni
>> aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
>>           D. virilis
>> -------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
>>        D. mojavensis
>> -------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
>>         D. grimshawi
>> ====================================================================
>>         T. castaneum
>> ====================================================================
>>
>> Inserts between block 3 and 4 in window
>>    D. pseudoobscura 2008bp
>> B D    D. persimilis 1421bp
>>          D. virilis 5bp
>>       D. mojavensis 4640bp
>>
>> Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
>> B D   D. melanogaster
>> ----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
>> B D       D. simulans
>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>> B D      D. sechellia
>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>> B D         D. yakuba
>> ----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
>>            D. erecta
>> ----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
>>     D. pseudoobscura
>> ====================================================================
>> B D     D. persimilis
>> ====================================================================
>>        D. willistoni
>> ----aggattacgaagttcctttat-------------------aaag--------------------
>>           D. virilis
>> gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
>>        D. mojavensis
>> ====================================================================
>>         D. grimshawi
>> ====================================================================
>>         T. castaneum
>> ====================================================================
>>
>> __ END__
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Tue Apr 22 15:02:14 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 22 Apr 2008 14:02:14 -0500
Subject: [Bioperl-l] Fwd: [blast-announce] New BLAST URL available at
	the NCBI
In-Reply-To: <F63EB743-F1FF-4612-B7D6-0EA1F73F487C@bioperl.org>
References: <EEEED756EF6626469B10653F745014389BAEAD@NIHCESMLBX15.nih.gov>
	<F63EB743-F1FF-4612-B7D6-0EA1F73F487C@bioperl.org>
Message-ID: <13C2AD96-8297-40DD-ADCC-B2BEC923B9E0@uiuc.edu>

They work exactly the same as the old URL, at least on the surface; I  
haven't tried changing many URLAPI parameters.  I went ahead and  
changed the URL in RemoteBlast to http://blast.ncbi.nlm.nih.gov/Blast.cgi 
  as it works with RemoteBlast.t.

chris

On Apr 22, 2008, at 1:49 PM, Jason Stajich wrote:

> Does anyone want to take a look at how to use these URLs in the  
> RemoteBlast module, if the interface is the same?
>
> -jason
>
> Begin forwarded message:
>
>> From: "Mcginnis, Scott (NIH/NLM/NCBI) [E]"  
>> <mcginnis at ncbi.nlm.nih.gov>
>> Date: April 22, 2008 11:35:04 AM PDT
>> To: <blast-announce at ncbi.nlm.nih.gov>
>> Subject: [blast-announce] New BLAST URL available at the NCBI
>>
>> New BLAST URL available at the NCBI
>>
>>
>>
>> The NCBI has activated a new URL for BLAST searches at the NCBI:
>> http://blast.ncbi.nlm.nih.gov.
>>
>>
>>
>> Searches sent to this URL can take advantage of a larger number of
>> machines for searches and the system has a better overall fault
>> tolerance.
>>
>>
>>
>> We recommend migration of all BLAST links and bookmarks (e.g.,
>> http://www.ncbi.nlm.nih.gov/BLAST/ and
>> http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) to the new URL.
>>
>>
>>
>> Links on the NCBI and BLAST home pages will start to change in the
>> coming weeks.
>>
>>
>>
>> At this point in time the plans are to also maintain the current  
>> BLAST
>> URL.
>>
>>
>>
>>
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Apr 22 14:58:40 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 22 Apr 2008 13:58:40 -0500
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
In-Reply-To: <6C812413-B375-427B-9AF8-5A0AA6167CC8@bioperl.org>
References: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
	<766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>
	<6C812413-B375-427B-9AF8-5A0AA6167CC8@bioperl.org>
Message-ID: <43344C89-6B4D-4360-AF56-A6FDD065FFF3@uiuc.edu>

Related to that, I have thought about building a parser for some of  
the query-anchored alignments produced by blastall, just haven't had  
time to devote to it.  One of these days...

chris

On Apr 22, 2008, at 1:51 PM, Jason Stajich wrote:

> if you get it as axt it should parse fine in SearchIO but that is  
> pairwise, if you can get an alignment blocks I can't remember what  
> format this is from UCSC.
> MSAs are going to be better handed through Bio::AlignIO though so it  
> might be better to build a parser on that.
>
> On Apr 22, 2008, at 7:22 AM, Chris Fields wrote:
>
>> A quick grep of bioperl-live gets me Bio::SearchIO::blast,  
>> Bio::SearchIO::axt, Bio::SearchIO::psl, Bio::Tools::Blat, and  
>> Bio::Tools::WebBlat.  Haven't looked at the docs but it's a start!
>>
>> chris
>>
>> On Apr 22, 2008, at 9:03 AM, Edward Wijaya wrote:
>>
>>> Hi,
>>>
>>> Is there any module that can parse the following output
>>> of BLAT. This is taken from UCSC browser.
>>>
>>> The idea is to parse it and then extract the conserved block
>>> of aligned sequences.
>>>
>>>
>>> __DATA__
>>> Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
>>> B D   D. melanogaster
>>> tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
>>> B D       D. simulans
>>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
>>> B D      D. sechellia
>>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
>>> B D         D. yakuba
>>> tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
>>>           D. erecta
>>> tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
>>>        D. ananassae
>>> taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
>>>    D. pseudoobscura
>>> tata----ccagtacac-cttatatg------------tttttaaata--------------------
>>> B D     D. persimilis
>>> tata----ccagtacac-attatatg------------tttttaaata--------------------
>>>       D. willistoni
>>> aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
>>>          D. virilis
>>> -------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
>>>       D. mojavensis
>>> -------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
>>>        D. grimshawi
>>> ====================================================================
>>>        T. castaneum
>>> ====================================================================
>>>
>>> Inserts between block 3 and 4 in window
>>>   D. pseudoobscura 2008bp
>>> B D    D. persimilis 1421bp
>>>         D. virilis 5bp
>>>      D. mojavensis 4640bp
>>>
>>> Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
>>> B D   D. melanogaster
>>> ----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
>>> B D       D. simulans
>>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>>> B D      D. sechellia
>>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>>> B D         D. yakuba
>>> ----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
>>>           D. erecta
>>> ----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
>>>    D. pseudoobscura
>>> ====================================================================
>>> B D     D. persimilis
>>> ====================================================================
>>>       D. willistoni
>>> ----aggattacgaagttcctttat-------------------aaag--------------------
>>>          D. virilis
>>> gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
>>>       D. mojavensis
>>> ====================================================================
>>>        D. grimshawi
>>> ====================================================================
>>>        T. castaneum
>>> ====================================================================
>>>
>>> __ END__
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bioperlanand at yahoo.com  Wed Apr 23 02:02:30 2008
From: bioperlanand at yahoo.com (Anand Venkatraman)
Date: Tue, 22 Apr 2008 23:02:30 -0700 (PDT)
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
Message-ID: <946658.12337.qm@web36802.mail.mud.yahoo.com>

Hi everybody,

I would like to use Bio::Graphics together with Bio::SearchIO::Writer::HTMLResultWriter to produce an HTML-formatted BLAST report that includes an image of the BLAST hits, as shown on slide 60 of this PDF: http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf

I can already produce the HTML output with Bio::SearchIO::Writer::HTMLResultWriter, and I can produce the image by following the examples in the Bio::Graphics HOWTO: http://www.bioperl.org/wiki/HOWTO:Graphics

My question: how do I integrate Bio::Graphics with Bio::SearchIO::Writer::HTMLResultWriter so that the BLAST hits image is rendered at the correct position in my BioPerl-reformatted HTML file?

I also found that someone asked a similar question before; it is listed under the "Orphans, Leftovers" category in the ListSummary:April 26-May 9,2006 document:
http://www.bioperl.org/wiki/ListSummary:April_26-May_9%2C2006#Orphans.2C_Leftovers

Here is my code so far:
----------------------------------------------------------------
#!/usr/bin/perl -w
# usage: $0 <blast_report>
use strict;
use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;

my $infile = shift or die "usage: $0 <blast_report>\n";

my $searchio   = Bio::SearchIO->new(-format => 'blast', -file => $infile);
my $writerhtml = Bio::SearchIO::Writer::HTMLResultWriter->new();
my $outhtml    = Bio::SearchIO->new(-writer => $writerhtml,
                                    -file   => ">${infile}.html");

# write the first result in the report as HTML
$outhtml->write_result($searchio->next_result);
----------------------------------------------------------------

Thanks in advance,

Anand

       


From jason at bioperl.org  Wed Apr 23 02:15:28 2008
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Apr 2008 23:15:28 -0700
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
In-Reply-To: <946658.12337.qm@web36802.mail.mud.yahoo.com>
References: <946658.12337.qm@web36802.mail.mud.yahoo.com>
Message-ID: <952B0A4E-8A14-4E8E-B36D-14596B20E330@bioperl.org>


Basically you want to inject your own IMG tags into the file with  
these routines:

     $writerhtml->start_report(\&my_start_report);
     $writerhtml->title(\&my_title);
     $writerhtml->hit_link_align(\&my_hit_link_align);
     $writerhtml->hit_link_desc(\&my_hit_link_desc);

fgblast shows a way to do this in part. It relies on Gbrowse to
generate the image, but you can replace the gbrowse_img reference with
your own image-generating software.

http://people.genome.duke.edu/~jes12/software/scripts/fgblast
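
A minimal sketch of one such routine, here a replacement for the title
callback. It assumes the writer calls the routine with the result object
and prints the string it returns, and that an overview image named after
the query (a hypothetical naming scheme, not something the writer
provides) has already been written next to the HTML file:

```perl
use strict;
use warnings;

# Hypothetical title callback: it is assumed to receive the
# Bio::Search::Result object and to return the HTML for the title.
sub my_title {
    my ($result) = @_;
    my $query = $result->query_name;

    # Assumption: the overview PNG was written beforehand as "<query>.png",
    # with characters that are awkward in file names replaced by "_".
    (my $img = $query) =~ s/[^\w.-]/_/g;

    return sprintf(
        qq{<h1>%s report for %s</h1>\n<img src="%s.png" alt="hit overview">\n},
        $result->algorithm, $query, $img
    );
}
```

It would be registered with $writerhtml->title(\&my_title) before
write_result is called.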

-jason
On Apr 22, 2008, at 11:02 PM, Anand Venkatraman wrote:

> Hi everybody,
>
> I would like to use Bio::Graphics in conjunction with  
> Bio::SearchIO::Writer::HTMLResultWriter to obtain a HTML formatted  
> blast report output along with an image of the blast hits as shown  
> on Slide 60 in this pdf:
> http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf
>
> I am able to get the HTML output using   
> "Bio::SearchIO::Writer::HTMLResultWriter" and I am able to get the  
> image using the examples outlined in the Bio::Graphics HOWTO:  
> http://www.bioperl.org/wiki/HOWTO:Graphics
>
> My question: How do I integrate Bio::Graphics with  
> Bio::SearchIO::Writer::HTMLResultWriter to render the blast hits  
> image at the correct position in my BioPerl reformatted html file.
>
> I also found that someone else has asked something similar to  
> whatever I am asking & is listed under the "Orphans, Leftovers"  
> category in the ListSummary:April 26-May 9,2006 document:
> http://www.bioperl.org/wiki/ListSummary:April_26-May_9%2C2006#Orphans.2C_Leftovers
>
> Here is my code so far:
> ----------------------------------------------------------------
> #!/usr/bin/perl -w
> # usage: $0 <blast_report>
> use strict;
> use Bio::SearchIO;
> use Bio::SearchIO::Writer::HTMLResultWriter;
>
> my $infile = shift or die $!;
>
> my $searchio = new Bio::SearchIO( -format => 'blast',-file   => $infile);
> my $writerhtml = new Bio::SearchIO::Writer::HTMLResultWriter();
> my $outhtml = new Bio::SearchIO(-writer => $writerhtml,
>                                                   -file   => ">${infile}.html");
>
> $outhtml->write_result($searchio->next_result);
> ----------------------------------------------------------------
>
> Thanks in advance,
>
> Anand
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From bamboowarrior at gmail.com  Wed Apr 23 15:39:21 2008
From: bamboowarrior at gmail.com (Arkady)
Date: Wed, 23 Apr 2008 14:39:21 -0500
Subject: [Bioperl-l] WebBlat, where'd it go?
Message-ID: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>

Hi folks,

I'm trying to use BioPerl to run a BLAT search on the four primate
genomes on UCSC. I understand that the proper tool for this is
Bio::Tools::WebBlat. Unfortunately, it doesn't appear to be in my
bioperl distribution (nor do I even know how to figure out what
version that is, though it's a very recent install, about a month
old). I also can't find it on CPAN. Is this deprecated? Has
something else replaced it? Or are we always supposed to run local
BLAT?

Thanks.

John Woods

Institute for Cellular and Molecular Biology
The University of Texas at Austin


From spiros at lokku.com  Wed Apr 23 15:48:12 2008
From: spiros at lokku.com (Spiros Denaxas)
Date: Wed, 23 Apr 2008 20:48:12 +0100
Subject: [Bioperl-l] WebBlat, where'd it go?
In-Reply-To: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
References: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
Message-ID: <bba689ec0804231248s47034503y3cbf0512e4344843@mail.gmail.com>

Hey,

a quick look at the list of deprecated modules reveals that it has
indeed been removed,

http://www.bioperl.org/wiki/Deprecated_modules

Spiros

On Wed, Apr 23, 2008 at 8:39 PM, Arkady <bamboowarrior at gmail.com> wrote:
> Hi folks,
>
>  I'm trying to use BioPerl to run a BLAT search on the four primate
>  genomes on UCSC. I understand that the proper tool for this is
>  Bio::Tools::WebBlat. Unfortunately, it doesn't appear to be in my
>  bioperl distribution (nor do I even know how to figure out what
>  version that is, unfortunately, though it's a very recent install -- a
>  month ago?). I also can't find it on CPAN. Is this deprecated? Has
>  something else replaced it? Or are we always supposed to run local
>  BLAT?
>
>  Thanks.
>
>  John Woods
>
>  Institute for Cellular and Molecular Biology
>  The University of Texas at Austin
>  _______________________________________________
>  Bioperl-l mailing list
>  Bioperl-l at lists.open-bio.org
>  http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From cjfields at uiuc.edu  Wed Apr 23 15:56:14 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 23 Apr 2008 14:56:14 -0500
Subject: [Bioperl-l] WebBlat, where'd it go?
In-Reply-To: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
References: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
Message-ID: <AF7BBBC2-6A6E-486A-872C-8BB8B0A7FC0C@uiuc.edu>

It's no longer maintained (deprecated); see the following for an  
explanation:

http://article.gmane.org/gmane.comp.lang.perl.bio.general/13545

Basically, only local BLAT searches are supported through BioPerl.

chris

On Apr 23, 2008, at 2:39 PM, Arkady wrote:

> Hi folks,
>
> I'm trying to use BioPerl to run a BLAT search on the four primate
> genomes on UCSC. I understand that the proper tool for this is
> Bio::Tools::WebBlat. Unfortunately, it doesn't appear to be in my
> bioperl distribution (nor do I even know how to figure out what
> version that is, unfortunately, though it's a very recent install -- a
> month ago?). I also can't find it on CPAN. Is this deprecated? Has
> something else replaced it? Or are we always supposed to run local
> BLAT?
>
> Thanks.
>
> John Woods
>
> Institute for Cellular and Molecular Biology
> The University of Texas at Austin
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bioperlanand at yahoo.com  Wed Apr 23 19:05:27 2008
From: bioperlanand at yahoo.com (Anand Venkatraman)
Date: Wed, 23 Apr 2008 16:05:27 -0700 (PDT)
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
In-Reply-To: <952B0A4E-8A14-4E8E-B36D-14596B20E330@bioperl.org>
Message-ID: <795696.39415.qm@web36804.mail.mud.yahoo.com>

Hi Jason,

Thanks for the reply.

I am a little lost with the suggested solution. Is that how slide 60 in the PDF was produced? http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf

I guess I am missing something quite obvious; I apologize.

What I have and want is this: I have a directory containing, say, 100 different BLAST reports, and I would like to obtain 100 BioPerl-formatted BLAST HTML outputs, each with its own image, just as it would appear in the BLAST report.

Thanks,

Anand

Jason Stajich <jason at bioperl.org> wrote: 

Basically you want to inject your own IMG tags into the file with these routines:


    $writerhtml->start_report(\&my_start_report);
    $writerhtml->title(\&my_title);
    $writerhtml->hit_link_align(\&my_hit_link_align);
    $writerhtml->hit_link_desc(\&my_hit_link_desc);


fgblast shows a way to do this in part. It relies on Gbrowse to generate the image but you can replace the gbrowse_img reference to your own image generating software.
http://people.genome.duke.edu/~jes12/software/scripts/fgblast


-jason
On Apr 22, 2008, at 11:02 PM, Anand Venkatraman wrote:

Hi everybody,


I would like to use Bio::Graphics in conjunction with Bio::SearchIO::Writer::HTMLResultWriter to obtain a HTML formatted blast report output along with an image of the blast hits as shown on Slide 60 in this pdf: http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf


I am able to get the HTML output using  "Bio::SearchIO::Writer::HTMLResultWriter" and I am able to get the image using the examples outlined in the Bio::Graphics HOWTO: http://www.bioperl.org/wiki/HOWTO:Graphics


My question: How do I integrate Bio::Graphics with Bio::SearchIO::Writer::HTMLResultWriter to render the blast hits image at the correct position in my BioPerl reformatted html file.


I also found that someone else has asked something similar to whatever I am asking & is listed under the "Orphans, Leftovers" category in the ListSummary:April 26-May 9,2006 document: 
http://www.bioperl.org/wiki/ListSummary:April_26-May_9%2C2006#Orphans.2C_Leftovers


Here is my code so far:
----------------------------------------------------------------
#!/usr/bin/perl -w
# usage: $0 <blast_report>
use strict;
use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;


my $infile = shift or die $!;


my $searchio = new Bio::SearchIO( -format => 'blast',-file   => $infile);
my $writerhtml = new Bio::SearchIO::Writer::HTMLResultWriter();
my $outhtml = new Bio::SearchIO(-writer => $writerhtml,
                                                  -file   => ">${infile}.html");


$outhtml->write_result($searchio->next_result);
----------------------------------------------------------------


Thanks in advance,


Anand


_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l
 



From jason at bioperl.org  Thu Apr 24 14:06:41 2008
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 24 Apr 2008 11:06:41 -0700
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
In-Reply-To: <795696.39415.qm@web36804.mail.mud.yahoo.com>
References: <795696.39415.qm@web36804.mail.mud.yahoo.com>
Message-ID: <D47EBDB9-C15C-44A7-9376-89FA946270DD@bioperl.org>

The overview graphic is basically generated by the script in
scripts/graphics/search_overview.PLS

So you'd have to run that on each report to generate the graphic,
then use the callback methods from my earlier message (start_report,
title, ...) to insert <img src="NAME"> tags into each rendered HTML report.
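
A rough sketch of the per-report bookkeeping for a directory of reports.
The directory name, the .blast extension and the output naming are all
assumptions made for illustration; the two commented steps stand for the
image generation and HTML writing described above:

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# For one BLAST report path, derive the (hypothetical) names of the HTML
# report and the PNG overview that will sit next to each other.
sub output_names {
    my ($report) = @_;
    my $stem = basename($report);
    $stem =~ s/\.[^.]+$//;    # drop a trailing extension, if any
    return ("$stem.html", "$stem.png");
}

# Assumption: the reports live in ./reports and end in .blast
for my $report (sort glob 'reports/*.blast') {
    my ($html, $png) = output_names($report);
    # 1. render the overview image for $report into $png
    # 2. write the HTML report to $html, with a callback that emits
    #    <img src="$png">
    print "$report -> $html, $png\n";
}
```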

-jason

On Apr 23, 2008, at 4:05 PM, Anand Venkatraman wrote:

> Hi Jason,
>
> Thanks for the reply.
>
> I am a little lost with the solution suggested. Is that how slide  
> 60 in the pdf is obtained:
> http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf
>
> I guess I am missing something quite obvious, I apologize.
>
> What I have & want is this: I have a directory having say 100  
> different blast reports & hence I am looking to obtain 100  
> different bioperl formatted blast html outputs with the respective  
> images just as it would appear in the blast report.
>
> Thanks,
>
> Anand
>
> Jason Stajich <jason at bioperl.org> wrote:
>
> Basically you want to inject your own IMG tags into the file with  
> these routines:
>
>
>     $writerhtml->start_report(\&my_start_report);
>     $writerhtml->title(\&my_title);
>     $writerhtml->hit_link_align(\&my_hit_link_align);
>     $writerhtml->hit_link_desc(\&my_hit_link_desc);
>
>
> fgblast shows a way to do this in part. It relies on Gbrowse to  
> generate the image but you can replace the gbrowse_img reference to  
> your own image generating software.
> http://people.genome.duke.edu/~jes12/software/scripts/fgblast
>
>
>
>
> -jason
> On Apr 22, 2008, at 11:02 PM, Anand Venkatraman wrote:
>
> Hi everybody,
>
>
> I would like to use Bio::Graphics in conjunction with  
> Bio::SearchIO::Writer::HTMLResultWriter to obtain a HTML formatted  
> blast report output along with an image of the blast hits as shown  
> on Slide 60 in this pdf:
> http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf
>
>
> I am able to get the HTML output using   
> "Bio::SearchIO::Writer::HTMLResultWriter" and I am able to get the  
> image using the examples outlined in the Bio::Graphics HOWTO:  
> http://www.bioperl.org/wiki/HOWTO:Graphics
>
>
> My question: How do I integrate Bio::Graphics with  
> Bio::SearchIO::Writer::HTMLResultWriter to render the blast hits  
> image at the correct position in my BioPerl reformatted html file.
>
>
> I also found that someone else has asked something similar to  
> whatever I am asking & is listed under the "Orphans, Leftovers"  
> category in the ListSummary:April 26-May 9,2006 document:
> http://www.bioperl.org/wiki/ListSummary:April_26-May_9%2C2006#Orphans.2C_Leftovers
>
>
> Here is my code so far:
> ----------------------------------------------------------------
> #!/usr/bin/perl -w
> # usage: $0 <blast_report>
> use strict;
> use Bio::SearchIO;
> use Bio::SearchIO::Writer::HTMLResultWriter;
>
>
> my $infile = shift or die $!;
>
>
> my $searchio = new Bio::SearchIO( -format => 'blast',-file   => $infile);
> my $writerhtml = new Bio::SearchIO::Writer::HTMLResultWriter();
> my $outhtml = new Bio::SearchIO(-writer => $writerhtml,
>                                                   -file   => ">${infile}.html");
>
>
> $outhtml->write_result($searchio->next_result);
> ----------------------------------------------------------------
>
>
> Thanks in advance,
>
>
> Anand
>
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
>
>
>
>


From 1zoujing at 163.com  Wed Apr 16 22:53:16 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 16 Apr 2008 19:53:16 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <Pine.WNT.4.64.0804111600310.2384@A161887.one.ads.bms.com>
References: <16602770.post@talk.nabble.com> <16603225.post@talk.nabble.com>
	<Pine.WNT.4.64.0804111600310.2384@A161887.one.ads.bms.com>
Message-ID: <16737795.post@talk.nabble.com>


    Thank you very much!
I split the file on \t directly.

   Zou Jing
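
A minimal sketch of that approach in plain Perl. It assumes the first
three columns of gene_info are tax_id, GeneID and Symbol (check the
header line of your own download), and the sub name is just illustrative:

```perl
use strict;
use warnings;

# Assumed layout: the first three gene_info columns are tax_id, GeneID
# and Symbol. Adjust this list to match the header of your file.
my @COLUMNS = qw(tax_id GeneID Symbol);

# Turn one tab-delimited line into a hash ref. Header/comment lines
# (starting with '#') and blank lines give undef.
sub parse_gene_info_line {
    my ($line) = @_;
    chomp $line;
    return if $line =~ /^#/ or $line !~ /\S/;
    my @fields = split /\t/, $line, -1;
    my %record;
    @record{@COLUMNS} = @fields[0 .. $#COLUMNS];
    return \%record;
}

# Usage: perl this_script.pl gene_info   (an already unzipped file)
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
    while (my $line = <$fh>) {
        my $record = parse_gene_info_line($line) or next;
        print join("\t", map { defined $_ ? $_ : '' } @{$record}{@COLUMNS}), "\n";
    }
    close $fh;
}
```

Reading line by line keeps memory use flat even for a file of several
hundred MB.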


Stefan Kirov-2 wrote:
> 
> It is not. If you use this file, why would you need a parser for it 
> anyway? Just split on \t or read with OpenOffice or equiv.
> Stefan
> 
> On Thu, 10 Apr 2008, zoujing wrote:
> 
>>
>> Seached  the web and found the answer now, quote the answer as following:
>>   The error was thrown by my Bio::ASN1::EntrezGene module because it
>> expects a text file, while you fed it with a binary file.  To use
>> gzipped ASN binary file from NCBI, download the NCBI gene2xml
>> (ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml),
>> then use this syntax to run my parser on the binary files:
>>
>> my $parser = Bio::ASN1::EntrezGene->new('file' => "gene2xml -i
>> Homo_sapiens.ags.gz -c -x -b | "); # Homo_sapiens.ags.gz is the gzipped
>> binary file directly downloaded from NCBI
>>
>> Same syntax should be used when you're using SeqIO (thus
>> SeqIO::entrezgene).
>> Mingyi
>>
>>   But there still one thing, I want to parse "gene_info.gz" in Gene of
>> NCBI. It doesn't work.Is that means "gene_info.gz"( tab-delimited,one
>> line
>> per GeneID, Column header line is the first line in the file
>> ) is not the right format for Bio::ASN1::EntrezGene?
>>
>>
>>
>> zoujing wrote:
>>>
>>>    I am a geen hand in Bioperl. When I run perl with
>>> "parse_entrez_gene_example.pl Sus_scrofa.ags", it turned out the error
>>> information:
>>>      Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>>>
>>>    But the Sus_scrofa.ags is download from NCBI, with the format of
>>> ASN1,
>>> should be the same as Homo_sapiens in the example. So it should be no
>>> error as the code is the example from Mingyi.
>>>    I wonder why this happen, and should I change something about the
>>> file?
>>>
>>>
>>
>> -- 
>> View this message in context:
>> http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16603225.html
>> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 

-- 
View this message in context: http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16737795.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From 1zoujing at 163.com  Wed Apr 16 22:55:47 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 16 Apr 2008 19:55:47 -0700 (PDT)
Subject: [Bioperl-l] Bio::ASN1::EntrezGene parse so slowly?
In-Reply-To: <264855a00804112050gf785c2ei66d9c7463597eccd@mail.gmail.com>
References: <16602210.post@talk.nabble.com>
	<264855a00804112050gf785c2ei66d9c7463597eccd@mail.gmail.com>
Message-ID: <16737804.post@talk.nabble.com>


Thank you very much!
  The problem is solved now.

   Jing

Sean Davis-3 wrote:
> 
> gene_info is a tab-delimited text file, if I recall correctly.  Have
> you looked at it?  If it is, you should be able to parse it in a few
> seconds with just a couple lines of code.
> 
> Sean
> 
> 
> On Thu, Apr 10, 2008 at 1:08 AM, zoujing <1zoujing at 163.com> wrote:
>>
>>   I want to parse a file "gene_info" from NCBI. The format of Gene in
>> NCBI is
>>  ASN1, right? So I used Bio::ASN1::EntrezGene. But it didn't work
>>  properly/too slow. The file is about 500M.
>>   The code is following:
>>   use Bio::ASN1::EntrezGene;
>>   my $parser = Bio::ASN1::EntrezGene->new('file' => $ARGV[0]);
>>   my $i = 0;
>>   while(my $result = $parser->next_seq)
>>   { last; #something to do there, here use last for test}
>>
>>   When it goes to the "while" part, it is processing on and on, it does
>> not
>>  went out, even I used "last" in the "while" part.
>>    So I wonder whether it is too slow or the module is not fit for this
>> job,
>>  or I did something wrong?
>>
>>   Thank you!
>>  --
>>  View this message in context:
>> http://www.nabble.com/Bio%3A%3AASN1%3A%3AEntrezGene-parse-so-slowly--tp16602210p16602210.html
>>  Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>>
>>  _______________________________________________
>>  Bioperl-l mailing list
>>  Bioperl-l at lists.open-bio.org
>>  http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 

-- 
View this message in context: http://www.nabble.com/Bio%3A%3AASN1%3A%3AEntrezGene-parse-so-slowly--tp16602210p16737804.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From sbassi at clubdelarazon.org  Sat Apr 26 13:49:20 2008
From: sbassi at clubdelarazon.org (Sebastian Bassi)
Date: Sat, 26 Apr 2008 14:49:20 -0300
Subject: [Bioperl-l] bioperl installation problem
Message-ID: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>

I tried to install bioperl because I need to install cviewer.
Both the stdout and stderr outputs are here: http://www.pastecode.com.ar/f37c1cd60

Here is one of the errors I get:

set_attribute: not a compat02 graph at
/usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN0> line 10.
sleeping for 3 seconds
set_attribute: not a compat02 graph at
/usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN1> line 14.

But I have GD::Graph, so I don't know what is going on:

sbassi at ubuntuMAP:~$ sudo perl -MCPAN -e 'install GD::Graph'
CPAN: Storable loaded ok
Going to read /home/sbassi/.cpan/Metadata
  Database was generated on Fri, 25 Apr 2008 09:29:45 GMT
GD::Graph is up to date.

Any help regarding this: http://www.pastecode.com.ar/f37c1cd60
would be appreciated.

Best,
SB.

-- 
Sebastián Bassi. Diploma in Science and Technology.
Course, Molecular biology for programmers: http://tinyurl.com/2vv8w6
Show your code: http://www.pastecode.com.ar
GPG Fingerprint: 9470 0980 620D ABFC BE63 A4A4 A3DE C97D 8422 D43D


From jason at bioperl.org  Sat Apr 26 15:23:37 2008
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 26 Apr 2008 12:23:37 -0700
Subject: [Bioperl-l] bioperl installation problem
In-Reply-To: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
References: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
Message-ID: <B07E3ABC-FA71-4AEA-8802-29F1C3023BAE@bioperl.org>

The error refers to the 'Graph' module, not 'GD::Graph'.

-jason
On Apr 26, 2008, at 10:49 AM, Sebastian Bassi wrote:

> I tried to install bioperl because I need to install cviewer.
> Here (http://www.pastecode.com.ar/f37c1cd60) are both stdout and  
> sdterr outputs.
>
> Here is one of the errors I get:
>
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN0> line 10.
> sleeping for 3 seconds
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN1> line 14.
>
> But I have GD::Graph, so I don't know what is going on:
>
> sbassi at ubuntuMAP:~$ sudo perl -MCPAN -e 'install GD::Graph'
> CPAN: Storable loaded ok
> Going to read /home/sbassi/.cpan/Metadata
>   Database was generated on Fri, 25 Apr 2008 09:29:45 GMT
> GD::Graph is up to date.
>
> Any help regarding this: http://www.pastecode.com.ar/f37c1cd60
> would be appreciated.
>
> Best,
> SB.
>
> -- 
> Sebastián Bassi. Diploma in Science and Technology.
> Course, Molecular biology for programmers: http://tinyurl.com/2vv8w6
> Show your code: http://www.pastecode.com.ar
> GPG Fingerprint: 9470 0980 620D ABFC BE63 A4A4 A3DE C97D 8422 D43D
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From sbassi at clubdelarazon.org  Sat Apr 26 17:08:13 2008
From: sbassi at clubdelarazon.org (Sebastian Bassi)
Date: Sat, 26 Apr 2008 18:08:13 -0300
Subject: [Bioperl-l] bioperl installation problem
In-Reply-To: <B07E3ABC-FA71-4AEA-8802-29F1C3023BAE@bioperl.org>
References: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
	<B07E3ABC-FA71-4AEA-8802-29F1C3023BAE@bioperl.org>
Message-ID: <9e2f512b0804261408l45ff9f91j94f44065d21cd65f@mail.gmail.com>

On Sat, Apr 26, 2008 at 4:23 PM, Jason Stajich <jason at bioperl.org> wrote:
> the error refers to the 'Graph' module not 'GD::Graph';

You are right, but I also have it installed:

sbassi at ubuntuMAP:~$ sudo perl -MCPAN -e 'install Graph'
Password:
CPAN: Storable loaded ok
Going to read /home/sbassi/.cpan/Metadata
  Database was generated on Fri, 25 Apr 2008 09:29:45 GMT
Graph is up to date.


-- 
Sebastián Bassi. Diploma in Science and Technology.
Course, Molecular biology for programmers: http://tinyurl.com/2vv8w6
Show your code: http://www.pastecode.com.ar
GPG Fingerprint: 9470 0980 620D ABFC BE63 A4A4 A3DE C97D 8422 D43D


From bix at sendu.me.uk  Sat Apr 26 19:30:56 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Sun, 27 Apr 2008 00:30:56 +0100
Subject: [Bioperl-l] bioperl installation problem
In-Reply-To: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
References: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
Message-ID: <4813BB30.6060703@sendu.me.uk>

Sebastian Bassi wrote:
> I tried to install bioperl because I need to install cviewer.
> Here (http://www.pastecode.com.ar/f37c1cd60) are both stdout and sdterr outputs.
> 
> Here is one of the errors I get:
> 
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN0> line 10.
> sleeping for 3 seconds
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN1> line 14.

You're trying to install a very old version of Bioperl, which apparently
relies on behaviour of the Graph module that is no longer supported:
http://search.cpan.org/~jhi/Graph-0.84/lib/Graph.pod#Backward_compatibility_with_Graph_0.2

Your options are to force-install your desired version of Bioperl (if
you don't need the modules that are causing these errors), downgrade
Graph to a 0.2x release (the API that old Bioperl code expects), or
install the latest version of Bioperl (1.5.2 or from svn).
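
To decide between these options it helps to know which versions are
actually installed. A minimal sketch, assuming Bio::Root::Version is the
module that carries the BioPerl release number; installed_version is an
illustrative helper, not part of either distribution:

```perl
use strict;
use warnings;

# Return the $VERSION of an installed module, or undef if the module
# cannot be loaded (or declares no version).
sub installed_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return unless eval { require $file; 1 };
    return $module->VERSION;
}

for my $module (qw(Graph Bio::Root::Version)) {
    my $version = installed_version($module);
    printf "%-20s %s\n", $module,
        defined $version ? $version : 'not installed';
}
```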


From dr.hogart at gmail.com  Sun Apr 27 10:05:20 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Sun, 27 Apr 2008 18:05:20 +0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
Message-ID: <op.t99vyoejavnppr@hogart.hackers>

Hi all,

Is it possible to add a GD::Graph object (a chart) to a Bio::Graphics
panel, so as to obtain a single image file showing both the chart and
the bioseq object?


From Russell.Smithies at agresearch.co.nz  Sun Apr 27 17:27:23 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 28 Apr 2008 09:27:23 +1200
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <op.t99vyoejavnppr@hogart.hackers>
References: <op.t99vyoejavnppr@hogart.hackers>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>

You can get the GD object back from the Bio::Graphics::Panel  then draw
on it using GD methods

Eg:

use strict;
use warnings;
use GD;                        # exports gdSmallFont
use Bio::Graphics;
use Bio::SeqFeature::Generic;

# create a BioPerl panel
my $panel = Bio::Graphics::Panel->new(
                    -length   => 600,
                    -width    => 800,
                    -bgcolor  => 'white',
                    );
# add your features
my $feature = Bio::SeqFeature::Generic->new(-start => 1, -end => 200);
$panel->add_track($feature,
                    -glyph    => 'segments',
                    -label    => 0,
                    -height   => 30,
                    -bgcolor  => 'red',
                    -fgcolor  => 'red',
                    );

# grab the GD thingy
my $gd = $panel->gd;

# create a color - not sure if there's a better way?
my $black = $gd->colorAllocate(0,0,0);

# draw on your GD thingy
$gd->line(10, 10, $panel->width - 10, 10, $black);
$gd->string(gdSmallFont, 20, 10, 'test', $black);

# print it as normal
print $panel->png;
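
The same handle can also be combined with a second GD image, for
instance a chart that was rendered separately. A minimal sketch, assuming
both pieces are available as GD::Image objects; stack_images is a
hypothetical helper, not part of GD or Bio::Graphics:

```perl
use strict;
use warnings;
use GD;

# Stack two GD::Image objects vertically on a new white truecolor canvas.
sub stack_images {
    my ($top, $bottom) = @_;
    my ($tw, $th) = $top->getBounds;
    my ($bw, $bh) = $bottom->getBounds;
    my $width = $tw > $bw ? $tw : $bw;

    my $out   = GD::Image->new($width, $th + $bh, 1);    # 1 = truecolor
    my $white = $out->colorAllocate(255, 255, 255);
    $out->filledRectangle(0, 0, $width - 1, $th + $bh - 1, $white);

    # copy(source, dstX, dstY, srcX, srcY, width, height)
    $out->copy($top,    0, 0,   0, 0, $tw, $th);
    $out->copy($bottom, 0, $th, 0, 0, $bw, $bh);
    return $out;
}
```

With the panel above this would be called as
stack_images($panel->gd, $chart_image), followed by printing the png
of the returned image.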


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of sergei ryazansky
> Sent: Monday, 28 April 2008 2:05 a.m.
> To: bioperl-l at bioperl.org
> Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
> 
> Hi all,
> 
> is it possible to add a GD::graphic object (chart) to Bio::Graphics panel
> to obtain a file with image of both the chart and bioseq object?
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From dr.hogart at gmail.com  Sun Apr 27 20:25:18 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Mon, 28 Apr 2008 04:25:18 +0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
Message-ID: <op.uaaosgoeavnppr@hogart.hackers>

Thanks for answer!
Yours  script works fine, but nevertheless, as for as I understand 'gd'  
method return the gd::image object. But I need the to merge bioseq object  
with gd::graph object (gd::graph::area). Is it possible? Or maybe I  
misunderstood something in your example?


On Mon, 28 Apr 2008 01:27:23 +0400, Smithies, Russell  
<Russell.Smithies at agresearch.co.nz> wrote:

> You can get the GD object back from the Bio::Graphics::Panel  then draw
> on it using GD methods
>
> Eg:
>
> #create a BioPerl panel
> my $panel = Bio::Graphics::Panel->new(
>                                       -length   => 600,
>                                       -width    => 800,
>                                       -bgcolor  => 'white'
>                                      );
> # add your features
> my $feature = Bio::SeqFeature::Generic->new( -start => 1,-end   =>
> 200,);
> $panel->add_track($feature, glyph   =>   'segments',
> 					-label   =>   0,
> 					-height  =>   30,
> 					-bgcolor  =>  'red',
> 					-fgcolor  => 'red'
> 					 );
>
> # grab the GD thingy
> my $gd = $panel->gd;
>
> #create a color - not sure if there's a better way?
> $black = $gd->colorAllocate(0,0,0);
>
> #draw on your GD thingy
> $gd->line(10,10,$panel->width -10,10,$black);
> $gd->string(gdSmallFont,20,10,'test',$black);
>
> # print it as normal	
> print $panel->png;
>
>
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-
>> bio.org] On Behalf Of sergei ryazansky
>> Sent: Monday, 28 April 2008 2:05 a.m.
>> To: bioperl-l at bioperl.org
>> Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
>>
>> Hi all,
>>
>> is it possible to add a GD::graphic object (chart) to Bio::Graphics
> panel
>> to obtain a file with image of both the chart and bioseq object?
>>


From Bank.Beszteri at awi.de  Mon Apr 28 08:18:20 2008
From: Bank.Beszteri at awi.de (=?UTF-8?B?QsOhbmsgQmVzenRlcmk=?=)
Date: Mon, 28 Apr 2008 14:18:20 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47FB204F.90405@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de>
Message-ID: <4815C08C.1060305@awi.de>

Dear BioSQL / bioperl-db-ists,

I would like to share my experiences with trying to load uniprot_trembl 
into a BioSQL db, and also to ask a couple of questions; perhaps some of 
you know the problems I encountered. I used bioperl-live and 
bioperl-db-live as of 2008-04-03 and uniprot_trembl.dat as of 
2008-04-04. The command was like

load_seqdatabase.pl --safe --logchunk 1000 --host dbserv --dbname abc 
--dbuser efg --dbpass xyz --driver mysql --namespace uniprot_trembl 
--format embl uniprot_trembl.dat

although I split the dat file into 10 chunks and ran them in parallel 
to make it faster. This did not go quite as smoothly as Swissprot did. 
In the end, it seems to have loaded 5022284 of the 5443284 entries that 
appear to be in the input file (counted with grep -c "ID   ").
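
A note on the count: grep -c "ID   " counts every line that contains that
string anywhere, so a line where "ID   " happens to occur later in the text
would inflate the total. A minimal Perl sketch of a stricter count, assuming
each entry starts with a line that begins with "ID" followed by three spaces
(the sub name count_entries and the sample text are made up for illustration):

```perl
use strict;
use warnings;

# Count flat-file entries by counting lines that START with "ID   ".
# Takes an open filehandle; returns the number of matching lines.
sub count_entries {
    my ($fh) = @_;
    my $n = 0;
    while (my $line = <$fh>) {
        $n++ if $line =~ /^ID {3}/;
    }
    return $n;
}

# Tiny made-up sample with two entries (not real UniProt data).
my $sample = <<'END';
ID   TEST1_HUMAN   Unreviewed;   10 AA.
AC   X00001;
CC   a comment line mentioning ID   in the text
//
ID   TEST2_HUMAN   Unreviewed;   12 AA.
AC   X00002;
//
END

open(my $fh, '<', \$sample) or die "Cannot open in-memory sample: $!";
my $count = count_entries($fh);
close($fh);
print "entries: $count\n";
```

For a real file, the in-memory handle would be replaced by one opened on the
.dat file itself.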

Besides the harmless taxonomy warnings which also appear with Swissprot 
(and were discussed here a couple of weeks ago and also 
earlier), there were a couple of more serious errors. Perhaps some of 
you know them already:

First of all, the error below seems to lead to a crash, in spite of --safe:

 >>>
------------- EXCEPTION -------------
MSG: A1XDT7 seems to have an invalid species classification.
STACK Bio::SeqIO::embl::_read_EMBL_Species 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:1087
STACK Bio::SeqIO::embl::next_seq 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:320
STACK toplevel 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:634
-------------------------------------

Command exited with non-zero status 255
<<<

What this is about is NCBI Tax_ID:435 (Acetobacter aceti; it has some 30 
synonyms in my DB, too), which, to me, looks like a completely normal 
taxon: I could follow its taxonomy up to the root in the NCBI taxonomy in 
the BioSQL DB I used. I don't know whether someone else has seen or can 
reproduce the problem, or whether I should suspect a problem with my 
taxonomy db. Besides, is it the expected behaviour of 
load_seqdatabase.pl to die upon this error?

###################

The other problems did not lead to a crash, only to a failure to load 
the sequence, which is what I'd expect with --safe. The first type 
of error looks like

 >>>
Could not store Q49I36:
------------- EXCEPTION -------------
MSG: Unique key query in Bio::DB::BioSQL::SpeciesAdaptor returned 2 rows 
instead of 1. Query was [name_class="scientific 
name",binomial="Onchocerca volvulus"]
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:958
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:854
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182
STACK Bio::DB::Persistent::PersistentObject::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:244
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
STACK Bio::DB::Persistent::PersistentObject::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:271
STACK (eval) 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:630
STACK toplevel 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:612
-------------------------------------
<<<

In this particular case, "Onchocerca volvulus" does indeed have two 
taxon_ids in my DB (6282 and 563188, of which only the first one is 
returned by a web search at NCBI taxonomy); but the same thing happened 
with a number of other taxa (each followed by the number of times it 
caused the above error):

Wolbachia pipientis           64
Hemerocallis sp.               1
Hypsiglena torquata            3
Salmonella enterica         1211
Burkholderia sp.              31
Streptococcus sp.              4
Rhizobium sp.                600
Nostoc sp.                    19
Drosophila sp.                18
Onchocerca volvulus           62
Atlapetes schistaceus          4
Symbiodinium sp.               3
Escherichia coli            7421
Hieraaetus fasciatus           4
Borrelia burgdorferi group     1
Pseudomonas sp.               29
Rotavirus A                 1076
Gorilla gorilla              746
Rana plancyi                  14
unclassified sequences         1

(This should be 11312 cases altogether, but the list might be incomplete 
because I accidentally removed one of my logs, which contained STDOUT 
and STDERR for about 10 % of the entries.)

Again, is this a known problem for some of you, or could there be a 
problem with my copy of the NCBI taxonomy? I don't remember having updated 
it after the initial upload, so I'm quite surprised by such duplicate 
entries....

###################

Type 2 error w/o crash:

 >>>
Could not store A5HU09:
------------- EXCEPTION -------------
MSG: create: object (Bio::Species) failed to insert or to be found by 
unique key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:206
STACK Bio::DB::Persistent::PersistentObject::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:244
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
STACK Bio::DB::Persistent::PersistentObject::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:271
STACK (eval) 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:630
STACK toplevel 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:612
<<<

This particular record has the NCBI_TaxID 44271, which looks completely 
normal in the NCBI taxonomy loaded in my BioSQL DB, but the same problem 
appeared in 53 further cases (I have not yet been able to look into them 
in detail to see whether they all involve the same species). On the other 
hand, 7 records which were successfully loaded have this taxonomy ID 
(44271) in the DB.

###################

Type 3 error w/o crash:

 >>>
Could not store Q6T859: Unmatched ( in regex; marked by <-- HERE in 
m/Camelina microcarpa (Littlepod false flax) ( <-- HERE microcarpa 
subsp.\s+/ at 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/Species.pm line 
466, <GEN0> line 357048.
<<<

This happens in the sub binomial in Species.pm using the option "FULL", 
which requests to also return subspecies. I have not looked much deeper 
into this yet, but is it possible that there is a parsing problem with 
multi-line species strings? In the above case the OS field in 
uniprot_trembl.dat looks like

OS   Camelina microcarpa (Littlepod false flax) (Camelina microcarpa subsp.
OS   sylvestris).
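
The "Unmatched (" message is what Perl reports when a string containing an
unbalanced parenthesis is interpolated into a pattern as-is. I have not
checked what Species.pm actually does at that line, so the following is only
a sketch of the general failure and the usual remedy (\Q...\E, i.e.
quotemeta); the variable names and the truncated species string are made up,
modelled on the error message above:

```perl
use strict;
use warnings;

# Species string as it might look after a truncated multi-line OS field
# (illustrative value, modelled on the error message).
my $species = 'Camelina microcarpa (Littlepod false flax) (';
my $text    = "$species microcarpa subsp. sylvestris";

# Interpolating the raw string compiles it as a pattern; the unbalanced
# "(" makes that compilation die at run time, so trap it with eval.
my $raw_ok    = eval { $text =~ /$species/; 1 };
my $raw_error = $raw_ok ? '' : $@;

# \Q...\E (quotemeta) escapes the metacharacters, so the string is
# matched literally instead of being parsed as regex syntax.
my $quoted_match = ($text =~ /\Q$species\E/) ? 1 : 0;

print "raw pattern failed: ", ($raw_ok ? "no" : "yes"), "\n";
print "quoted pattern matched: $quoted_match\n";
```

If the species name is meant to be matched literally, quoting it this way
would avoid the error regardless of how the OS lines were joined.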

###################

I'm still looking for where the remaining records disappeared: of the 
421000 records not showing up in the DB, I could account for these:

crasher (Tax_ID=435):   45 entries
problem 1 ("MSG: Unique key query in Bio::DB::BioSQL::SpeciesAdaptor 
returned 2 rows instead of 1."): 11312 entries
problem 2 ("MSG: create: object (Bio::Species) failed to insert or to be 
found by unique key"): 54 entries
problem 3 ("Unmatched ( in regex"): 28241 entries

381348 still remain... These could in principle come from the 
first 10 %, for which I don't have the output, but they don't seem to: 
after restarting that chunk, I get ~ 30 "Could not store" errors.
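
The bookkeeping behind these numbers can be re-checked with a few lines of
Perl. This is just the arithmetic on the figures quoted in this message; the
hash keys are shorthand labels I made up for the four error classes:

```perl
use strict;
use warnings;

# Figures quoted in this message.
my $in_file   = 5_443_284;   # entries counted in the input file
my $loaded    = 5_022_284;   # entries that ended up in the DB
my %accounted = (
    'crasher (Tax_ID=435)'         => 45,
    'problem 1 (2 rows returned)'  => 11_312,
    'problem 2 (failed to insert)' => 54,
    'problem 3 (unmatched paren)'  => 28_241,
);

my $missing = $in_file - $loaded;
my $known   = 0;
$known += $_ for values %accounted;
my $unexplained = $missing - $known;

print "missing: $missing, explained: $known, unexplained: $unexplained\n";
```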

So the last question: are there any error messages I can expect which 
don't contain "Could not store" and which I thus missed here?


Bank Beszteri


Bioinformatics
Alfred Wegener Institute for Polar and Marine Research
Am Handelshafen 12
27570 Bremerhaven


From cjfields at uiuc.edu  Mon Apr 28 09:20:39 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 28 Apr 2008 08:20:39 -0500
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <4815C08C.1060305@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
Message-ID: <5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>

On Apr 28, 2008, at 7:18 AM, Bánk Beszteri wrote:

> Dear BioSQL / bioperl-db-ists,
>
> I would like  to share my experiences with trying to load  
> uniprot_trembl into a BioSQL db, and also to ask a couple of  
> questions; perhaps some of you know the problems I encountered. I  
> used bioperl-live and bioperl-db-live as of 2008-04-03 and  
> uniprot_trembl.dat as of 2008-04-04. The command was like
>
> load_seqdatabase.pl --safe --logchunk 1000 --host dbserv --dbname  
> abc --dbuser efg --dbpass xyz --driver mysql --namespace  
> uniprot_trembl --format embl uniprot_trembl.dat
>
> ....
>
> First of all, the below error seems to lead to a crash, in spite of  
> --safe:
>
> >>>
> ------------- EXCEPTION -------------
> MSG: A1XDT7 seems to have an invalid species classification.
> STACK Bio::SeqIO::embl::_read_EMBL_Species /home/biocl/bbeszter/lib/ 
> bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:1087
> STACK Bio::SeqIO::embl::next_seq /home/biocl/bbeszter/lib/bioperl- 
> live/bioperl-live/Bio/SeqIO/embl.pm:320
> STACK toplevel /home/biocl/bbeszter/lib/bioperl-live/bioperl-db/ 
> scripts/biosql/load_seqdatabase.pl:634
> -------------------------------------
>
> Command exited with non-zero status 255
> <<<
>
> What this is about is NCBI Tax_ID:435 (Acetobacter aceti; it has  
> some 30 synonyms in my DB, too), which, to me, looks like a  
> completely normal taxon: I could follow its taxonomy up to the root  
> in my NCBI taxonomy in the BioSQL DB I used. I don't know if someone  
> else has seen / can reproduce the problem, or should I think about  
> some problem with my taxonomy db? Besides, is it the expected  
> behaviour from load_seqdatabase.pl to die upon this error?

...

You should use 'swiss' format instead of 'embl' when loading Uniprot/ 
SwissProt sequences.  Though on the surface they're similar, the  
feature table (among other things) is completely different.  I'm not  
sure if that's causing all of the issues here, but it certainly could  
contribute to them.

In the meantime, it's much easier for us to track these problems if  
you file a bug (BioPerl, file for bioperl-db):

http://bugzilla.open-bio.org/

chris


From cjfields at uiuc.edu  Sun Apr 27 17:54:03 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 27 Apr 2008 16:54:03 -0500
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
Message-ID: <FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>

I think this is how some of the synteny mapping is done using  
SynBrowse (the trapezoids connecting syntenous genes on different  
tracks).

http://www.gmod.org/wiki/index.php/SynView

chris

On Apr 27, 2008, at 4:27 PM, Smithies, Russell wrote:

> You can get the GD object back from the Bio::Graphics::Panel  then  
> draw
> on it using GD methods
>
> Eg:
>
> #create a BioPerl panel
> my $panel = Bio::Graphics::Panel->new(
>                                       -length   => 600,
>                                       -width    => 800,
>                                       -bgcolor  => 'white'
>                                      );
> # add your features
> my $feature = Bio::SeqFeature::Generic->new( -start => 1,-end   =>
> 200,);
> $panel->add_track($feature, glyph   =>   'segments',
> 					-label   =>   0,
> 					-height  =>   30,
> 					-bgcolor  =>  'red',
> 					-fgcolor  => 'red'
> 					 );
>
> # grab the GD thingy
> my $gd = $panel->gd;
>
> #create a color - not sure if there's a better way?
> $black = $gd->colorAllocate(0,0,0);
>
> #draw on your GD thingy
> $gd->line(10,10,$panel->width -10,10,$black);
> $gd->string(gdSmallFont,20,10,'test',$black);
>
> # print it as normal	
> print $panel->png;
>
>
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-
>> bio.org] On Behalf Of sergei ryazansky
>> Sent: Monday, 28 April 2008 2:05 a.m.
>> To: bioperl-l at bioperl.org
>> Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
>>
>> Hi all,
>>
>> is it possible to add a GD::graphic object (chart) to Bio::Graphics
> panel
>> to obtain a file with image of both the chart and bioseq object?
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Bank.Beszteri at awi.de  Mon Apr 28 09:51:53 2008
From: Bank.Beszteri at awi.de (=?ISO-8859-1?Q?B=E1nk_Beszteri?=)
Date: Mon, 28 Apr 2008 15:51:53 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
	<5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
Message-ID: <4815D679.3070307@awi.de>

Chris Fields schrieb:
>
> ...
>
> You should use 'swiss' format instead of 'embl' when loading 
> Uniprot/SwissProt sequences.  Though on the surface they're similar 
> the feature table (among other things) is completely different.  I'm 
> not sure if that's causing all of the issues here but it certainly 
> could contribute to them.
>
> In the meantime, it's much easier for us to track these problems if 
> you file a bug (BioPerl, file for bioperl-db):
>
> http://bugzilla.open-bio.org/
>
Hi Chris,

I will do so; in the meantime: I'm not loading Swissprot, but TrEMBL. 
Is swiss also the appropriate format here? From reading 
http://expasy.org/sprot/userman.html#diffEMBL, I concluded that embl 
should be the one I'd need for TrEMBL.

Bank


From cjfields at uiuc.edu  Mon Apr 28 12:24:39 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 28 Apr 2008 11:24:39 -0500
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <4815D679.3070307@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
	<5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
	<4815D679.3070307@awi.de>
Message-ID: <B7918B56-536D-497F-A59D-D48A61085339@uiuc.edu>


On Apr 28, 2008, at 8:51 AM, Bánk Beszteri wrote:

> Chris Fields schrieb:
>>
>> ...
>>
>> You should use 'swiss' format instead of 'embl' when loading  
>> Uniprot/SwissProt sequences.  Though on the surface they're similar  
>> the feature table (among other things) is completely different.   
>> I'm not sure if that's causing all of the issues here but it  
>> certainly could contribute to them.
>>
>> In the meantime, it's much easier for us to track these problems if  
>> you file a bug (BioPerl, file for bioperl-db):
>>
>> http://bugzilla.open-bio.org/
>>
> Hi Chris,
>
> I will do so; in the meanwhile: I'm not loading Swissprot, but  
> TrEMBL. Is swiss also the appropriate format here? By reading
> http://expasy.org/sprot/userman.html#diffEMBL, I concluded that embl
> should be the one I'd need for TrEMBL.
>
> Bank

The section you link to describes several important differences  
between EMBL and SwissProt/UniProt format (i.e. how each indicated  
line type differs between SwissProt and EMBL formats, including ID,  
AC, OS/OC, FT, etc).  I'm unsure how you concluded from that section  
that 'embl' would work: the formats are close, but there are enough  
significant differences that using 'embl' for SwissProt (or vice  
versa) will not work as intended, if at all.

chris


From hlapp at gmx.net  Mon Apr 28 15:46:07 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 28 Apr 2008 15:46:07 -0400
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <4815D679.3070307@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
	<5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
	<4815D679.3070307@awi.de>
Message-ID: <3BD6A261-D023-4A5F-9CBC-C3216B0145F0@gmx.net>


On Apr 28, 2008, at 9:51 AM, Bánk Beszteri wrote:
>  I'm not loading Swissprot, but TrEMBL. Is swiss also the  
> appropriate format here?


Yes, though I guess it can be confusing.

Maybe we should create a symlink uniprot.pm to swiss.pm, or in fact  
fork them if UniProt starts accumulating enough differences from the  
traditional Swissprot format.

BTW as you had noticed, the --safe switch only protects the script  
from crashing due to a db loading error. A parsing error will still  
cause a crash.

I guess you can argue that that's not nice, and having a chance to  
skip over the record that offends the (BioPerl) parser would be  
useful. The problem is that if the parser errors out, it's not  
guaranteed where we are in the file and whether the parser module is  
in a state that it can recover itself from. For the database it's a  
bit easier as one just needs to rollback() the transaction (each  
sequence is its own transaction).
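
To illustrate the distinction, here is a minimal sketch of the control flow
described above. It is not code from load_seqdatabase.pl: the subs
next_record and store_record are hypothetical stand-ins for the parser and
the database layer, and the records are made-up strings.

```perl
use strict;
use warnings;

# Stand-ins for the real parser and database layer; every name here is
# hypothetical and none of this is code from load_seqdatabase.pl.
my @records = ('A', 'B-bad-store', 'C');
my (@stored, @failed);

sub next_record { return shift @records }   # parsing step, NOT protected

sub store_record {                          # storing step, may die
    my ($rec) = @_;
    die "db error for $rec\n" if $rec =~ /bad-store/;
    push @stored, $rec;
}

while (defined(my $rec = next_record())) {
    # Only the store is wrapped in eval: a failure here is logged and
    # the loop carries on with the next record.
    my $ok = eval { store_record($rec); 1 };
    if (!$ok) {
        warn "Could not store $rec: $@";
        push @failed, $rec;    # a real loader would rollback() here
    }
}

print "stored: @stored\n";
print "failed: @failed\n";
```

An exception thrown inside next_record would propagate out of the loop and
end the run, which matches the behaviour seen with the parser error.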

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From Russell.Smithies at agresearch.co.nz  Mon Apr 28 17:15:16 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 29 Apr 2008 09:15:16 +1200
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
	<FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>

I thought it was a bit of a hack but I guess if someone else is doing it
too, it can't be all bad  :-)

It looks like you can combine your drawing methods like this:
(I'm sure Lincoln will tell us this is bad but it seems to work ok)
------------------------------------------------------------------------
-------------

#!perl -w
use GD::Graph::lines;
use GD::Graph::colour;
use GD::Graph::Data;

use Bio::Graphics;
use Bio::SeqFeature::Generic;

# create and draw on a graphics panel
my $panel = Bio::Graphics::Panel->new(
                                      -length => 500,
                                      -width  => 500
                                     );
my $track = $panel->add_track(
                              -glyph => 'generic',
                              -label => 1
                             );

# create and add a few features
for (my $i = 100; $i < 500; $i += 100) {
  my $feature = Bio::SeqFeature::Generic->new(
                                              -display_name => "feature: $i",
                                              -score        => $i,
                                              -start        => $i,
                                              -end          => $i + 100
                                             );
  $track->add_feature($feature);
}


# create and draw the graph
my @data = (
    ["1st","2nd","3rd","4th","5th","6th","7th", "8th", "9th"],
    [    1,    2,    5,    6,    3,  1.5,    1,     3,     4],
    [ sort { $a <=> $b } (1, 2, 5, 6, 3, 1.5, 1, 3, 4) ]
);
my $graph = GD::Graph::lines->new(500, 300);

$graph->set(
      x_label           => 'X Label',
      y_label           => 'Y label',
      title             => 'Some simple graph',
      y_max_value       => 8,
      y_tick_number     => 8,
      y_label_skip      => 2
) or die $graph->error;

$graph->set( dclrs => [ qw( green blue black red pink) ] );

my $gd = $graph->plot(\@data) or die $graph->error;

# combine the two images
my $combined = $panel->gd($gd);

open(my $img, '>', 'file.png') or die "Cannot write file.png: $!";
binmode $img;
print {$img} $combined->png;
close($img) or die "Cannot close file.png: $!";

------------------------------------------------------------------------
------------------

> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu]
> Sent: Monday, 28 April 2008 9:54 a.m.
> To: Smithies, Russell
> Cc: sergei ryazansky; bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
> 
> I think this is how some of the synteny mapping is done using
> SynBrowse (the trapezoids connecting syntenous genes on different
> tracks).
> 
> http://www.gmod.org/wiki/index.php/SynView
> 
> chris
> 
> On Apr 27, 2008, at 4:27 PM, Smithies, Russell wrote:
> 
> > You can get the GD object back from the Bio::Graphics::Panel  then
> > draw
> > on it using GD methods
> >
> > Eg:
> >
> > #create a BioPerl panel
> > my $panel = Bio::Graphics::Panel->new(
> >                                       -length   => 600,
> >                                       -width    => 800,
> >                                       -bgcolor  => 'white'
> >                                      );
> > # add your features
> > my $feature = Bio::SeqFeature::Generic->new( -start => 1,-end   =>
> > 200,);
> > $panel->add_track($feature, glyph   =>   'segments',
> > 					-label   =>   0,
> > 					-height  =>   30,
> > 					-bgcolor  =>  'red',
> > 					-fgcolor  => 'red'
> > 					 );
> >
> > # grab the GD thingy
> > my $gd = $panel->gd;
> >
> > #create a color - not sure if there's a better way?
> > $black = $gd->colorAllocate(0,0,0);
> >
> > #draw on your GD thingy
> > $gd->line(10,10,$panel->width -10,10,$black);
> > $gd->string(gdSmallFont,20,10,'test',$black);
> >
> > # print it as normal
> > print $panel->png;
> >
> >
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org
> > [mailto:bioperl-l-bounces at lists.open-
> >> bio.org] On Behalf Of sergei ryazansky
> >> Sent: Monday, 28 April 2008 2:05 a.m.
> >> To: bioperl-l at bioperl.org
> >> Subject: [Bioperl-l] addition of GD::graphic object to
Bio::Graphics
> >>
> >> Hi all,
> >>
> >> is it possible to add a GD::graphic object (chart) to Bio::Graphics
> > panel
> >> to obtain a file with image of both the chart and bioseq object?
> >>
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 



From lincoln.stein at gmail.com  Mon Apr 28 17:33:19 2008
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 28 Apr 2008 17:33:19 -0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
	<FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
	<D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>
Message-ID: <6dce9a0b0804281433i697cda2fo2c47ce59010d0858@mail.gmail.com>

Hi,

No, I'm perfectly happy with combining images like this. It is part of what
I intended.

Another idea would be to use the Image glyph to embed graphs at particular
genomic locations in the panel. Right now the glyph is designed in the
expectation that the image passed to it is sitting on the file system (or a
web URL), but it would be easy to modify it so that a callback can generate
the GD on the fly, using, for example, GD::Graph.

Lincoln

On Mon, Apr 28, 2008 at 5:15 PM, Smithies, Russell <
Russell.Smithies at agresearch.co.nz> wrote:

> I thought it was a bit of a hack but I guess if someone else is doing it
> too, it can't be all bad  :-)
>
> It looks like you can combine your drawing methods like this:
> (I'm sure Lincoln will tell us this is bad but it seems to work ok)
> ------------------------------------------------------------------------
> -------------
>
> #!perl -w
> use GD::Graph::lines;
> use GD::Graph::colour;
> use GD::Graph::Data;
>
> use Bio::Graphics;
> use Bio::SeqFeature::Generic;
>
> # create and draw on a graphics panel
> my $panel = Bio::Graphics::Panel->new(
>                                       -length => 500,
>                                      -width  => 500
>                                     );
> my $track = $panel->add_track(
>                              -glyph => 'generic',
>                              -label => 1
>                             );
>
> # create and add a few features
> for (my $i = 100; $i < 500; $i += 100) {
>   my $feature = Bio::SeqFeature::Generic->new(
>                                               -display_name => "feature: $i",
>                                              -score        => $i,
>                                              -start        => $i,
>                                              -end          => $i + 100
>                                             );
>  $track->add_feature($feature);
> }
>
>
> # create and draw the graph
> my @data = (
>    ["1st","2nd","3rd","4th","5th","6th","7th", "8th", "9th"],
>    [    1,    2,    5,    6,    3,  1.5,    1,     3,     4],
>    [ sort { $a <=> $b } (1, 2, 5, 6, 3, 1.5, 1, 3, 4) ]
> );
> my $graph = GD::Graph::lines->new(500, 300);
>
> $graph->set(
>      x_label           => 'X Label',
>      y_label           => 'Y label',
>      title             => 'Some simple graph',
>      y_max_value       => 8,
>      y_tick_number     => 8,
>      y_label_skip      => 2
> ) or die $graph->error;
>
> $graph->set( dclrs => [ qw( green blue black red pink) ] );
>
> my $gd = $graph->plot(\@data) or die $graph->error;
>
> # combine the two images
> my $combined = $panel->gd($gd);
>
> open(IMG, '>file.png') or die $!;
> binmode IMG;
> print IMG $combined->png;
>
> ------------------------------------------------------------------------
> ------------------
>
> > -----Original Message-----
> > From: Chris Fields [mailto:cjfields at uiuc.edu]
> > Sent: Monday, 28 April 2008 9:54 a.m.
> > To: Smithies, Russell
> > Cc: sergei ryazansky; bioperl-l at bioperl.org
> > Subject: Re: [Bioperl-l] addition of GD::graphic object to
> Bio::Graphics
> >
> > I think this is how some of the synteny mapping is done using
> > SynBrowse (the trapezoids connecting syntenous genes on different
> > tracks).
> >
> > http://www.gmod.org/wiki/index.php/SynView
> >
> > chris
> >
> > On Apr 27, 2008, at 4:27 PM, Smithies, Russell wrote:
> >
> > > You can get the GD object back from the Bio::Graphics::Panel  then
> > > draw
> > > on it using GD methods
> > >
> > > Eg:
> > >
> > > #create a BioPerl panel
> > > my $panel = Bio::Graphics::Panel->new(
> > >                                                     -length   => 600,
> > >                                                     -width    => 800,
> > >                                     -bgcolor  => 'white'
> > >                                     );
> > > # add your features
> > > my $feature = Bio::SeqFeature::Generic->new( -start => 1, -end => 200 );
> > > $panel->add_track($feature, -glyph   =>   'segments',
> > >                                     -label   =>   0,
> > >                                     -height  =>   30,
> > >                                     -bgcolor  =>  'red',
> > >                                     -fgcolor  => 'red'
> > >                                      );
> > >
> > > # grab the GD thingy
> > > my $gd = $panel->gd;
> > >
> > > #create a color - not sure if there's a better way?
> > > $black = $gd->colorAllocate(0,0,0);
> > >
> > > #draw on your GD thingy
> > > $gd->line(10,10,$panel->width -10,10,$black);
> > > $gd->string(gdSmallFont,20,10,'test',$black);
> > >
> > > # print it as normal
> > > print $panel->png;
> > >
> > >
> > >
> > >
> > >> -----Original Message-----
> > >> From: bioperl-l-bounces at lists.open-bio.org
> > > [mailto:bioperl-l-bounces at lists.open-
> > >> bio.org] On Behalf Of sergei ryazansky
> > >> Sent: Monday, 28 April 2008 2:05 a.m.
> > >> To: bioperl-l at bioperl.org
> > >> Subject: [Bioperl-l] addition of GD::graphic object to
> Bio::Graphics
> > >>
> > >> Hi all,
> > >>
> > >> is it possible to add a GD::graphic object (chart) to Bio::Graphics
> > > panel
> > >> to obtain a file with image of both the chart and bioseq object?
> > >>
> > >> _______________________________________________
> > >> Bioperl-l mailing list
> > >> Bioperl-l at lists.open-bio.org
> > >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
>
> =======================================================================
> Attention: The information contained in this message and/or attachments
> from AgResearch Limited is intended only for the persons or entities
> to which it is addressed and may contain confidential and/or privileged
> material. Any review, retransmission, dissemination or other use of, or
> taking of any action in reliance upon, this information by persons or
> entities other than the intended recipients is prohibited by AgResearch
> Limited. If you have received this message in error, please notify the
> sender immediately.
> =======================================================================
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From dr.hogart at gmail.com  Tue Apr 29 03:56:24 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Tue, 29 Apr 2008 11:56:24 +0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
	<FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
	<D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>
Message-ID: <op.uac4caojavnppr@hogart.img.ras.ru>

Thank you very much! It is exactly what I was looking for.

On Tue, 29 Apr 2008 01:15:16 +0400, Smithies, Russell  
<Russell.Smithies at agresearch.co.nz> wrote:

> I thought it was a bit of a hack but I guess if someone else is doing it
> too, it can't be all bad  :-)
>
> It looks like you can combine your drawing methods like this:
> (I'm sure Lincoln will tell us this is bad but it seems to work ok)
> ------------------------------------------------------------------------
> -------------
>
> #!perl -w
> use GD::Graph::lines;
> use GD::Graph::colour;
> use GD::Graph::Data;
>
> use Bio::Graphics;
> use Bio::SeqFeature::Generic;
>
> # create and draw on a graphics panel
> my $panel = Bio::Graphics::Panel->new(
>                                       -length => 500,
>                                       -width  => 500
>                                      );
> my $track = $panel->add_track(
>                               -glyph => 'generic',
>                               -label => 1
>                              );
>
> # create and add a few features
> for($i = 100; $i < 500; $i+= 100){
>   my $feature = Bio::SeqFeature::Generic->new(
>                                               -display_name => "feature: $i",
>                                               -score        => $i,
>                                               -start        => $i,
>                                               -end          => $i + 100
>                                              );
>   $track->add_feature($feature);
> }
>
>
> # create and draw the graph
> my @data = (
>     ["1st","2nd","3rd","4th","5th","6th","7th", "8th", "9th"],
>     [    1,    2,    5,    6,    3,  1.5,    1,     3,     4],
>     [ sort { $a <=> $b } (1, 2, 5, 6, 3, 1.5, 1, 3, 4) ]
> );
> my $graph = GD::Graph::lines->new(500, 300);
>
> $graph->set(
>       x_label           => 'X Label',
>       y_label           => 'Y label',
>       title             => 'Some simple graph',
>       y_max_value       => 8,
>       y_tick_number     => 8,
>       y_label_skip      => 2
> ) or die $graph->error;
>
> $graph->set( dclrs => [ qw( green blue black red pink) ] );
>
> my $gd = $graph->plot(\@data) or die $graph->error;
>
> # combine the two images
> my $combined = $panel->gd($gd);
>
> open(IMG, '>file.png') or die $!;
> binmode IMG;
> print IMG $combined->png;
>
> ------------------------------------------------------------------------
> ------------------
>
>> -----Original Message-----
>> From: Chris Fields [mailto:cjfields at uiuc.edu]
>> Sent: Monday, 28 April 2008 9:54 a.m.
>> To: Smithies, Russell
>> Cc: sergei ryazansky; bioperl-l at bioperl.org
>> Subject: Re: [Bioperl-l] addition of GD::graphic object to
> Bio::Graphics
>>
>> I think this is how some of the synteny mapping is done using
>> SynBrowse (the trapezoids connecting syntenous genes on different
>> tracks).
>>
>> http://www.gmod.org/wiki/index.php/SynView
>>
>> chris
>>
>> On Apr 27, 2008, at 4:27 PM, Smithies, Russell wrote:
>>
>> > You can get the GD object back from the Bio::Graphics::Panel  then
>> > draw
>> > on it using GD methods
>> >
>> > Eg:
>> >
>> > #create a BioPerl panel
>> > my $panel = Bio::Graphics::Panel->new(
>> >                              			-length   => 600,
>> >                              			-width    => 800,
>> > 					-bgcolor  => 'white'
>> > 					);
>> > # add your features
>> > my $feature = Bio::SeqFeature::Generic->new( -start => 1, -end => 200 );
>> > $panel->add_track($feature, -glyph   =>   'segments',
>> > 					-label   =>   0,
>> > 					-height  =>   30,
>> > 					-bgcolor  =>  'red',
>> > 					-fgcolor  => 'red'
>> > 					 );
>> >
>> > # grab the GD thingy
>> > my $gd = $panel->gd;
>> >
>> > #create a color - not sure if there's a better way?
>> > $black = $gd->colorAllocate(0,0,0);
>> >
>> > #draw on your GD thingy
>> > $gd->line(10,10,$panel->width -10,10,$black);
>> > $gd->string(gdSmallFont,20,10,'test',$black);
>> >
>> > # print it as normal
>> > print $panel->png;
>> >
>> >
>> >
>> >
>> >> -----Original Message-----
>> >> From: bioperl-l-bounces at lists.open-bio.org
>> > [mailto:bioperl-l-bounces at lists.open-
>> >> bio.org] On Behalf Of sergei ryazansky
>> >> Sent: Monday, 28 April 2008 2:05 a.m.
>> >> To: bioperl-l at bioperl.org
>> >> Subject: [Bioperl-l] addition of GD::graphic object to
> Bio::Graphics
>> >>
>> >> Hi all,
>> >>
>> >> is it possible to add a GD::graphic object (chart) to Bio::Graphics
>> > panel
>> >> to obtain a file with image of both the chart and bioseq object?
>> >>
>> >> _______________________________________________
>> >> Bioperl-l mailing list
>> >> Bioperl-l at lists.open-bio.org
>> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>> >
>> > _______________________________________________
>> > Bioperl-l mailing list
>> > Bioperl-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>




From d.gatherer at mrcvu.gla.ac.uk  Tue Apr 29 08:21:05 2008
From: d.gatherer at mrcvu.gla.ac.uk (Derek Gatherer)
Date: Tue, 29 Apr 2008 13:21:05 +0100
Subject: [Bioperl-l] translate() oddities
Message-ID: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>

Hi

I thought I'd better run this by the community before I embarrass 
myself on Bugzilla.  It seems like a clear bug to me.  I'm running 
Bioperl 1.5.0 on RedHat.

For a test input:

 >test
ATGATGATGATGATGTGA

the following code is fine.

while((my $seqobj = $seq_in->next_seq()))
{
     print "\n".$seqobj->display_id;
     my $len  = $seqobj->length();
     print " length: $len";
     my $frame1_obj = $seqobj->translate();
     my $f1_prot = $frame1_obj->seq();
     print "\n$f1_prot";
}

Output:

test length: 18
MMMMM*

But if I want to change the frame as specified in the BioPerl 
tutorial, by using:

my $frame1_obj = $seqobj->translate(frame => 1); # which should now 
give frame 2, I get:

test length: 18
MMMMM-frame

The frame is unchanged and the text "-frame" is tacked on the end of 
the output.  The same occurs with translate(frame => 2).

Any ideas?  Can something as fundamental as translate() really be 
bugged?  or am I guilty of some particularly heinous syntax error?

Cheers
Derek


From tristan.lefebure at gmail.com  Tue Apr 29 09:58:21 2008
From: tristan.lefebure at gmail.com (Tristan Lefebure)
Date: Tue, 29 Apr 2008 09:58:21 -0400
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
Message-ID: <200804290958.21548.tristan.lefebure@gmail.com>

Aren't you forgetting the dash?

my $frame1_obj = $seqobj->translate(-frame => 1)


On Tuesday 29 April 2008 08:21:05 Derek Gatherer wrote:
> my $frame1_obj = $seqobj->translate(frame => 1)


-Tristan


From d.gatherer at mrcvu.gla.ac.uk  Tue Apr 29 10:05:03 2008
From: d.gatherer at mrcvu.gla.ac.uk (Derek Gatherer)
Date: Tue, 29 Apr 2008 15:05:03 +0100
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <481726BF.1060609@bms.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
	<481726BF.1060609@bms.com>
Message-ID: <E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>

Thanks Stefan

Actually, there was a typo in my message, I did use -frame => 
1.  However, the problem disappears on upgrading from 1.5.0 to 1.5.2.

So not a bug anymore.

Cheers
Derek

At 14:46 29/04/2008, Stefan Kirov wrote:
>my $frame1_obj = $seqobj->translate(-frame => 1);
>not
>my $frame1_obj = $seqobj->translate(frame => 1);
>Stefan
>
>Derek Gatherer wrote:
> > Hi
> >
> > I thought I'd better run this by the community before I embarrass
> > myself on Bugzilla.  It seems like a clear bug to me.  I'm running
> > Bioperl 1.5.0 on RedHat.
> >
> > For a test input:
> >
> > >test
> > ATGATGATGATGATGTGA
> >
> > the following code is fine.
> >
> > while((my $seqobj = $seq_in->next_seq()))
> > {
> >     print "\n".$seqobj->display_id;
> >     my $len  = $seqobj->length();
> >     print " length: $len";
> >     my $frame1_obj = $seqobj->translate();
> >     my $f1_prot = $frame1_obj->seq();
> >     print "\n$f1_prot";
> > }
> >
> > Output:
> >
> > test length: 18
> > MMMMM*
> >
> > But if I want to change the frame as specified in the BioPerl
> > tutorial, by using:
> >
> > my $frame1_obj = $seqobj->translate(frame => 1); # which should now
> > give frame 2, I get:
> >
> > test length: 18
> > MMMMM-frame
> >
> > The frame is unchanged and the text "-frame" is tacked on the end of
> > the output.  The same occurs with translate(frame => 2).
> >
> > Any ideas?  Can something as fundamental as translate() really be
> > bugged?  or am I guilty of some particularly heinous syntax error?
> >
> > Cheers
> > Derek
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >


From l.douchy at gmail.com  Tue Apr 29 10:16:40 2008
From: l.douchy at gmail.com (Laurent DOUCHY)
Date: Tue, 29 Apr 2008 16:16:40 +0200
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <200804290958.21548.tristan.lefebure@gmail.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
	<200804290958.21548.tristan.lefebure@gmail.com>
Message-ID: <2fb209dd0804290716x36e403dek55978dc4f54e34ff@mail.gmail.com>

Hello,

I resolved this issue in Bio::SeqIO with the following line:

my $sequence = $seq->translate('*', 'X', '0', '1', '0', '0', '0', '0')->seq;
The third parameter sets the frame.

I hope this helps.

laurent.

On Tue, Apr 29, 2008 at 3:58 PM, Tristan Lefebure <
tristan.lefebure at gmail.com> wrote:

> Aren't you forgetting the dash?
>
> my $frame1_obj = $seqobj->translate(-frame => 1)
>
>
> On Tuesday 29 April 2008 08:21:05 Derek Gatherer wrote:
> > my $frame1_obj = $seqobj->translate(frame => 1)
>
>
>
> -Tristan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From roy.chaudhuri at gmail.com  Tue Apr 29 10:27:10 2008
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 29 Apr 2008 15:27:10 +0100
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>	<481726BF.1060609@bms.com>
	<E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>
Message-ID: <4817303E.1040903@gmail.com>

Spent two minutes looking at this, so may as well chip in with what I 
discovered even though you solved your problem.

This "bug" comes about because in version 1.5.1 and earlier, the 
arguments to translate were a simple list, with the first argument the 
terminator (defaults to "*"). Your old version therefore assumed that 
you wanted to translate the stop codon to "-frame". Amusingly given your 
typo, if you miss the hyphen off the frame argument in version 1.5.2 it 
reverts to the old interface and you end up with the output 
"MMMMMframe". The moral of the story is of course to read the docs 
relevant to the version you are using.
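
The behaviour described above can be reproduced with a small stand-alone
sketch. It is not BioPerl's actual code: translate_positional and
translate_named are hypothetical names, the codon table is cut down to the
two codons in the test sequence, and the dash check is an assumption about
how a mixed positional/named interface might be dispatched.

```perl
use strict;
use warnings;

# Toy codon table: only the codons present in the test sequence (assumption).
my %CODON = (ATG => 'M', TGA => '*');

# Hypothetical stand-in for the old interface: purely positional arguments
# (terminator, unknown, frame).
sub translate_positional {
    my ($dna, $terminator, $unknown, $frame) = @_;
    $terminator = '*' unless defined $terminator;
    $unknown    = 'X' unless defined $unknown;
    $frame      = 0   unless defined $frame;
    my $protein = '';
    for (my $i = $frame; $i + 3 <= length $dna; $i += 3) {
        my $aa = $CODON{ substr($dna, $i, 3) };
        $aa = $unknown unless defined $aa;
        # a stop codon is rendered as whatever the terminator argument holds
        $protein .= $aa eq '*' ? $terminator : $aa;
    }
    return $protein;
}

# Hypothetical stand-in for the newer interface: named arguments are
# recognised only if the first one starts with a dash; anything else
# falls back to the positional form.
sub translate_named {
    my ($dna, @args) = @_;
    if (@args && $args[0] =~ /^-/) {
        my %opt = @args;
        return translate_positional($dna, $opt{-terminator},
                                    $opt{-unknown}, $opt{-frame});
    }
    return translate_positional($dna, @args);
}

my $dna = 'ATGATGATGATGATGTGA';
print translate_positional($dna), "\n";               # MMMMM*
print translate_positional($dna, -frame => 1), "\n";  # MMMMM-frame
print translate_named($dna, frame => 1), "\n";        # MMMMMframe
print translate_named($dna, -frame => 1), "\n";       # *****
```

In the last call the frame really does shift by one base, so every codon
read from the test sequence is TGA and the toy translation is all stops.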

Roy.
--
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.

Derek Gatherer wrote:
> Thanks Stefan
> 
> Actually, there was a typo in my message, I did use -frame => 
> 1.  However, the problem disappears on upgrading from 1.5.0 to 1.5.2.
> 
> So not a bug anymore.
> 
> Cheers
> Derek
> 
> At 14:46 29/04/2008, Stefan Kirov wrote:
>> my $frame1_obj = $seqobj->translate(-frame => 1);
>> not
>> my $frame1_obj = $seqobj->translate(frame => 1);
>> Stefan
>>
>> Derek Gatherer wrote:
>>> Hi
>>>
>>> I thought I'd better run this by the community before I embarrass
>>> myself on Bugzilla.  It seems like a clear bug to me.  I'm running
>>> Bioperl 1.5.0 on RedHat.
>>>
>>> For a test input:
>>>
>>>> test
>>> ATGATGATGATGATGTGA
>>>
>>> the following code is fine.
>>>
>>> while((my $seqobj = $seq_in->next_seq()))
>>> {
>>>     print "\n".$seqobj->display_id;
>>>     my $len  = $seqobj->length();
>>>     print " length: $len";
>>>     my $frame1_obj = $seqobj->translate();
>>>     my $f1_prot = $frame1_obj->seq();
>>>     print "\n$f1_prot";
>>> }
>>>
>>> Output:
>>>
>>> test length: 18
>>> MMMMM*
>>>
>>> But if I want to change the frame as specified in the BioPerl
>>> tutorial, by using:
>>>
>>> my $frame1_obj = $seqobj->translate(frame => 1); # which should now
>>> give frame 2, I get:
>>>
>>> test length: 18
>>> MMMMM-frame
>>>
>>> The frame is unchanged and the text "-frame" is tacked on the end of
>>> the output.  The same occurs with translate(frame => 2).
>>>
>>> Any ideas?  Can something as fundamental as translate() really be
>>> bugged?  or am I guilty of some particularly heinous syntax error?
>>>
>>> Cheers
>>> Derek
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From stefan.kirov at bms.com  Tue Apr 29 09:46:39 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Tue, 29 Apr 2008 09:46:39 -0400
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
Message-ID: <481726BF.1060609@bms.com>

my $frame1_obj = $seqobj->translate(-frame => 1);
not
my $frame1_obj = $seqobj->translate(frame => 1);
Stefan

Derek Gatherer wrote:
> Hi
>
> I thought I'd better run this by the community before I embarrass
> myself on Bugzilla.  It seems like a clear bug to me.  I'm running
> Bioperl 1.5.0 on RedHat.
>
> For a test input:
>
> >test
> ATGATGATGATGATGTGA
>
> the following code is fine.
>
> while((my $seqobj = $seq_in->next_seq()))
> {
>     print "\n".$seqobj->display_id;
>     my $len  = $seqobj->length();
>     print " length: $len";
>     my $frame1_obj = $seqobj->translate();
>     my $f1_prot = $frame1_obj->seq();
>     print "\n$f1_prot";
> }
>
> Output:
>
> test length: 18
> MMMMM*
>
> But if I want to change the frame as specified in the BioPerl
> tutorial, by using:
>
> my $frame1_obj = $seqobj->translate(frame => 1); # which should now
> give frame 2, I get:
>
> test length: 18
> MMMMM-frame
>
> The frame is unchanged and the text "-frame" is tacked on the end of
> the output.  The same occurs with translate(frame => 2).
>
> Any ideas?  Can something as fundamental as translate() really be
> bugged?  or am I guilty of some particularly heinous syntax error?
>
> Cheers
> Derek
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From cjfields at uiuc.edu  Tue Apr 29 11:00:00 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 29 Apr 2008 10:00:00 -0500
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <4817303E.1040903@gmail.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>	<481726BF.1060609@bms.com>
	<E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>
	<4817303E.1040903@gmail.com>
Message-ID: <36045A08-AEA8-4639-A384-1DC53B5DC129@uiuc.edu>

Yes the interface changed somewhat post 1.5.1, mainly to accept named  
parameters.  I think a few methods do this now as passing in lists of  
more than 2 args, undef'ing those one doesn't want set, gets confusing.

chris

On Apr 29, 2008, at 9:27 AM, Roy Chaudhuri wrote:

> Spent two minutes looking at this, so may as well chip in with what  
> I discovered even though you solved your problem.
>
> This "bug" comes about because in version 1.5.1 and earlier, the  
> arguments to translate were a simple list, with the first argument  
> the terminator (defaults to "*"). Your old version therefore assumed  
> that you wanted to translate the stop codon to "-frame". Amusingly  
> given your typo, if you miss the hyphen off the frame argument in  
> version 1.5.2 it reverts to the old interface and you end up with  
> the output "MMMMMframe". The moral of the story is of course to read  
> the docs relevant to the version you are using.
>
> Roy.
> --
> Dr. Roy Chaudhuri
> Department of Veterinary Medicine
> University of Cambridge, U.K.
>
> Derek Gatherer wrote:
>> Thanks Stefan
>> Actually, there was a typo in my message, I did use -frame => 1.   
>> However, the problem disappears on upgrading from 1.5.0 to 1.5.2.
>> So not a bug anymore.
>> Cheers
>> Derek
>> At 14:46 29/04/2008, Stefan Kirov wrote:
>>> my $frame1_obj = $seqobj->translate(-frame => 1);
>>> not
>>> my $frame1_obj = $seqobj->translate(frame => 1);
>>> Stefan
>>>
>>> Derek Gatherer wrote:
>>>> Hi
>>>>
>>>> I thought I'd better run this by the community before I embarrass
>>>> myself on Bugzilla.  It seems like a clear bug to me.  I'm running
>>>> Bioperl 1.5.0 on RedHat.
>>>>
>>>> For a test input:
>>>>
>>>>> test
>>>> ATGATGATGATGATGTGA
>>>>
>>>> the following code is fine.
>>>>
>>>> while((my $seqobj = $seq_in->next_seq()))
>>>> {
>>>>    print "\n".$seqobj->display_id;
>>>>    my $len  = $seqobj->length();
>>>>    print " length: $len";
>>>>    my $frame1_obj = $seqobj->translate();
>>>>    my $f1_prot = $frame1_obj->seq();
>>>>    print "\n$f1_prot";
>>>> }
>>>>
>>>> Output:
>>>>
>>>> test length: 18
>>>> MMMMM*
>>>>
>>>> But if I want to change the frame as specified in the BioPerl
>>>> tutorial, by using:
>>>>
>>>> my $frame1_obj = $seqobj->translate(frame => 1); # which should now
>>>> give frame 2, I get:
>>>>
>>>> test length: 18
>>>> MMMMM-frame
>>>>
>>>> The frame is unchanged and the text "-frame" is tacked on the end  
>>>> of
>>>> the output.  The same occurs with translate(frame => 2).
>>>>
>>>> Any ideas?  Can something as fundamental as translate() really be
>>>> bugged?  or am I guilty of some particularly heinous syntax error?
>>>>
>>>> Cheers
>>>> Derek
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Apr 29 11:07:30 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 29 Apr 2008 10:07:30 -0500
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <481726BF.1060609@bms.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
	<481726BF.1060609@bms.com>
Message-ID: <18DB95FB-52B9-4091-ACEE-996891F8A5AE@uiuc.edu>

As an aside, I've been playing around with perl6 (Rakudo) for a bit  
now.  Parameter-like passing (using autoaccessors and other means)  
will be added in soon, so you will be able to do this:

my $seqobj = Seq.new(seq => 'ATGATGATGATGATGTGA', alphabet => 'dna');
my $protobj = $seqobj.translate(frame => 1);

Yes, I'm a geek. ; >

chris

On Apr 29, 2008, at 8:46 AM, Stefan Kirov wrote:

> my $frame1_obj = $seqobj->translate(-frame => 1);
> not
> my $frame1_obj = $seqobj->translate(frame => 1);
> Stefan
>
> Derek Gatherer wrote:
>> Hi
>>
>> I thought I'd better run this by the community before I embarrass
>> myself on Bugzilla.  It seems like a clear bug to me.  I'm running
>> Bioperl 1.5.0 on RedHat.
>>
>> For a test input:
>>
>>> test
>> ATGATGATGATGATGTGA
>>
>> the following code is fine.
>>
>> while((my $seqobj = $seq_in->next_seq()))
>> {
>>    print "\n".$seqobj->display_id;
>>    my $len  = $seqobj->length();
>>    print " length: $len";
>>    my $frame1_obj = $seqobj->translate();
>>    my $f1_prot = $frame1_obj->seq();
>>    print "\n$f1_prot";
>> }
>>
>> Output:
>>
>> test length: 18
>> MMMMM*
>>
>> But if I want to change the frame as specified in the BioPerl
>> tutorial, by using:
>>
>> my $frame1_obj = $seqobj->translate(frame => 1); # which should now
>> give frame 2, I get:
>>
>> test length: 18
>> MMMMM-frame
>>
>> The frame is unchanged and the text "-frame" is tacked on the end of
>> the output.  The same occurs with translate(frame => 2).
>>
>> Any ideas?  Can something as fundamental as translate() really be
>> bugged?  or am I guilty of some particularly heinous syntax error?
>>
>> Cheers
>> Derek
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From dr.hogart at gmail.com  Tue Apr 29 11:57:51 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Tue, 29 Apr 2008 19:57:51 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
Message-ID: <op.uadqmpg8avnppr@hogart.img.ras.ru>

Hi all!

I am trying to run a TCoffee alignment through the
Bio::Tools::Run::Alignment::TCoffee wrapper, as a subroutine within a script.
The subroutine itself works fine, but it is not the only one: there are a
lot of others in the script. The problem is that when the tcoffee
subroutine finishes (nb! successfully), execution of the rest of the script
is also interrupted. It seems that the tcoffee program itself interrupts
the Perl script. Is there a way around this problem?

-- 


From darin.london at duke.edu  Tue Apr 29 12:49:53 2008
From: darin.london at duke.edu (darin.london at duke.edu)
Date: Tue, 29 Apr 2008 12:49:53 -0400
Subject: [Bioperl-l] BOSC 2008 Announcement and Call For Submissions
Message-ID: <200804291650.m3TGnr0H020814@tenero.duhs.duke.edu>


BOSC 2008 Call for Abstracts Reminder

The 9th annual Bioinformatics Open Source Conference (BOSC 2008) will take place in Toronto, Ontario, Canada, as one of several Special Interest Group (SIG) meetings occurring in conjunction with the 16th annual Intelligent Systems for Molecular Biology Conference (ISMB 2008).

This is a reminder to submit your proposals for talks to the BOSC submission system before May 11.

Submission Process:
All abstracts must be submitted through our Open Conference Systems site (http://events.open-bio.org/BOSC2008/openconf.php).
The form will ask for a short abstract text to be pasted into it, and a full paper.  The short abstract text should be a summary, while the longer abstract should provide more details, including the open-source license requirement details.
Full-length abstracts are limited to one page with one inch (2.5 cm) margins on the top, sides, and bottom.  The full-length abstract should include the title, authors, and affiliations.  We prefer your abstract to be in PDF format, although plain t

Important Dates:
May 11: Abstract submission deadline.
June 2: Notification of accepted talks.
June 4: Early registration discount cut-off.
July 18-19: BOSC 2008!

We hope to see you at BOSC 2008!

Kam Dahlquist and Darin London
BOSC 2008 Co-organizers

			 
From bix at sendu.me.uk  Tue Apr 29 12:54:41 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 29 Apr 2008 17:54:41 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uadqmpg8avnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
Message-ID: <481752D1.7010904@sendu.me.uk>

sergei ryazansky wrote:
> I am trying to perform TCoffe aligment by 
> Bio::Tools::Run::Alignment::TCoffee wrapper as subroutine into the 
> script. This subroutine works fine, but it is not single subroutine - 
> there are a lot of other ones in the script. The problem is when 
> compilation of script finish execution (nb! successful execution) of 
> tcoffee subroutine the compiliation of the end of the script also 
> interrupted. It seems that the tcoffee program itself induce 
> interraption of perl compilation. Is it possible to pass this problem?

You'll have to supply us with a minimal version of the script and the 
complete error message.


From dr.hogart at gmail.com  Wed Apr 30 07:24:35 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 15:24:35 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
Message-ID: <op.uae8m9tzavnppr@hogart.img.ras.ru>

On Tue, 29 Apr 2008 19:57:51 +0400, sergei ryazansky <dr.hogart at gmail.com>  
wrote:

> Hi all!
>
> I am trying to perform TCoffe aligment by  
> Bio::Tools::Run::Alignment::TCoffee wrapper as subroutine into the  
> script. This subroutine works fine, but it is not single subroutine -  
> there are a lot of other ones in the script. The problem is when  
> compilation of script finish execution (nb! successful execution) of  
> tcoffee subroutine the compiliation of the end of the script also  
> interrupted. It seems that the tcoffee program itself induce  
> interraption of perl compilation. Is it possible to pass this problem?
>


My subroutine is following:

sub align {
	my $file=shift @_;
	my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' => 'fasta',
		'outfile' => 'temp_align.out');
	my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
	my $aln=$factory->align ($file);
	open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
	return @temp_file;
}

This subroutine is called by the following command:

my @align_fa = align($inputfile_align);

After this subroutine has executed successfully (with the corresponding
messages shown in the terminal window), the rest of the script terminates
without any error message.

-- 


From bix at sendu.me.uk  Wed Apr 30 08:47:17 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 30 Apr 2008 13:47:17 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uae8m9tzavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
Message-ID: <48186A55.4030406@sendu.me.uk>

sergei ryazansky wrote:
> My subroutine is as follows:
> 
> sub align {
>     my $file=shift @_;
>     my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' => 
> 'fasta', 'outfile' => 'temp_align.out');
>     my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
>     my $aln=$factory->align ($file);
>     open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
>     return @temp_file;
> }
> 
> This subroutine is called by the following command:
> 
> my @align_fa = align($inputfile_align);
> 
> After this subroutine has executed successfully (with the corresponding
> messages shown in the terminal window), the rest of the script
> terminates without any error message.

The problem lies somewhere within the rest of your script, so we have to 
see it if you want help.

Why are you using Bio::Tools::Run::Alignment::TCoffee at all if you 
don't make use of the resulting alignment object? A system call might 
make more sense given what you're doing. The beauty of 
Bio::Tools::Run::Alignment::TCoffee is that you don't have to parse the 
result file (temp_align.out) yourself.


From dr.hogart at gmail.com  Wed Apr 30 09:36:58 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 17:36:58 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
Message-ID: <op.uaferwytavnppr@hogart.img.ras.ru>

On Wed, 30 Apr 2008 16:47:17 +0400, Sendu Bala <bix at sendu.me.uk> wrote:

> The problem lies somewhere within the rest of your script, so we have to  
> see it if you want help.
>
> Why are you using Bio::Tools::Run::Alignment::TCoffee at all if you  
> don't make use of the resulting alignment object? A system call might  
> make more sense given what you're doing. The beauty of  
> Bio::Tools::Run::Alignment::TCoffee is that you don't have to parse the  
> result file (temp_align.out) yourself.

The rest of the script is, IMHO, OK, because without this sub it works fine.
Maybe the problem lies in TCoffee itself?

One of the script's features is to estimate the number of nt changes at
each position across a set of similar sequences, compared with the
consensus sequence. To do this it is necessary to obtain a multiple
alignment: the result of the TCoffee alignment goes to another
subroutine that estimates the level of changes. Of course, I don't think
this is the best approach; most probably there are a lot of better ways
to do it. But for my present purposes it is OK.
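
As a rough illustration of the per-position estimate described above, here
is a minimal sketch in plain Perl. It assumes the aligned sequences are
already available as equal-length strings. The sub name changes_per_column,
the simple majority-vote consensus and the sample sequences are illustrative
assumptions, not taken from the original script. Gaps are counted as changes
here, which may or may not be what is wanted.

```perl
use strict;
use warnings;

# For each alignment column, count how many sequences differ from the
# most common residue in that column (a simple majority consensus).
# Assumption: all sequences are aligned, i.e. have the same length.
sub changes_per_column {
    my @seqs = @_;
    my $len  = length $seqs[0];
    for my $s (@seqs) {
        die "sequences differ in length\n" unless length($s) == $len;
    }
    my (@consensus, @changes);
    for my $i (0 .. $len - 1) {
        my %count;
        $count{ uc(substr($_, $i, 1)) }++ for @seqs;
        # Most frequent residue; ties are broken alphabetically so the
        # result is deterministic.
        my ($top) = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
        push @consensus, $top;
        push @changes, scalar(@seqs) - $count{$top};
    }
    return (join('', @consensus), \@changes);
}

# Made-up example data, for illustration only.
my @aligned = qw(ACGTAC ACGTTC ACCTAC A-GTAC);
my ($consensus, $changes) = changes_per_column(@aligned);
print "consensus: $consensus\n";
print "changes:   @$changes\n";
```

With the sample data this reports the consensus ACGTAC and one change each
in columns 2, 3 and 5 (counting from 1).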

-- 


From avilella at gmail.com  Wed Apr 30 10:16:56 2008
From: avilella at gmail.com (Albert Vilella)
Date: Wed, 30 Apr 2008 15:16:56 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uaferwytavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru> <48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
Message-ID: <358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>

Hi Sergei,

Can you try to isolate this call with a simpler example to see if it still
fails? When you say that the problems are in the compilation, do you mean
that the interpreter won't even compile or that it fails during execution?
Have you checked that you have all the dependencies right?

Cheers,

    Albert.



From bix at sendu.me.uk  Wed Apr 30 10:22:01 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 30 Apr 2008 15:22:01 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uaferwytavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>	<op.uae8m9tzavnppr@hogart.img.ras.ru>	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
Message-ID: <48188089.8000300@sendu.me.uk>

sergei ryazansky wrote:
> The rest of the script is, IMHO, OK, because without this sub it works
> fine. Maybe the problem lies in TCoffee itself?

I've run your subroutine in a simple script of my own and it doesn't 
cause script termination. Again, the problem lies elsewhere in your 
script. Supply it or it is impossible for anyone to help you.


From Sebastien.Moretti at unil.ch  Wed Apr 30 10:06:28 2008
From: Sebastien.Moretti at unil.ch (Sebastien MORETTI)
Date: Wed, 30 Apr 2008 16:06:28 +0200
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uaferwytavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>	<op.uae8m9tzavnppr@hogart.img.ras.ru>	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
Message-ID: <48187CE4.8030606@unil.ch>

> The rest of the script is, IMHO, OK, because without this sub it works
> fine. Maybe the problem lies in TCoffee itself?
>
> One of the script's features is to estimate the number of nt changes
> at each position across a set of similar sequences, compared with the
> consensus sequence. To do this it is necessary to obtain a multiple
> alignment: the result of the TCoffee alignment goes to another
> subroutine that estimates the level of changes. Of course, I don't think
> this is the best approach; most probably there are a lot of better
> ways to do it. But for my present purposes it is OK.

Have you tried running the tcoffee command that bioperl calls directly
from the command line?
That would show whether the problem lies with tcoffee itself or with the
tcoffee release that bioperl has to use.
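
A small sketch of how such a check could be done from Perl itself, by
running an external command and decoding the status it leaves in $?. The
helper name describe_status is made up for this example, and the command to
run is whatever is passed on the command line; no particular tcoffee
invocation is assumed.

```perl
use strict;
use warnings;

# Turn a wait status (the value Perl leaves in $? after system())
# into a short human-readable description.
sub describe_status {
    my ($status) = @_;
    return 'failed to start'                     if $status == -1;
    return 'killed by signal ' . ($status & 127) if $status & 127;
    return 'exit code ' . ($status >> 8);
}

# Assumption: the full command to test is given as arguments to this script.
if (@ARGV) {
    system(@ARGV);
    print "@ARGV: ", describe_status($?), "\n";
}
```

A non-zero exit code or a signal reported here would point at the external
program rather than at the surrounding Perl code.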

-- 
Sébastien Moretti


From dr.hogart at gmail.com  Wed Apr 30 10:54:59 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 18:54:59 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
Message-ID: <op.uafidxitavnppr@hogart.img.ras.ru>

Hi Albert,

The isolated call executes without any problem, so the code is
absolutely correct. The problem arises when this sub is executed within the
whole script: after the TCoffee alignment completes successfully,
execution of the rest of the script is terminated. The whole code is very
big (~500 lines), so for simplicity let's imagine the scheme of the script
as follows:
sub1;
sub2;
sub3;
sub align;  # TCoffee alignment;
sub4;
sub5;

Each sub (subroutine) is independent of the other subs. The order of
script execution is 1,2,3,align,4,5. But after align has executed, the
remaining subs (4 and 5) never run. The script without
sub align {} successfully executes sub 4 and sub 5. So, I mean that the
interpreter won't compile sub 4 and 5 if sub align is placed before them.





From dr.hogart at gmail.com  Wed Apr 30 11:14:09 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 19:14:09 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru> <48187CE4.8030606@unil.ch>
Message-ID: <op.uafi7ytravnppr@hogart.img.ras.ru>

No, I haven't tried that.
To tell the truth, I have run into a problem like this before. I simply
wanted to align several sets of sequences with the Bioperl TCoffee package.
The script was supposed to pass the sets to the TCoffee wrapper one after
another, but after the first set of sequences was aligned, the alignment
of the remaining sets was terminated. So it was necessary to use
another "super_script" that called the first script with different
arguments, one call per set.
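
For illustration, a "super_script" of this kind might look roughly like the
sketch below. The sub name run_each_set is hypothetical, and the worker
script and the set names are simply taken from the command line; this is not
the original script.

```perl
use strict;
use warnings;

# Run the worker script once per input set, each time in a fresh Perl
# process, so that a worker that stops early cannot affect later sets.
sub run_each_set {
    my ($script, @sets) = @_;
    my %status;
    for my $set (@sets) {
        # The list form of system() bypasses the shell; $^X is the
        # perl binary that is running this driver.
        system($^X, $script, $set);
        $status{$set} = $? == -1 ? -1 : $? >> 8;    # exit code of the child
    }
    return %status;
}

# Usage (placeholder names): perl super_script.pl worker.pl set1 set2 ...
my ($script, @sets) = @ARGV;
if (defined $script) {
    my %status = run_each_set($script, @sets);
    print "$_: exit code $status{$_}\n" for sort keys %status;
}
```

This works around the symptom only; it does not explain why the first
script stops after one alignment.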


> Have you tried running the tcoffee command that bioperl calls directly
> from the command line?


-- 


From bix at sendu.me.uk  Wed Apr 30 11:28:50 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 30 Apr 2008 16:28:50 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uafidxitavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>	<op.uae8m9tzavnppr@hogart.img.ras.ru>	<48186A55.4030406@sendu.me.uk>	<op.uaferwytavnppr@hogart.img.ras.ru>	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru>
Message-ID: <48189032.20102@sendu.me.uk>

sergei ryazansky wrote:
> Hi Albert,
> 
> The isolated call executes without any problem, so the code is
> absolutely correct. The problem arises when this sub is executed within
> the whole script: after the TCoffee alignment completes successfully,
> execution of the rest of the script is terminated. The whole code is very
> big (~500 lines), so for simplicity let's imagine the scheme of the
> script as follows:
> sub1;
> sub2;
> sub3;
> sub align;  # TCoffee alignment;
> sub4;
> sub5;
>
> Each sub (subroutine) is independent of the other subs. The order of
> script execution is 1,2,3,align,4,5. But after align has executed, the
> remaining subs (4 and 5) never run. The script without sub align {}
> successfully executes sub 4 and sub 5. So, I mean that the interpreter
> won't compile sub 4 and 5 if sub align is placed before them.

This has nothing to do with interpreter compilation, which is successful 
if the script runs at all.

What do you do with the output of &align? The thing you are doing with 
that output is most likely the cause of your script terminating, which 
is why &sub4 and &sub5 run when you don't run &align (have no output 
that causes the problem).

If you're not willing to show us your script, here are some simple 
debugging steps you can do yourself:

# don't do anything with the output of align() - does &sub4 still run?

# add some print statements after you call align(), and then after every 
further block of code in your script to see exactly where the script 
terminates

# reduce your script down to a minimal script that shows the problem 
(with the help of the previous step) and show us that
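
The print-statement steps above can be sketched roughly as follows. This is
only an illustration: the checkpoint() helper, the stand-in align() body and
the file name input.fa are hypothetical. Note that eval catches die() but
not exit(); the END block still runs after an exit(), so its message shows
whether the script reached an orderly end at all.

```perl
use strict;
use warnings;

$| = 1;    # unbuffered STDOUT, so output is not lost if the process stops

# Hypothetical stand-in for the real align() sub from the thread.
sub align {
    my ($file) = @_;
    return ("line 1 of $file\n", "line 2 of $file\n");
}

# Run a code block inside eval and report whether it finished or died.
sub checkpoint {
    my ($label, $code) = @_;
    print STDERR "[checkpoint] entering $label\n";
    my @result = eval { $code->() };
    if (my $err = $@) {
        print STDERR "[checkpoint] $label died: $err";
        return;
    }
    print STDERR "[checkpoint] leaving $label\n";
    return @result;
}

my @align_fa = checkpoint('align', sub { align('input.fa') });
checkpoint('sub4', sub { print "sub4 ran with ", scalar(@align_fa), " lines\n" });

# Runs on normal completion and after exit(), but not after a fatal signal.
END { print STDERR "[checkpoint] script exiting with status $?\n" }
```

If "leaving align" is printed but "entering sub4" never appears, the script
is stopping in the code between the two calls.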


From dr.hogart at gmail.com  Wed Apr 30 11:42:41 2008
From: dr.hogart at gmail.com (Sergei Ryazansky)
Date: Wed, 30 Apr 2008 19:42:41 +0400
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafkhojw9ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
Message-ID: <op.uafklfmd9ju7si@hogart.img.ras.ru>


------- Forwarded message -------
From: "Sergei Ryazansky" <dr.hogart at gmail.com>
To: "Sendu Bala" <bix at sendu.me.uk>
Cc:
Subject: Re: [Bioperl-l] alignment by TCoffee as a subroutine
Date: Wed, 30 Apr 2008 19:40:26 +0400

> What do you do with the output of &align? The thing you are doing with  
> that output is most likely the cause of your script terminating, which  
> is why &sub4 and &sub5 run when you don't run &align (have no output  
> that causes the problem).

Please see my answer to Sebastien Moretti: it describes another, similar
problem. The only thing I did with the output there was print it to a
file. Nevertheless the problem was the same.

> # don't do anything with the output of align() - does &sub4 still run?

Please see above.

> # add some print statements after you call align(), and then after every  
> further block of code in your script to see exactly where the script  
> terminates
> # reduce your script down to a minimal script that shows the problem  
> (with the help of the previous step) and show us that

All tests of the individual blocks were performed earlier. The results are OK.


From cjfields at uiuc.edu  Wed Apr 30 12:25:06 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 30 Apr 2008 11:25:06 -0500
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafklfmd9ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
Message-ID: <5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>

Sergei,

I agree with Sendu; we can't diagnose this unless we either have the
entire script or a minimal version of it demonstrating the bug.

The best way to handle this is to file a bug report, attaching  
relevant data using the 'Create a new attachment' link (including  
either the full script or a shortened one which demonstrates the bug).  
Otherwise we're just shooting in the dark trying to diagnose the  
problem.

http://bugzilla.open-bio.org/

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From dr.hogart at gmail.com  Wed Apr 30 12:40:19 2008
From: dr.hogart at gmail.com (Sergei Ryazansky)
Date: Wed, 30 Apr 2008 20:40:19 +0400
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
Message-ID: <op.uafm9hl79ju7si@hogart.img.ras.ru>

On Wed, 30 Apr 2008 20:25:06 +0400, Chris Fields <cjfields at uiuc.edu> wrote:

Chris, I have already sent the file to Sendu, and I am also attaching it
here. I have removed the really unnecessary parts from it.

> Sergei,
>
> I agree with Sendu; we can't diagnose this unless we either have the
> entire script or a minimal version of it demonstrating the bug.
>
> The best way to handle this is to file a bug report, attaching relevant  
> data using the 'Create a new attachment' link (including either the full  
> script or a shortened one which demonstrates the bug). Otherwise we're  
> just shooting in the dark trying to diagnose the problem.
>
> http://bugzilla.open-bio.org/
>
> chris
-------------- next part --------------
A non-text attachment was scrubbed...
Name: script.pl
Type: application/octet-stream
Size: 6870 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20080430/6aef0fde/attachment-0003.obj>

From cjfields at uiuc.edu  Wed Apr 30 13:02:19 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 30 Apr 2008 12:02:19 -0500
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafm9hl79ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
	<op.uafm9hl79ju7si@hogart.img.ras.ru>
Message-ID: <EBC881E4-8F1A-4396-8EC9-1FB17681F5D2@uiuc.edu>

Hmm, maybe you were confused?  From my last email:

"The best way to handle this is to file a bug report, attaching  
relevant data using the 'Create a new attachment' link (including  
either the full script or a shortened one which demonstrates the bug).  
Otherwise we're just shooting in the dark trying to diagnose the  
problem."

http://bugzilla.open-bio.org/

Anyone can work on fixing the issue there (so it'll probably get fixed  
faster).  The devs can also track progress on the problem via the dev  
mail list (bioperl-guts).  Diagnosing the bug may also reveal issues  
not just with Bio::Tools::Run::Alignment::TCoffee but also with other  
related modules.

If needed I can post it to bugzilla, but it helps to submit the bug
yourself (so you can receive posts on its progress).

chris



From dr.hogart at gmail.com  Wed Apr 30 13:39:35 2008
From: dr.hogart at gmail.com (Sergei Ryazansky)
Date: Wed, 30 Apr 2008 21:39:35 +0400
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafop6079ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
	<op.uafm9hl79ju7si@hogart.img.ras.ru>
	<EBC881E4-8F1A-4396-8EC9-1FB17681F5D2@uiuc.edu>
	<op.uafop6079ju7si@hogart.img.ras.ru>
Message-ID: <op.uafpz9n79ju7si@hogart.img.ras.ru>

On Wed, 30 Apr 2008 21:11:56 +0400, Sergei Ryazansky <dr.hogart at gmail.com>  
wrote:

> Oh, sorry, you are right - I read your message too quickly. I will do it a
> little later.


From cjfields at uiuc.edu  Wed Apr 30 14:29:28 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 30 Apr 2008 13:29:28 -0500
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafpz9n79ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
	<op.uafm9hl79ju7si@hogart.img.ras.ru>
	<EBC881E4-8F1A-4396-8EC9-1FB17681F5D2@uiuc.edu>
	<op.uafop6079ju7si@hogart.img.ras.ru>
	<op.uafpz9n79ju7si@hogart.img.ras.ru>
Message-ID: <39A139E4-6783-41E6-8EE9-1FE60CB57577@uiuc.edu>

Sorry, didn't catch that...

chris

On Apr 30, 2008, at 12:39 PM, Sergei Ryazansky wrote:

> On Wed, 30 Apr 2008 21:11:56 +0400, Sergei Ryazansky <dr.hogart at gmail.com 
> > wrote:
>
>> Oh, sorry, you right - I too fast read you message. I do it slight  
>> later.
>>
>>> Hmm, maybe you were confused?  From my last email:
>>>
>>> "The best way to handle this is to file a bug report, attaching  
>>> relevant data using the 'Create a new attachment' link (including  
>>> either the full script or a shortened one which demonstrates the  
>>> bug). Otherwise we're just shooting in the dark trying to diagnose  
>>> the problem."
>>>
>>> http://bugzilla.open-bio.org/
>>>
>>> Anyone can work on fixing the issue there (so it'll probably get  
>>> fixed faster).  The devs can also track progress on the problem  
>>> via the dev mail list (bioperl-guts).  Diagnosing the bug may also  
>>> reveal issues not just with Bio::Tools::Run::Alignment::TCoffee  
>>> but also with other related modules.
>>>
>>> If needed I can post it to bugzilla, but it helps to submit the  
>>> bug yourself (so you can receive posts on its progress).
>>>
>>> chris
>>>
>>> On Apr 30, 2008, at 11:40 AM, Sergei Ryazansky wrote:
>>>
>>>> On Wed, 30 Apr 2008 20:25:06 +0400, Chris Fields  
>>>> <cjfields at uiuc.edu> wrote:
>>>>
>>>> Chris, I have already sent the file to Sendu and I am also attaching  
>>>> it here. I have removed the really unnecessary parts from it.
>>>>
>>>>> Sergei,
>>>>>
>>>>> I agree with Sendu; we can't diagnose this unless we either have  
>>>>> the entire script or a minimal version of it demonstrating the  
>>>>> bug.
>>>>>
>>>>> The best way to handle this is to file a bug report, attaching  
>>>>> relevant data using the 'Create a new attachment' link  
>>>>> (including either the full script or a shortened one which  
>>>>> demonstrates the bug). Otherwise we're just shooting in the dark  
>>>>> trying to diagnose the problem.
>>>>>
>>>>> http://bugzilla.open-bio.org/
>>>>>
>>>>> chris
>>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Bank.Beszteri at awi.de  Tue Apr  1 12:31:49 2008
From: Bank.Beszteri at awi.de (Bánk Beszteri)
Date: Tue, 01 Apr 2008 14:31:49 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
Message-ID: <47F22B35.1030502@awi.de>

Dear list,

we have recently started to try to find a solution for indexing large 
sequence databases / flat files for a java project, and because we ran 
into problems using biojava, and because both the OBDA and BioSQL ways 
seem to be compatible across bio~ projects, we also started to 
experiment with bioperl. It looks like this should work fine, but we had 
a couple of problems here, too. Perhaps some of you can give me a hint 
as to what we are doing wrong!

The first thing we tried was to use Bio::DB::Flat for indexing a TrEMBL 
flat file (~ 12 GB); but it seems we haven't got a machine with enough 
memory to be able to handle this. (Perhaps you would be using the "bdb" 
style index in such a case in bioperl, but this apparently doesn't work 
with biojava, so we had to stick with "flat"). So next we started to 
test BioSQL, by trying to load just Swissprot in a MySQL DB first, like:

load_seqdatabase.pl --host mysql.awi.de --dbname biosql2 --dbuser xyz 
--dbpass abc --driver mysql --namespace uniprot_sprot --format swiss 
uniprot_sprot.dat

Here we get an error message

###########################################

Loading /biodb/spinkern/uniprot_sprot.dat ...
Could not store Q6DAH5:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: The supplied lineage does not start near 'Erwinia carotovora subsp. 
atroseptica' (I was supplied 'Erwinia carotovora subsp. | Pectobacterium 
| Enterobacteriaceae | Enterobacteriales | Gammaproteobacteria | 
Proteobacteria | Bacteria')
STACK: Error::throw
STACK: Bio::Root::Root::throw 
/biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Root/Root.pm:359
STACK: Bio::Species::classification 
/biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Species.pm:174
STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:552 

STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/SpeciesAdaptor.pm:281
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1305 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:973 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:852 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182 

STACK: Bio::DB::Persistent::PersistentObject::create 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:244 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169 

STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251 

STACK: Bio::DB::Persistent::PersistentObject::store 
/biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:271 

STACK: load_seqdatabase.pl:622
-----------------------------------------------------------

at load_seqdatabase.pl line 635

############################################

or similar, depending on whether we use a pre-loaded NCBI taxonomy or 
not, and which Swissprot release we are trying to load. It often seems 
to come from something like the "subsp." here, or another special 
addition to the species line; but alternative genus names and other 
curious things also seem to appear. It looks like Species.pm tries to 
validate the species name against the lineage info already there in the 
BioSQL DB, and in several cases, it finds inconsistencies. If we start 
with the NCBI taxonomy already loaded in the database, the first error 
comes much earlier.

I found a thread on the same problem from ~ two years ago 
(http://thread.gmane.org/gmane.comp.lang.perl.bio.general/13766/focus=13788), 
where the solution recommended was to update bioperl, so I was quite 
surprised to find the problem with the version you can see above 
(1.5.2_102 bioperl core, 1.5.2_100 bioperl_db). Can someone give me any 
hints as to what is going wrong here?

The only workaround we have found so far was to comment out line 174 in 
Species.pm:

$self->throw("The supplied lineage does not start near '$name' (I was 
supplied '".join(" | ", @vals)."')");

After doing so, load_seqdatabase.pl runs for several hours (until it 
eventually crashes; I haven't found out why yet), but proceeds really 
slowly. I also found some info on this for Pg and Oracle in the mailing 
list, but does anyone have approximate numbers for MySQL: how long should 
a first Swissprot load take?

Would be grateful to hear about your ideas / experiences on these issues!

Bank Beszteri


Bioinformatics / Scientific Computing
Alfred Wegener Institute for Polar and Marine Research
Am Handelshafen 12.
27570 Bremerhaven
Germany


From cjfields at uiuc.edu  Wed Apr  2 00:45:28 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 1 Apr 2008 19:45:28 -0500
Subject: [Bioperl-l] quick update on bioperl nightly builds
Message-ID: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>

I'm simplifying the nightly build archive names (removing svn revision  
# and date) in case anyone needs to update bioperl-live/run/db/network  
on a regular basis (read: GBrowse installations).  When I have time  
I'll start working on automated builds, which will require some extra  
work with Module::Build and Build.PL.

chris


From hiekeen at gmail.com  Wed Apr  2 02:14:07 2008
From: hiekeen at gmail.com (Jinyan Huang)
Date: Wed, 2 Apr 2008 10:14:07 +0800
Subject: [Bioperl-l] How to make a network graphic using my genes in
	pathways?
Message-ID: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>

I have 20 pathways that contain my genes of interest. Some of these
genes occur in more than one pathway. How can I draw a network graphic
from these genes, that is, connect the pathways through the genes they
share? What kind of software can I use?

Thank you very much in advance.

-- 
Best regards,
Jinyan Huang (ekeen)
School of Life Sciences and Technology, 1302 Room
Tongji University
Siping Road 1239, Shanghai 200092
P.R. China
Tel :0086-21-65981041
Msn: hiekeen at hotmail.com
eMail: hiekeen at gmail.com


From hlapp at gmx.net  Wed Apr  2 02:30:06 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 1 Apr 2008 22:30:06 -0400
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47F22B35.1030502@awi.de>
References: <47F22B35.1030502@awi.de>
Message-ID: <CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>


On Apr 1, 2008, at 8:31 AM, Bánk Beszteri wrote:
> [...] So next we started to test BioSQL, by trying to load just  
> Swissprot in a MySQL DB first, like:
>
> load_seqdatabase.pl --host mysql.awi.de --dbname biosql2 --dbuser  
> xyz --dbpass abc --driver mysql --namespace uniprot_sprot --format  
> swiss uniprot_sprot.dat
>
> Here we get an error message
>
> ###########################################
>
> Loading /biodb/spinkern/uniprot_sprot.dat ...
> Could not store Q6DAH5:
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: The supplied lineage does not start near 'Erwinia carotovora  
> subsp. atroseptica' (I was supplied 'Erwinia carotovora subsp. |  
> Pectobacterium | Enterobacteriaceae | Enterobacteriales |  
> Gammaproteobacteria | Proteobacteria | Bacteria')
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /biodb/spinkern/bioperl-1.5/ 
> bioperl-1.5.2_102/Bio/Root/Root.pm:359
> STACK: Bio::Species::classification /biodb/spinkern/bioperl-1.5/ 
> bioperl-1.5.2_102/Bio/Species.pm:174
> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm: 
> 552
> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/SpeciesAdaptor.pm:281
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object / 
> biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:1305
> STACK:  
> Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:973
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key / 
> biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:852
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:182
> STACK: Bio::DB::Persistent::PersistentObject::create /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm: 
> 244
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:169
> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /biodb/ 
> spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/ 
> BasePersistenceAdaptor.pm:251
> STACK: Bio::DB::Persistent::PersistentObject::store /biodb/spinkern/ 
> bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:271
> STACK: load_seqdatabase.pl:622
> -----------------------------------------------------------
>
> at load_seqdatabase.pl line 635
>
> ############################################
>
> or similar, depending on whether we use a pre-loaded ncbi taxonomy  
> or not

I recommend always using a pre-loaded NCBI taxonomy unless you know  
there are only a few organisms that are straightforward (for the  
parser, that is).

> , and which Swissprot release we are trying to load. It often seems  
> to come from something like the "subsp." here, or another special  
> addition to the species line; but alternative genus names and other  
> curious things also seem to appear. It looks like Species.pm tries to  
> validate the species name against the lineage info already there in  
> the BioSQL DB, and in several cases, it finds inconsistencies.

It actually happens upon a successful lookup when the species object  
is populated from the database.

> [...]
> The only workaround we have found so far was to comment out line  
> 174 in Species.pm:
>
> $self->throw("The supplied lineage does not start near '$name' (I  
> was supplied '".join(" | ", @vals)."')");

That should be OK if you work with a pre-loaded taxonomy. It's sort  
of a sanity check that should catch a parser having messed up a  
species. If you use a pre-loaded NCBI taxonomy the results of the  
species parsing don't matter in all details so long as the NCBI  
taxonID is parsed out correctly, and then found in the database.

Note that this is actually a warn() in the main trunk version of  
BioPerl, so you might want to upgrade to that (or change throw() to  
warn() in your version). You still get the records flagged with that,  
but it isn't an exception.
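
A minimal sketch of the throw()-to-warn() change described above, assuming the check sits on a single line of Bio/Species.pm as it does in the 1.5.2_102 stack trace. The function name and the path in the usage comment are hypothetical; sed keeps a .bak copy of the original file.

```shell
# Hypothetical helper: downgrade the lineage sanity check from an
# exception to a warning in the given copy of Bio/Species.pm.
downgrade_lineage_check() {
  # Only the line carrying this exact message is touched; the original
  # file is kept next to it with a .bak suffix.
  sed -i.bak \
    '/The supplied lineage does not start near/s/\$self->throw(/$self->warn(/' \
    "$1"
}

# Usage (the path is an assumption, adjust it to your installation):
#   downgrade_lineage_check /path/to/bioperl/Bio/Species.pm
```

Matching on the message text rather than on line number 174 means the edit still applies if the line has moved in another BioPerl version, and it does nothing if the message is absent.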

>
> After doing so, load_seqdatabase.pl runs for several hours (until  
> it eventually crashes; I haven't found out why yet), but proceeds  
> really slowly.

It should certainly *not* crash. Note also that you can supply --safe  
on the command line, in which case the script will continue with the  
next record if one fails to load for whatever reason.

You will want to adjust the width constraint of dbxref.accession, for  
example to 128 chars. This will also be fixed for BioSQL 1.0.1.
See http://bugzilla.open-bio.org/show_bug.cgi?id=2474
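
For MySQL, the width change could look like the sketch below. The column definition (VARCHAR ... BINARY NOT NULL) is an assumption about the installed BioSQL schema, so compare it with the output of SHOW CREATE TABLE dbxref before applying anything. The helper name is hypothetical; host, user and database name are the ones from the command line quoted above.

```shell
# Hypothetical helper: prints the DDL so it can be reviewed before use.
# Assumption: dbxref.accession is declared VARCHAR(n) BINARY NOT NULL.
widen_dbxref_accession_sql() {
  cat <<'SQL'
ALTER TABLE dbxref MODIFY accession VARCHAR(128) BINARY NOT NULL;
SQL
}

# Review the statement, then pipe it to the mysql client, e.g.:
#   widen_dbxref_accession_sql | mysql -h mysql.awi.de -u xyz -p biosql2
```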


> I also found some info on this for Pg and Oracle in the mailing  
> list, but does anyone have approximate numbers for MySQL: how long  
> should a first Swissprot load take?

Possibly around 20 hours according to Erik Rijkers:
See http://lists.open-bio.org/pipermail/bioperl-l/2008-March/027427.html

You can use the --logchunks N option to have it print out performance  
statistics every N records.
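
Put together, the invocation from the original post with both options added might look like the sketch below. The chunk size of 1000 is an arbitrary choice, and the wrapper function (a hypothetical name) only prints the command so it can be reviewed first.

```shell
# Hypothetical wrapper: prints the load command instead of running it.
#   --safe            continue with the next record if one fails to load
#   --logchunks 1000  print performance statistics every 1000 records
print_load_command() {
  echo load_seqdatabase.pl \
    --host mysql.awi.de --dbname biosql2 --dbuser xyz --dbpass abc \
    --driver mysql --namespace uniprot_sprot --format swiss \
    --safe --logchunks 1000 \
    uniprot_sprot.dat
}

print_load_command
```

To actually start the load, run the printed line, or drop the echo from the function.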

Hope this helps,

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Wed Apr  2 02:38:12 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Tue, 1 Apr 2008 22:38:12 -0400
Subject: [Bioperl-l] Very basic implementation of GenBank XML SeqIO
	module
In-Reply-To: <47F13C2C.4070909@umdnj.edu>
References: <47F13C2C.4070909@umdnj.edu>
Message-ID: <DBDEDED2-656B-4CFD-B603-C0868ED5DAD9@gmx.net>

Ryan - do you not have a committer account?

I do agree with Chris on the test. Modules w/o tests tend to become  
'pseudogenized.'

	-hilmar

On Mar 31, 2008, at 3:31 PM, Ryan Golhar wrote:
> I have a (very) basic SAX implementation of a SeqIO module to parse  
> GenBank XML records.  Right now, it only reads in basic information  
> regarding the sequence and the sequence itself.
>
> It does not yet parse the features table.  Should I submit it to be  
> included in bioperl or wait until I implement more for the features  
> table?  I'm not sure when I'll get around to it though
>
> Ryan

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cain.cshl at gmail.com  Wed Apr  2 03:12:04 2008
From: cain.cshl at gmail.com (Scott Cain)
Date: Tue, 01 Apr 2008 23:12:04 -0400
Subject: [Bioperl-l] quick update on bioperl nightly builds
In-Reply-To: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
Message-ID: <1207105924.6184.4.camel@frissell>

Hi Chris,

The tarball is currently (Apr 1) being built in a tmp directory, so that
the extracted tarball is ./tmp/bioperl-live/.  Is that intended?

Thanks,
Scott

On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
> I'm simplifying the nightly build archive names (removing svn revision  
> # and date) in case anyone needs to update bioperl-live/run/db/network  
> on a regular basis (read: GBrowse installations).  When I have time  
> I'll start working on automated builds, which will require some extra  
> work with Module::Build and Build.PL.
> 
> chris
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                         cain at cshl.edu
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From cjfields at uiuc.edu  Wed Apr  2 03:59:30 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 1 Apr 2008 22:59:30 -0500
Subject: [Bioperl-l] quick update on bioperl nightly builds
In-Reply-To: <1207105924.6184.4.camel@frissell>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
Message-ID: <D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>

Nope, that isn't intended.  I fixed it and reran it manually, so it  
should be fine now (note I didn't update the log file; the next cron  
run will catch that).

I may toy around with your recent passthrough flag addition to try  
getting automated PPM's up and running.

chris

On Apr 1, 2008, at 10:12 PM, Scott Cain wrote:

> Hi Chris,
>
> The tarball is currently (Apr 1) being built in a tmp directory, so  
> that
> the extracted tarball is ./tmp/bioperl-live/.  Is that intended?
>
> Thanks,
> Scott
>
> On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
>> I'm simplifying the nightly build archive names (removing svn  
>> revision
>> # and date) in case anyone needs to update bioperl-live/run/db/ 
>> network
>> on a regular basis (read: GBrowse installations).  When I have time
>> I'll start working on automated builds, which will require some extra
>> work with Module::Build and Build.PL.
>>
>> chris
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                         cain at cshl.edu
> GMOD Coordinator (http://www.gmod.org/)                      
> 216-392-3087
> Cold Spring Harbor Laboratory
>


From sdavis2 at mail.nih.gov  Wed Apr  2 11:33:38 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Wed, 2 Apr 2008 07:33:38 -0400
Subject: [Bioperl-l] How to make a network graphic using my genes in
	pathways?
In-Reply-To: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
Message-ID: <264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>

On Tue, Apr 1, 2008 at 10:14 PM, Jinyan Huang <hiekeen at gmail.com> wrote:
> I have 20 pathways that contain my genes of interest. Some of these
>  genes occur in more than one pathway. How can I draw a network graphic
>  from these genes, that is, connect the pathways through the genes they
>  share? What kind of software can I use?

R/Bioconductor has tools for working with graphs and pathways.
Cytoscape is another open-source graphical solution.  Ingenuity is, of
course, not free.  If you are looking at a perl solution, you can look
at the various graph modules and their integration with the Graphviz
libraries.
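
As one possible route to Graphviz, here is a small sketch that turns a tab-separated list of pathway/gene pairs into DOT input. The input layout (one pair per line, each pair listed once) and the function name are assumptions; only genes found in more than one pathway are drawn, so they act as the links between pathways.

```shell
# Hypothetical helper: read "pathway<TAB>gene" pairs from the file in $1
# and print an undirected Graphviz graph on standard output.
pathways_to_dot() {
  awk -F'\t' '
    BEGIN       { print "graph pathways {" }
    NR == FNR   { n[$2]++; next }                           # pass 1: count pathways per gene
    !seen[$1]++ { printf "  \"%s\" [shape=box];\n", $1 }    # pass 2: one box per pathway
    n[$2] > 1   { printf "  \"%s\" -- \"%s\";\n", $1, $2 }  # edge only for shared genes
    END         { print "}" }
  ' "$1" "$1"
}
```

The file is read twice (hence the repeated "$1"): once to count in how many pathways each gene occurs, once to print. The output can be rendered with Graphviz, for example: pathways_to_dot pairs.tsv | dot -Tpng -o network.png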

Sean


From cain.cshl at gmail.com  Wed Apr  2 12:28:22 2008
From: cain.cshl at gmail.com (Scott Cain)
Date: Wed, 02 Apr 2008 08:28:22 -0400
Subject: [Bioperl-l] [Gmod-gbrowse] quick update on bioperl
	nightly	builds
In-Reply-To: <D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
	<D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
Message-ID: <1207139302.6507.7.camel@frissell>

Hi Chris,

(trimmed out gbrowse mailing list since this is just bioperl business)

Speaking of the pass through stuff, Sendu mentioned that I stomped on
some changes to Build.PL that you and he did when I committed that
change, so it should be rolled back.  Is there a good (svn) way to do
that?  Or should I just copy the contents of the old (good) Build.PL
into a fresh file in my checkout and commit it?

Thanks,
Scott

On Tue, 2008-04-01 at 22:59 -0500, Chris Fields wrote:
> Nope, that isn't intended.  I fixed it and reran it manually, so it  
> should be fine now (note I didn't update the log file; the next cron  
> run will catch that).
> 
> I may toy around with your recent passthrough flag addition to try  
> getting automated PPM's up and running.
> 
> chris
> 
> On Apr 1, 2008, at 10:12 PM, Scott Cain wrote:
> 
> > Hi Chris,
> >
> > The tarball is currently (Apr 1) being built in a tmp directory, so  
> > that
> > the extracted tarball is ./tmp/bioperl-live/.  Is that intended?
> >
> > Thanks,
> > Scott
> >
> > On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
> >> I'm simplifying the nightly build archive names (removing svn  
> >> revision
> >> # and date) in case anyone needs to update bioperl-live/run/db/ 
> >> network
> >> on a regular basis (read: GBrowse installations).  When I have time
> >> I'll start working on automated builds, which will require some extra
> >> work with Module::Build and Build.PL.
> >>
> >> chris
> > -- 
> > ------------------------------------------------------------------------
> > Scott Cain, Ph. D.                                         cain at cshl.edu
> > GMOD Coordinator (http://www.gmod.org/)                      
> > 216-392-3087
> > Cold Spring Harbor Laboratory
> >
> 
> 
> _______________________________________________
> Gmod-gbrowse mailing list
> Gmod-gbrowse at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   cain.cshl at gmail.com
GMOD Coordinator (http://www.gmod.org/)                     216-392-3087
Cold Spring Harbor Laboratory


From robert.citek at gmail.com  Wed Apr  2 12:24:06 2008
From: robert.citek at gmail.com (Robert Citek)
Date: Wed, 2 Apr 2008 07:24:06 -0500
Subject: [Bioperl-l] module for pubchem queries
Message-ID: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>

Hello all,

I have a list of chemical compounds that have some kind of interaction
with proteins or genes.  The current list contains names or SMILES and
I would like to get the CID number for those compounds.  Currently,
I'm using perl to query the NCBI's eutils[1], which works great.  But
I was just curious to know if there was a bioperl module to do
something similar.  A quick google didn't turn up anything, so I
thought I'd ask.

[1] http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html

Regards,
- Robert


From David.Messina at sbc.su.se  Wed Apr  2 12:41:45 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 2 Apr 2008 14:41:45 +0200
Subject: [Bioperl-l] How to make a network graphic using my genes in
	pathways?
In-Reply-To: <264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
Message-ID: <628aabb70804020541v6cee4584ibd9935290ae7cc0a@mail.gmail.com>

I have no personal experience with it, but a colleague of mine suggested
VisANT <http://visant.bu.edu/>.


Dave


From cjfields at uiuc.edu  Wed Apr  2 15:03:32 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 2 Apr 2008 10:03:32 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] quick update on bioperl nightly
	builds
In-Reply-To: <1207139302.6507.7.camel@frissell>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
	<D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
	<1207139302.6507.7.camel@frissell>
Message-ID: <3B490712-3413-4662-99D7-7B115CECB6E1@uiuc.edu>

The changes I made were related to problems checking MySQL for  
Bio::DB::SeqFeature::Store tests when connectivity requires username/ 
password.  For some reason it tests DB connectivity up front, while  
Bio::DB::GFF assumes the DB setup is correct (no direct DB check) then  
runs tests assuming the setup is correct.

You can view the diffs for your commits here:

http://code.open-bio.org/svnweb/index.cgi/bioperl/diff/bioperl-live/trunk/ModuleBuildBioperl.pm?revs=14604&revs=14548

http://code.open-bio.org/svnweb/index.cgi/bioperl/diff/bioperl-live/trunk/Build.PL?revs=14604&revs=14565

I'll try working on merging them together today; it shouldn't be too  
hard (the changes were fairly minor in both Build.PL and  
Module::Build).  I'll test to make sure your changes stay in as well.   
Down the road I believe we need to rethink how we want the Build  
process to run using Module::Build as it's a bit convoluted, but it  
works for now.

chris

On Apr 2, 2008, at 7:28 AM, Scott Cain wrote:
> Hi Chris,
>
> (trimmed out gbrowse mailing list since this is just bioperl business)
>
> Speaking of the pass through stuff, Sendu mentioned that I stomped on
> some changes to Build.PL that you and he did when I committed that
> change, so it should be rolled back.  Is there a good (svn) way to do
> that?  Or should I just copy the contents of the old (good) Build.PL
> into a fresh file in my checkout and commit it?
>
> Thanks,
> Scott
>
> On Tue, 2008-04-01 at 22:59 -0500, Chris Fields wrote:
>> Nope, that isn't intended.  I fixed it and reran it manually, so it
>> should be fine now (note I didn't update the log file; the next cron
>> run will catch that).
>>
>> I may toy around with your recent passthrough flag addition to try
>> getting automated PPM's up and running.
>>
>> chris
>>
>> On Apr 1, 2008, at 10:12 PM, Scott Cain wrote:
>>
>>> Hi Chris,
>>>
>>> The tarball is currently (Apr 1) being built in a tmp directory, so
>>> that
>>> the extracted tarball is ./tmp/bioperl-live/.  Is that intended?
>>>
>>> Thanks,
>>> Scott
>>>
>>> On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
>>>> I'm simplifying the nightly build archive names (removing svn
>>>> revision
>>>> # and date) in case anyone needs to update bioperl-live/run/db/
>>>> network
>>>> on a regular basis (read: GBrowse installations).  When I have time
>>>> I'll start working on automated builds, which will require some  
>>>> extra
>>>> work with Module::Build and Build.PL.
>>>>
>>>> chris
>>> -- 
>>> ------------------------------------------------------------------------
>>> Scott Cain, Ph. D.                                         cain at cshl.edu
>>> GMOD Coordinator (http://www.gmod.org/)
>>> 216-392-3087
>>> Cold Spring Harbor Laboratory
>>>
>>
>>
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   cain.cshl at gmail.com
> GMOD Coordinator (http://www.gmod.org/)                      
> 216-392-3087
> Cold Spring Harbor Laboratory
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Wed Apr  2 15:54:05 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 2 Apr 2008 10:54:05 -0500
Subject: [Bioperl-l] [Gmod-gbrowse] quick update on bioperl nightly
	builds
In-Reply-To: <3B490712-3413-4662-99D7-7B115CECB6E1@uiuc.edu>
References: <02D78F8E-276F-46C1-91CD-F80BA6A09C14@uiuc.edu>
	<1207105924.6184.4.camel@frissell>
	<D8F81518-1389-46E9-811E-FA0559BBC27C@uiuc.edu>
	<1207139302.6507.7.camel@frissell>
	<3B490712-3413-4662-99D7-7B115CECB6E1@uiuc.edu>
Message-ID: <71375DA3-A751-4908-8000-D9ACAE39B19C@uiuc.edu>

Okay, committed them.  The accept passthrough still appears to work;  
let me know if anything pops up.

chris

On Apr 2, 2008, at 10:03 AM, Chris Fields wrote:

> ...
> I'll try working on merging them together today; it shouldn't be too  
> hard (the changes were fairly minor in both Build.PL and  
> Module::Build).  I'll test to make sure your changes stay in as  
> well.  Down the road I believe we need to rethink how we want the  
> Build process to run using Module::Build as it's a bit convoluted,  
> but it works for now.
>
> chris
>
> On Apr 2, 2008, at 7:28 AM, Scott Cain wrote:
>> Hi Chris,
>>
>> (trimmed out gbrowse mailing list since this is just bioperl  
>> business)
>>
>> Speaking of the pass through stuff, Sendu mentioned that I stomped on
>> some changes to Build.PL that you and he did when I committed that
>> change, so it should be rolled back.  Is there a good (svn) way to do
>> that?  Or should I just copy the contents of the old (good) Build.PL
>> into a fresh file in my checkout and commit it?
>>
>> Thanks,
>> Scott
>>
>> On Tue, 2008-04-01 at 22:59 -0500, Chris Fields wrote:
>>> Nope, that isn't intended.  I fixed it and reran it manually, so it
>>> should be fine now (note I didn't update the log file; the next cron
>>> run will catch that).
>>>
>>> I may toy around with your recent passthrough flag addition to try
>>> getting automated PPM's up and running.
>>>
>>> chris
>>>
>>> On Apr 1, 2008, at 10:12 PM, Scott Cain wrote:
>>>
>>>> Hi Chris,
>>>>
>>>> The tarball is currently (Apr 1) being built in a tmp directory, so
>>>> that
>>>> the extracted tarball is ./tmp/bioperl-live/.  Is that intended?
>>>>
>>>> Thanks,
>>>> Scott
>>>>
>>>> On Tue, 2008-04-01 at 19:45 -0500, Chris Fields wrote:
>>>>> I'm simplifying the nightly build archive names (removing svn
>>>>> revision
>>>>> # and date) in case anyone needs to update bioperl-live/run/db/
>>>>> network
>>>>> on a regular basis (read: GBrowse installations).  When I have  
>>>>> time
>>>>> I'll start working on automated builds, which will require some  
>>>>> extra
>>>>> work with Module::Build and Build.PL.
>>>>>
>>>>> chris
>>>> -- 
>>>> ------------------------------------------------------------------------
>>>> Scott Cain, Ph. D.                                         cain at cshl.edu
>>>> GMOD Coordinator (http://www.gmod.org/)
>>>> 216-392-3087
>>>> Cold Spring Harbor Laboratory
>>>>
>>>
>>>
>> -- 
>> ------------------------------------------------------------------------
>> Scott Cain, Ph. D.                                   cain.cshl at gmail.com
>> GMOD Coordinator (http://www.gmod.org/)                      
>> 216-392-3087
>> Cold Spring Harbor Laboratory
>>
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From zhpan99 at yahoo.com  Wed Apr  2 17:52:46 2008
From: zhpan99 at yahoo.com (Pan Zheng)
Date: Wed, 2 Apr 2008 10:52:46 -0700 (PDT)
Subject: [Bioperl-l] installing bioperl-1.5.2 errors:DB_File
Message-ID: <726978.82400.qm@web53105.mail.re2.yahoo.com>

Hi,
   
  I am installing bioperl-1.5.2_102 under cygwin on my Windows XP and having some errors during the process.
   
  When I was running "perl Build test", one major error is the error about DB_File. I tried to install DB_File from cpan and rpm without any luck.
   
  ++++++++++++++++++++++++
  CPAN: File::Temp loaded ok (v0.16)
CPAN: YAML loaded ok (v0.62)
    CPAN.pm: Going to build P/PM/PMQS/DB_File-1.817.tar.gz
  Parsing config.in...
Looks Good.
Checking if your kit is complete...
Looks good
Note (probably harmless): No library found for -ldb
Writing Makefile for DB_File
cp DB_File.pm blib/lib/DB_File.pm
AutoSplitting blib/lib/DB_File.pm (blib/lib/auto/DB_File)
gcc -c  -I/usr/local/BerkeleyDB/include -DPERL_USE_SAFE_PUTENV
    -fno-strict-aliasing -pipe -Wdeclaration-after-statement -DUSEIMPORTLIB
    -O3   -DVERSION=\"1.817\" -DXS_VERSION=\"1.817\"
    "-I/usr/lib/perl5/5.8/cygwin/CORE"  -D_NOT_CORE
    -DmDB_Prefix_t=size_t -DmDB_Hash_t=u_int32_t   version.c
version.c:30:16: db.h: No such file or directory
make: *** [version.o] Error 1
  PMQS/DB_File-1.817.tar.gz
  /usr/bin/make -- NOT OK
Running make test
  Can't test without successful make
Running make install
  Make had returned bad status, install seems impossible
Failed during this command:
 PMQS/DB_File-1.817.tar.gz                    : make NO
  +++++++++++++++++++++++++++++++++++++++++++++++
   
   
  I don't remember seeing this kind of error when installing earlier versions.
   
  Would you please help me on DB_File installation ?
   
  Thanks.
   
  Pan

       


From dr.hogart at gmail.com  Thu Apr  3 13:01:03 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Thu, 03 Apr 2008 17:01:03 +0400
Subject: [Bioperl-l] support of clustalw2 in bio::run::tool::alignment
Message-ID: <op.t81c31ljavnppr@hogart.img.ras.ru>

As far as I understand, clustalw2 is not supported in bioperl v1.5.2.100.
In which version will it be implemented?
Thank you in advance.


From slduncan at iastate.edu  Thu Apr  3 18:13:16 2008
From: slduncan at iastate.edu (slduncan at iastate.edu)
Date: Thu, 3 Apr 2008 13:13:16 -0500 (CDT)
Subject: [Bioperl-l] help installing bioperl with cygwin
Message-ID: <161313331084931@webmail.iastate.edu>

I am trying to use cpan to install bioperl and I got an error message saying:
c:\Documents not recognized as an external or internal....
Any ideas?  Also, I am new to the computer world so please be kind. :)

Stacy Duncan
Iowa State University
Bioinformatics and Computational Biology
1802 University Blvd.
VMRI Building 6
Ames, IA 50011-1240
office phone: (515) 294-8385
office fax: (515) 294-1401
home phone: (336) 965-5622
e-mail: slduncan at iastate.edu


From cjfields at uiuc.edu  Fri Apr  4 20:13:23 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 4 Apr 2008 15:13:23 -0500
Subject: [Bioperl-l] help installing bioperl with cygwin
In-Reply-To: <161313331084931@webmail.iastate.edu>
References: <161313331084931@webmail.iastate.edu>
Message-ID: <B7F7923E-4226-4B83-BDC1-8548F0FDB6CC@uiuc.edu>

It's best if you use ActiveState's Perl installation (it's the only  
one we really support at this moment, unless someone wants to give  
StrawberryPerl a run).  See:

http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows

chris

On Apr 3, 2008, at 1:13 PM, slduncan at iastate.edu wrote:

> I am trying to use cpan to install bioperl and I got an error
> message saying:
> c:\Documents not recognized as an external or internal....
> Any ideas?  Also, I am new to the computer world so please be
> kind. :)
>
> Stacy Duncan
> Iowa State University
> Bioinformatics and Computational Biology
> 1802 University Blvd.
> VMRI Building 6
> Ames, IA 50011-1240
> office phone: (515) 294-8385
> office fax: (515) 294-1401
> home phone: (336) 965-5622
> e-mail: slduncan at iastate.edu
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Apr  4 20:07:12 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 4 Apr 2008 15:07:12 -0500
Subject: [Bioperl-l] installing bioperl-1.5.2 errors:DB_File
In-Reply-To: <726978.82400.qm@web53105.mail.re2.yahoo.com>
References: <726978.82400.qm@web53105.mail.re2.yahoo.com>
Message-ID: <F786C444-6A18-4AA5-8AE8-6C0ECEEACC5E@uiuc.edu>

I think you have to use the cygwin installer to install DB_File (it  
also installs dependencies, such as BDB).  According to 'perldoc  
perlcygwin':

....
Optional Libraries for Perl on Cygwin

Several Perl functions and modules depend on the existence of some  
optional libraries. Configure will find them if they are installed in  
one of the directories listed as being used for library searches. Pre- 
built packages for most of these are available from the Cygwin  
installer.
....
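
To confirm whether DB_File is usable by the Perl you are running (before and
after installing it through the Cygwin installer), something like the
following sketch may help. It uses only core Perl; the helper name
module_version is made up for this example, and the exact package name in the
Cygwin installer is not given in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the installed version of a module,
# or undef if the module cannot be loaded by this Perl.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # DB_File -> DB_File.pm
    return unless eval { require $file; 1 };  # load failed: not usable
    my $version = $module->VERSION;
    return defined $version ? $version : 'unknown';
}

for my $module (qw(DB_File)) {
    my $version = module_version($module);
    if (defined $version) {
        print "$module $version is installed\n";
    } else {
        print "$module could not be loaded by this Perl\n";
    }
}
```

If the module loads here, rerunning "perl Build test" should no longer report
the DB_File failure, assuming nothing else is missing.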

chris
On Apr 2, 2008, at 12:52 PM, Pan Zheng wrote:

> Hi,
>
>  I am installing bioperl-1.5.2_102 under cygwin on my Windows XP and  
> having some errors during the process.
>
>  When I was running "perl Build test", one major error is the error  
> about DB_File. I tried to install DB_File from cpan and rpm without  
> any luck.
>
>  ++++++++++++++++++++++++
>  CPAN: File::Temp loaded ok (v0.16)
> CPAN: YAML loaded ok (v0.62)
>    CPAN.pm: Going to build P/PM/PMQS/DB_File-1.817.tar.gz
>  Parsing config.in...
> Looks Good.
> Checking if your kit is complete...
> Looks good
> Note (probably harmless): No library found for -ldb
> Writing Makefile for DB_File
> cp DB_File.pm blib/lib/DB_File.pm
> AutoSplitting blib/lib/DB_File.pm (blib/lib/auto/DB_File)
> gcc -c  -I/usr/local/BerkeleyDB/include -DPERL_USE_SAFE_PUTENV
>     -fno-strict-aliasing -pipe -Wdeclaration-after-statement -DUSEIMPORTLIB
>     -O3   -DVERSION=\"1.817\" -DXS_VERSION=\"1.817\"
>     "-I/usr/lib/perl5/5.8/cygwin/CORE"  -D_NOT_CORE
>     -DmDB_Prefix_t=size_t -DmDB_Hash_t=u_int32_t   version.c
> version.c:30:16: db.h: No such file or directory
> make: *** [version.o] Error 1
>  PMQS/DB_File-1.817.tar.gz
>  /usr/bin/make -- NOT OK
> Running make test
>  Can't test without successful make
> Running make install
>  Make had returned bad status, install seems impossible
> Failed during this command:
> PMQS/DB_File-1.817.tar.gz                    : make NO
>  +++++++++++++++++++++++++++++++++++++++++++++++
>
>
>  I don't remember seeing this kind of error when installing earlier
> versions.
>
>  Would you please help me on DB_File installation ?
>
>  Thanks.
>
>  Pan
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Fri Apr  4 21:25:41 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Fri, 4 Apr 2008 16:25:41 -0500
Subject: [Bioperl-l] module for pubchem queries
In-Reply-To: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
References: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
Message-ID: <15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>

Do you need something to access eutils via BioPerl, or are you looking  
for a specific set of classes?  I wrote an interface to eutils  
(Bio::DB::EUtilities), you could do something like this:

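#!/usr/bin/perl

use strict;
use warnings;
use Bio::DB::EUtilities;

# Run an esearch against a PubChem Entrez database and collect the hits.
# Note: 'pcsubstance' returns substance IDs (SIDs); for compound IDs (CIDs)
# the corresponding Entrez database is 'pccompound'.
my $eutil = Bio::DB::EUtilities->new(-eutil  => 'esearch',
                                     -term   => 'dihydroorotate',
                                     -db     => 'pcsubstance',
                                     -retmax => 1000);

# Print the IDs found by the search as a comma-separated list.
print join(',', $eutil->get_ids), "\n";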

chris

On Apr 2, 2008, at 7:24 AM, Robert Citek wrote:

> Hello all,
>
> I have a list of chemical compounds that have some kind of interaction
> with proteins or genes.  The current list contains names or SMILES and
> I would like to get the CID number for those compounds.  Currently,
> I'm using perl to query the NCBI's eutils[1], which works great.  But
> I was just curious to know of there was a bioperl module to do
> something similar.  A quick google didn't turn up anything, so I
> thought I'd ask.
>
> [1] http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html
>
> Regards,
> - Robert

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From ekeen at mail.tongji.edu.cn  Mon Apr  7 06:57:04 2008
From: ekeen at mail.tongji.edu.cn (Jinyan Huang)
Date: Mon, 7 Apr 2008 14:57:04 +0800
Subject: [Bioperl-l] How to analysis the relationship of my interesting KEGG
	pathways?
Message-ID: <fb5dae380804062357ka7de019kb3451a5e169c0bf4@mail.gmail.com>

In my research I have identified 25 pathways of interest. I would like to
know the regulatory relationships among these pathways. Is there any
software that can connect these KEGG pathways?

Thank you very much in advance.


From miguel.pignatelli at uv.es  Mon Apr  7 10:12:58 2008
From: miguel.pignatelli at uv.es (Miguel Pignatelli)
Date: Mon, 07 Apr 2008 12:12:58 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
Message-ID: <47F9F3AA.2090003@uv.es>

Hi all,

Is there any way to obtain the date of creation of individual GenBank 
entries? I don't mean the "last revision" date that can be found in the 
first line of a GenBank file.

I can access this creation date by looking at the "revision history" of 
any GenBank entry (for example, see
http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105), 
but I need a systematic (and local=fast) way to access this information.

Any help would be much appreciated,
Thank you very much in advance,

M;


From Bank.Beszteri at awi.de  Mon Apr  7 11:46:43 2008
From: Bank.Beszteri at awi.de (=?ISO-8859-1?Q?B=E1nk_Beszteri?=)
Date: Mon, 07 Apr 2008 13:46:43 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
References: <47F22B35.1030502@awi.de>
	<CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
Message-ID: <47FA09A3.2070004@awi.de>

Hi Hilmar,

it was important to understand that the inconsistency in taxon names is 
apparently only between the Swissprot entries with "non-standard" names 
and the contents of the taxonomy tables and that it is best to use a 
pre-loaded taxonomy, thanks for that! We have now updated to 
bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to have 
loaded everything OK in ~26 hours (with many of the "The supplied 
lineage does not start near..." warnings, but no other problems). Our 
next test is to try to load trembl (will try to do this in parallel in 
multiple chunks), hope it will work just as nicely!
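
For the chunking, a rough sketch of the kind of splitter we have in mind is
below. It assumes that every record in the flat file ends with a line
containing only "//" (as in Swiss-Prot/TrEMBL .dat files); the function name
split_flatfile and the chunk file naming are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a Swiss-Prot/TrEMBL style flat file into chunk files holding at most
# $per_chunk records each. Records are assumed to end with a "//" line.
# Returns the list of chunk file names that were written.
sub split_flatfile {
    my ($infile, $per_chunk, $prefix) = @_;
    open my $in, '<', $infile or die "Cannot read $infile: $!";
    my ($out, $count, @chunks) = (undef, 0);
    while (my $line = <$in>) {
        if (!$out) {
            # Start a new chunk file lazily, only when there is data for it.
            my $name = sprintf '%s.%04d.dat', $prefix, scalar(@chunks) + 1;
            open $out, '>', $name or die "Cannot write $name: $!";
            push @chunks, $name;
        }
        print {$out} $line;
        # Close the chunk once it holds $per_chunk complete records.
        if ($line =~ m{^//\s*$} && ++$count >= $per_chunk) {
            close $out or die "Cannot close chunk: $!";
            ($out, $count) = (undef, 0);
        }
    }
    close $out if $out;
    close $in;
    return @chunks;
}

# Usage: perl split_flat.pl uniprot_trembl.dat 500000 trembl_chunk
if (@ARGV == 3) {
    print "$_\n" for split_flatfile(@ARGV);
}
```

Each chunk file could then be passed to load_seqdatabase.pl with the same
options as above. Whether several loads can safely run against the same
database at once is something we still have to test.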

Thanks for your tips & insights!

Bank

Hilmar Lapp wrote:

>
> On Apr 1, 2008, at 8:31 AM, Bánk Beszteri wrote:
>
>> [...] So next we started to test BioSQL, by trying to load just  
>> Swissprot in a MySQL DB first, like:
>>
>> load_seqdatabase.pl --host mysql.awi.de --dbname biosql2 --dbuser  
>> xyz --dbpass abc --driver mysql --namespace uniprot_sprot --format  
>> swiss uniprot_sprot.dat
>>
>> Here we get an error message
>>
>> ###########################################
>>
>> Loading /biodb/spinkern/uniprot_sprot.dat ...
>> Could not store Q6DAH5:
>> ------------- EXCEPTION: Bio::Root::Exception -------------
>> MSG: The supplied lineage does not start near 'Erwinia carotovora  
>> subsp. atroseptica' (I was supplied 'Erwinia carotovora subsp. |  
>> Pectobacterium | Enterobacteriaceae | Enterobacteriales |  
>> Gammaproteobacteria | Proteobacteria | Bacteria')
>> STACK: Error::throw
>> STACK: Bio::Root::Root::throw /biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Root/Root.pm:359
>> STACK: Bio::Species::classification /biodb/spinkern/bioperl-1.5/bioperl-1.5.2_102/Bio/Species.pm:174
>> STACK: Bio::DB::Persistent::PersistentObject::AUTOLOAD /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:552
>> STACK: Bio::DB::BioSQL::SpeciesAdaptor::populate_from_row /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/SpeciesAdaptor.pm:281
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1305
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:973
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:852
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182
>> STACK: Bio::DB::Persistent::PersistentObject::create /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:244
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169
>> STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>> STACK: Bio::DB::Persistent::PersistentObject::store /biodb/spinkern/bioperl-db-1.5.2_100/Bio/DB/Persistent/PersistentObject.pm:271
>> STACK: load_seqdatabase.pl:622
>> -----------------------------------------------------------
>>
>> at load_seqdatabase.pl line 635
>>
>> ############################################
>>
>> or similar, depending on whether we use a pre-loaded ncbi taxonomy  
>> or not
>
>
> I recommend to always use a pre-loaded NCBI taxonomy unless you know  
> there are only a few organisms that are straightforward (for the  
> parser, that is).
>
>> , and which Swissprot release we are trying to load. It often seems
>> to come from something like the case here, a subsp. or other special
>> addition to the species line; but alternative genus names and other
>> curious things also appear. It looks like Species.pm tries to validate
>> the species name against the lineage info already there in the BioSQL
>> DB, and in several cases, it finds inconsistencies.
>
>
> It actually happens upon a successful lookup when the species object  
> is populated from the database.
>
>> [...]
>> The only workaround we have found so far was to comment out line 174
>> in Species.pm:
>>
>> $self->throw("The supplied lineage does not start near '$name' (I  
>> was supplied '".join(" | ", @vals)."')");
>
>
> That should be OK if you work with a pre-loaded taxonomy. It's sort  
> of a sanity check that should catch a parser having messed up a  
> species. If you use a pre-loaded NCBI taxonomy the results of the  
> species parsing don't matter in all details so long as the NCBI  
> taxonID is parsed out correctly, and then found in the database.
>
> Note that this is actually a warn() in the main trunk version of
> BioPerl, so you might want to upgrade to that (or change throw() to
> warn() in your version). You still get the records flagged with that,
> but it isn't an exception.
>
>>
>> After doing so, load_seqdatabase.pl runs for several hours (until it
>> eventually crashes; I haven't found out yet why), but proceeds really
>> slowly.
>
>
> It should certainly *not* crash. Note also that you can supply --safe  
> on the command line, in which case the script will continue with the  
> next record if one fails to load for whatever reason.
>
> You will want to adjust the width constraint of dbxref.accession, for  
> example to 128 chars. This will also be fixed for BioSQL 1.0.1.
> See http://bugzilla.open-bio.org/show_bug.cgi?id=2474
>
>
>> I also found some info on this for Pg and Oracle in the mailing  
>> list, but has anyone some approximate numbers for MySQL, how long  
>> should a first Swissprot load take?
>
>
> Possibly around 20 hours according to Erik Rijkers:
> See http://lists.open-bio.org/pipermail/bioperl-l/2008-March/027427.html
>
> You can use the --logchunks N option to have it print out performance  
> statistics every N records.
>
> Hope this helps,
>
>     -hilmar


From cjfields at uiuc.edu  Mon Apr  7 12:32:45 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 7 Apr 2008 07:32:45 -0500
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47FA09A3.2070004@awi.de>
References: <47F22B35.1030502@awi.de>
	<CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
	<47FA09A3.2070004@awi.de>
Message-ID: <E8A1ED59-830D-473F-8818-1BAC4E0A2FA2@uiuc.edu>

The warnings are something that we still need to resolve, but the only  
fix I can think of likely breaks backward compatibility with older  
bioperl-db installations (i.e. storing the given scientific name  
instead of the binomial name, which is used as a fallback when no  
taxid is found).  There is a full explanation here:

http://bugzilla.open-bio.org/show_bug.cgi?id=2092

Anyway, I think it needs further testing when someone, likely Hilmar  
or I, has time.

chris

On Apr 7, 2008, at 6:46 AM, Bánk Beszteri wrote:

> Hi Hilmar,
>
> it was important to understand that the inconsistency in taxon names  
> is apparently only between the Swissprot entries with "non-standard"  
> names and the contents of the taxonomy tables and that it is best to  
> use a pre-loaded taxonomy, thanks for that! We have now updated to  
> bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to  
> have loaded everything OK in ~26 hours (with many of the "The  
> supplied lineage does not start near..." warnings, but no other  
> problems). Our next test is to try to load trembl (will try to do  
> this in parallel in multiple chunks), hope it will work just as  
> nicely!
>
> Thanks for your tips & insights!
>
> Bank

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bix at sendu.me.uk  Mon Apr  7 12:34:00 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Mon, 07 Apr 2008 13:34:00 +0100
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47FA09A3.2070004@awi.de>
References: <47F22B35.1030502@awi.de>	<CCDF4ECA-2888-40D1-B903-34195CFABD07@gmx.net>
	<47FA09A3.2070004@awi.de>
Message-ID: <47FA14B8.7000500@sendu.me.uk>

Bánk Beszteri wrote:
> Hi Hilmar,
> 
> it was important to understand that the inconsistency in taxon names is 
> apparently only between the Swissprot entries with "non-standard" names 
> and the contents of the taxonomy tables and that it is best to use a 
> pre-loaded taxonomy, thanks for that! We have now updated to 
> bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to have 
> loaded everything OK in ~26 hours (with many of the "The supplied 
> lineage does not start near..." warnings, but no other problems).

Can you provide some examples of these warnings (of the taxons that 
cause them)? If there's anything consistent about them perhaps 
Bio::Species can be improved to accommodate them properly (instead of 
just issuing the warning and getting the classification wrong).
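
If it helps with collecting them, a small script along these lines should pull
the taxon names out of a saved load log and count them. It is only a sketch:
it assumes the log contains the message text exactly as quoted earlier in this
thread ("The supplied lineage does not start near '...' (I was supplied
'...')"), and the function name lineage_warning_counts is made up here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the taxon names quoted in "does not start near" messages.
# Takes the whole log as one string; returns a hash ref of name => count.
sub lineage_warning_counts {
    my ($log) = @_;
    $log =~ s/\s+/ /g;    # undo any line wrapping inside the messages
    my %count;
    while ($log =~ /The supplied lineage does not start near '(.*?)' \(I was supplied/g) {
        $count{$1}++;
    }
    return \%count;
}

# Usage: perl lineage_warnings.pl load.log
if (@ARGV) {
    my $log = do { local $/; <> };
    my $count = lineage_warning_counts($log);
    # Most frequent names first, ties broken alphabetically.
    for my $name (sort { $count->{$b} <=> $count->{$a} || $a cmp $b } keys %$count) {
        printf "%6d  %s\n", $count->{$name}, $name;
    }
}
```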


From heikki at sanbi.ac.za  Mon Apr  7 12:48:34 2008
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Mon, 7 Apr 2008 14:48:34 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47F9F3AA.2090003@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
Message-ID: <200804071448.34769.heikki@sanbi.ac.za>

Miguel,

You probably know this but:

- Your entry example below is a GenPept entry, not a GenBank entry
- The NCBI sequence format "genbank" has only the last modified date.
   I do not know about other formats (ASN.1, ...)
- NCBI Entrez is a great tool but it obscures the source database.
- If you really are working on real GenBank entries, you can use the accession 
number to find the corresponding EMBL (and Swiss-Prot) flat file entries, which 
have both creation and last modified dates.

Post to the list if you have trouble getting the dates from EMBL/Swiss-Prot 
formats using bioperl.
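
As a starting point that does not need BioPerl at all, here is a sketch that
reads the dates straight off the DT lines. It assumes that the first DT line
of each record carries the creation date (e.g. "DT   12-SEP-1991 (Rel. 29,
Created)" in EMBL) and that records end with a "//" line; please check this
against the release notes of the files you actually use. The function name
first_dt_date is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the date on the first DT line of one EMBL/Swiss-Prot flat file
# record (assumed here to be the creation date), or undef if there is none.
sub first_dt_date {
    my ($record) = @_;
    return $record =~ /^DT\s+(\d{2}-[A-Z]{3}-\d{4})/m ? $1 : undef;
}

# Usage: perl creation_dates.pl entries.dat
if (@ARGV) {
    local $/ = "//\n";    # read one flat file record at a time
    while (my $record = <>) {
        my ($id) = $record =~ /^ID\s+([^\s;]+)/m;
        my $date = first_dt_date($record);
        print join("\t",
                   defined $id   ? $id   : 'unknown',
                   defined $date ? $date : 'no DT line'), "\n";
    }
}
```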

Yours,

	-Heikki

On Monday 07 April 2008 12:12:58 Miguel Pignatelli wrote:
> Hi all,
>
> Is there any way to obtain the date of creation of individual GenBank
> entries? I don't mean the "last revision" date that can be found in the
> first line of a GenBank file.
>
> I can access this creation date by looking at the "revision history" of
> any GenBank entry (for example, see
> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
> but I need a systematic (and local=fast) way to access this information.
>
> Any help would be much appreciated,
> Thank you very much in advance,
>
> M;


-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________


From granjeau at tagc.univ-mrs.fr  Mon Apr  7 13:30:10 2008
From: granjeau at tagc.univ-mrs.fr (Samuel GRANJEAUD - IR/ICIM)
Date: Mon, 07 Apr 2008 15:30:10 +0200
Subject: [Bioperl-l] help installing bioperl with cygwin
In-Reply-To: <B7F7923E-4226-4B83-BDC1-8548F0FDB6CC@uiuc.edu>
References: <161313331084931@webmail.iastate.edu>
	<B7F7923E-4226-4B83-BDC1-8548F0FDB6CC@uiuc.edu>
Message-ID: <47FA21E2.3010602@tagc.univ-mrs.fr>

Hi,

I'm using BioPerl under Cygwin, because Cygwin lets one work in a 
Unix-like environment from the command line.

So, I use the CVS version which runs out of the box
http://www.bioperl.org/wiki/Using_CVS
which has been replaced by SVN at the beginning of the year
http://www.bioperl.org/wiki/Using_Subversion

So if you really want to work under Cygwin, you can try this quick and 
dirty way, but you still have to become experienced because BioPerl is 
not supported under Cygwin.

You may try Strawberry, but in my experience in installing wxPerl, 
wxPerl fails on both flavours of Perl. ActiveState's Perl is still the 
easiest way to install many packages.

Regards,
Samuel


Chris Fields wrote:
> It's best if you use ActiveState's Perl installation (it's the only 
> one we really support at this moment, unless someone wants to give 
> StrawberryPerl a run).  See:
>
> http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows
>
> chris
>
> On Apr 3, 2008, at 1:13 PM, slduncan at iastate.edu wrote:
>
>> I am trying to use cpan to install bioperl and I got an error message 
>> saying:
>> c:\Documents not recognized as an external or internal....
>> Any ideas?  Also, I am new to the computer world so please be 
>> kind. :)
>>
>> Stacy Duncan
>> Iowa State University
>> Bioinformatics and Computational Biology
>> 1802 University Blvd.
>> VMRI Building 6
>> Ames, IA 50011-1240
>> office phone: (515) 294-8385
>> office fax: (515) 294-1401
>> home phone: (336) 965-5622
>> e-mail: slduncan at iastate.edu
>>
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>

-- 

Samuel GRANJEAUD                   granjeau at tagc.univ-mrs.fr
INSERM - ICIM - TAGC               Tel: +33  (0)491 82 87 24
http://tagc.univ-mrs.fr            Fax: +33  (0)491 82 87 01
http://icim.marseille.inserm.fr/proteomique


From er at xs4all.nl  Mon Apr  7 14:36:57 2008
From: er at xs4all.nl (Erik)
Date: Mon, 7 Apr 2008 16:36:57 +0200 (CEST)
Subject: [Bioperl-l] Indexing large databases / BioSQL
Message-ID: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>

On Mon, April 7, 2008 14:34, Sendu Bala wrote:
> Bánk Beszteri wrote:
>> Hi Hilmar,
>>
>> it was important to understand that the inconsistency in taxon names is
>> apparently only between the Swissprot entries with "non-standard" names
>> and the contents of the taxonomy tables and that it is best to use a
>> pre-loaded taxonomy, thanks for that! We have now updated to
>> bioperl-live (and bp-db-live, too) and load_seqdatabase.pl seems to have
>> loaded everything OK in ~26 hours (with many of the "The supplied
>> lineage does not start near..." warnings, but no other problems).
>
> Can you provide some examples of these warnings (of the taxons that
> cause them)? If there's anything consistent about them perhaps
> Bio::Species can be improved to accommodate them properly (instead of
> just issuing the warning and getting the classification wrong).
>

I did this a little while ago and saved the output
(UniProtKB/Swiss-Prot Release 55.1 of 18-Mar-2008, I think).

All warnings (and a few errors) for swissprot are here:

   http://bugzilla.open-bio.org/show_bug.cgi?id=2474

as an attached file

I suppose the OP will have encountered similar output - I don't think there is
much RDBMS-type-dependency involved.

   regards,

   Erik Rijkers


From cjfields at uiuc.edu  Mon Apr  7 15:46:01 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 7 Apr 2008 10:46:01 -0500
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <200804071448.34769.heikki@sanbi.ac.za>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es> <200804071448.34769.heikki@sanbi.ac.za>
Message-ID: <2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>

Strangely enough, if you use NCBI's esummary you can get both dates.   
Via Bio::DB::EUtilities in bioperl-live, if you dump out DocSum data  
(using a debugging method I added in a while back):

---------------------------------------

use Bio::DB::EUtilities;

# for multiple IDs use an array ref; also only use GI's (not accessions)
my $factory = Bio::DB::EUtilities->new(
                         -eutil => 'esummary',
                         -db => 'protein',
                         -id => 1621261);

$factory->print_DocSums;

---------------------------------------

One gets the following tag/value pairs:

UID: 1621261
Caption             :CAB02640
Title               :PROBABLE PYRIMIDINE OPERON REGULATORY PROTEIN  
PYRR [Mycobacterium tuberculosis
		     H37Rv]
Extra               :gi|1621261|emb|CAB02640.1|[1621261]
Gi                  :1621261
CreateDate          :2003/11/21
UpdateDate          :2006/11/14
Flags               :
TaxId               :83332
Length              :193
Status              :live
ReplacedBy          :
Comment             :

I'll add in a method to grab the data element by tag (in this case,  
grab the creation date by asking for the 'CreateDate' key).  Might  
come in handy for scripts.
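
Until such a method exists, a script can pull a value out of the dumped 
text itself. The following is only a minimal sketch, assuming the output 
of print_DocSums has been captured as a string in the "Tag :value" layout 
shown above; docsum_value() is a hypothetical helper written for this 
illustration, not part of Bio::DB::EUtilities:

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Bio::DB::EUtilities): return the value
# for one tag from text laid out like the DocSum dump shown above.
sub docsum_value {
    my ($text, $tag) = @_;
    for my $line (split /\n/, $text) {
        # Lines look like "CreateDate          :2003/11/21"
        if ($line =~ /^\Q$tag\E\s*:(.*)$/) {
            my $value = $1;
            $value =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
            return $value;
        }
    }
    return;    # tag not present
}

# A shortened copy of the dump above, used here as sample input.
my $dump = <<'END';
UID: 1621261
Caption             :CAB02640
Gi                  :1621261
CreateDate          :2003/11/21
UpdateDate          :2006/11/14
Status              :live
ReplacedBy          :
END

print "Created: ", docsum_value($dump, 'CreateDate'), "\n";
print "Updated: ", docsum_value($dump, 'UpdateDate'), "\n";
```

Note that this reads one line per tag, so a value wrapped over several 
lines (like the Title above) would come back truncated to its first line.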

chris

On Apr 7, 2008, at 7:48 AM, Heikki Lehvaslaiho wrote:

> Miguel,
>
> You probably know this but:
>
> - Your entry example below is a GenPept entry, not a GenBank entry
> - The NCBI sequence format "genbank" has only the last modified date.
>   I do not know about other formats (ASN.1, ...)
> - NCBI Entrez is a great tool but it obscures the source database.
> - If you really are working on real GenBank entries, you can use the
>   accession number to find the corresponding EMBL (and Swiss-Prot) flat
>   file formats that have both creation and last modified dates.
>
> Post to the list if you have trouble getting the dates from EMBL/ 
> Swiss-Prot
> formats using bioperl.
>
> Yours,
>
> 	-Heikki
>
> On Monday 07 April 2008 12:12:58 Miguel Pignatelli wrote:
>> Hi all,
>>
>> Is there any way to obtain the date of creation of individual GenBank
>> entries? I don't mean the "last revision" date that can be found in  
>> the
>> first line of a GenBank file.
>>
>> I can access this creation date by looking at the "revision  
>> history" of
>> any GenBank entry (for example, see
>> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
>> but I need a systematic (and local=fast) way to access this  
>> information.
>>
>> Any help would be very appreciated,
>> Thank you very much in advance,
>>
>> M;
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> -- 
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From miguel.pignatelli at uv.es  Mon Apr  7 16:24:50 2008
From: miguel.pignatelli at uv.es (Miguel Pignatelli)
Date: Mon, 07 Apr 2008 18:24:50 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es> <200804071448.34769.heikki@sanbi.ac.za>
	<2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>
Message-ID: <47FA4AD2.5030206@uv.es>


I've noticed that the ASN.1 version of those records has a 
"creation-date" tag.
But this is somewhat strange: the creation date you obtained and the one 
obtained via the ASN.1 format are both 2003/11/21, but the revision 
history of the record:

http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=CAB02640

reports a creation date of "Oct 19 1996 12:28 AM".

I don't know how to get this date, because the EMBL version of this gene:

http://www.ebi.ac.uk/cgi-bin/dbfetch?db=emblcds&id=CAB02640&style=raw

doesn't have DT fields at all.

M;


Chris Fields wrote:
> Strangely enough, if you use NCBI's esummary you can get both dates.  
> Via Bio::DB::EUtilities in bioperl-live, if you dump out DocSum data 
> (using a debugging method I added in a while back):
> 
> ---------------------------------------
> 
> use Bio::DB::EUtilities;
> 
> # for multiple IDs use an array ref; also only use GI's (not accessions)
> my $factory = Bio::DB::EUtilities->new(
>                         -eutil => 'esummary',
>                         -db => 'protein',
>                         -id => 1621261);
> 
> $factory->print_DocSums;
> 
> ---------------------------------------
> 
> One gets the following tag/value pairs:
> 
> UID: 1621261
> Caption             :CAB02640
> Title               :PROBABLE PYRIMIDINE OPERON REGULATORY PROTEIN PYRR 
> [Mycobacterium tuberculosis
>              H37Rv]
> Extra               :gi|1621261|emb|CAB02640.1|[1621261]
> Gi                  :1621261
> CreateDate          :2003/11/21
> UpdateDate          :2006/11/14
> Flags               :
> TaxId               :83332
> Length              :193
> Status              :live
> ReplacedBy          :
> Comment             :
> 
> I'll add in a method to grab the data element by tag (in this case, grab 
> the creation date by asking for the 'CreateDate' key).  Might come in 
> handy for scripts.
> 
> chris
> 
> On Apr 7, 2008, at 7:48 AM, Heikki Lehvaslaiho wrote:
> 
>> Miguel,
>>
>> You probably know this but:
>>
>> - Your entry example below is a GenPept entry, not a GenBank entry
>> - The NCBI sequence format "genbank" has only the last modified date.
>>   I do not know about other formats (ASN.1, ...)
>> - NCBI Entrez is a great tool but it obscures the source database.
>> - If you really are working on real GenBank entries, you can use the
>>   accession number to find the corresponding EMBL (and Swiss-Prot) flat
>>   file formats that have both creation and last modified dates.
>>
>> Post to the list if you have trouble getting the dates from 
>> EMBL/Swiss-Prot
>> formats using bioperl.
>>
>> Yours,
>>
>>     -Heikki
>>
>> On Monday 07 April 2008 12:12:58 Miguel Pignatelli wrote:
>>> Hi all,
>>>
>>> Is there any way to obtain the date of creation of individual GenBank
>>> entries? I don't mean the "last revision" date that can be found in the
>>> first line of a GenBank file.
>>>
>>> I can access this creation date by looking at the "revision history" of
>>> any GenBank entry (for example, see
>>> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
>>> but I need a systematic (and local=fast) way to access this information.
>>>
>>> Any help would be very appreciated,
>>> Thank you very much in advance,
>>>
>>> M;
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>
>> -- 
>> ______ _/      _/_____________________________________________________
>>      _/      _/
>>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>>  _/  _/  _/  University of Western Cape, South Africa
>>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
>> ___ _/_/_/_/_/________________________________________________________
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 
> 


From cjfields at uiuc.edu  Mon Apr  7 17:48:45 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 7 Apr 2008 12:48:45 -0500
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47FA4AD2.5030206@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>
	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es> <200804071448.34769.heikki@sanbi.ac.za>
	<2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>
	<47FA4AD2.5030206@uv.es>
Message-ID: <CA410982-12F9-4289-8B54-87BE33A38085@uiuc.edu>

Note in the example I gave that, during the revision history, the  
DBSOURCE changed at the point of the creation date (the original nuc.  
record was a M. tuberculosis contig sequence, which later changed to  
an updated full M. tuberculosis genome record at the time of the  
'create date').

Couldn't find anything specific in the GenBank docs on this, but it  
appears (at least for a protein record) the creation date reflects the  
date on which the sequence was either originally deposited or  
originally derived from the nucleotide source record cited in the  
record.  In other words, it may not reflect the original date of  
deposition (which could have come from a different record, as in this  
case).

chris

On Apr 7, 2008, at 11:24 AM, Miguel Pignatelli wrote:

>
> I've noticed that the ASN.1 version of those records has a "creation- 
> date" tag.
> But this is somehow strange, because the creation date obtained by  
> you and that obtained via ASN.1 format is 2003/11/21, but if you  
> look at the revision history of the record:
>
> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=CAB02640
>
> reports a creation date of "Oct 19 1996 12:28 AM"
>
> I don't know how to get this, because the EMBL version of this gene:
>
> http://www.ebi.ac.uk/cgi-bin/dbfetch?db=emblcds&id=CAB02640&style=raw
>
> doesn't has DT fields at all.
>
> M;
>
>
> Chris Fields wrote:
>> Strangely enough, if you use NCBI's esummary you can get both  
>> dates.  Via Bio::DB::EUtilities in bioperl-live, if you dump out  
>> DocSum data (using a debugging method I added in a while back):
>> ---------------------------------------
>> use Bio::DB::EUtilities;
>> # for multiple IDs use an array ref; also only use GI's (not  
>> accessions)
>> my $factory = Bio::DB::EUtilities->new(
>>                        -eutil => 'esummary',
>>                        -db => 'protein',
>>                        -id => 1621261);
>> $factory->print_DocSums;
>> ---------------------------------------
>> One gets the following tag/value pairs:
>> UID: 1621261
>> Caption             :CAB02640
>> Title               :PROBABLE PYRIMIDINE OPERON REGULATORY PROTEIN  
>> PYRR [Mycobacterium tuberculosis
>>             H37Rv]
>> Extra               :gi|1621261|emb|CAB02640.1|[1621261]
>> Gi                  :1621261
>> CreateDate          :2003/11/21
>> UpdateDate          :2006/11/14
>> Flags               :
>> TaxId               :83332
>> Length              :193
>> Status              :live
>> ReplacedBy          :
>> Comment             :
>> I'll add in a method to grab the data element by tag (in this case,  
>> grab the creation date by asking for the 'CreateDate' key).  Might  
>> come in handy for scripts.
>> chris
>> On Apr 7, 2008, at 7:48 AM, Heikki Lehvaslaiho wrote:
>>> Miguel,
>>>
>>> You probably know this but:
>>>
>>> - Your entry example below is a GenPept entry, not a GenBank entry
>>> - The NCBI sequence format "genbank" has only the last modified  
>>> date.
>>>  I do not know about other formats (ASN.1, ...)
>>> - NCBI Entrez is a great tool but it obscures the source database.
>>> - If you really are working on real GenBank entries, you can use the
>>>   accession number to find the corresponding EMBL (and Swiss-Prot) flat
>>>   file formats that have both creation and last modified dates.
>>>
>>> Post to the list if you have trouble getting the dates from EMBL/ 
>>> Swiss-Prot
>>> formats using bioperl.
>>>
>>> Yours,
>>>
>>>    -Heikki
>>>
>>> On Monday 07 April 2008 12:12:58 Miguel Pignatelli wrote:
>>>> Hi all,
>>>>
>>>> Is there any way to obtain the date of creation of individual  
>>>> GenBank
>>>> entries? I don't mean the "last revision" date that can be found  
>>>> in the
>>>> first line of a GenBank file.
>>>>
>>>> I can access this creation date by looking at the "revision  
>>>> history" of
>>>> any GenBank entry (for example, see
>>>> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
>>>> but I need a systematic (and local=fast) way to access this  
>>>> information.
>>>>
>>>> Any help would be very appreciated,
>>>> Thank you very much in advance,
>>>>
>>>> M;
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>
>>>
>>> -- 
>>> ______ _/      _/ 
>>> _____________________________________________________
>>>     _/      _/
>>>    _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>>>   _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>>>  _/  _/  _/  SANBI, South African National Bioinformatics Institute
>>> _/  _/  _/  University of Western Cape, South Africa
>>>    _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
>>> ___ _/_/_/_/_/ 
>>> ________________________________________________________
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Bank.Beszteri at awi.de  Tue Apr  8 07:35:43 2008
From: Bank.Beszteri at awi.de (=?ISO-8859-1?Q?B=E1nk_Beszteri?=)
Date: Tue, 08 Apr 2008 09:35:43 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
Message-ID: <47FB204F.90405@awi.de>


>>Can you provide some examples of these warnings (of the taxons that
>>cause them)? If there's anything consistent about them perhaps
>>Bio::Species can be improved to accommodate them properly (instead of
>>just issuing the warning and getting the classification wrong).
>>    
>>
>
>All warnings (and a few errors) for swissprot are here:
>
>   http://bugzilla.open-bio.org/show_bug.cgi?id=2474
>
>as an attached file
>
>I suppose the OP will have encountered similar output - I don't think there is
>much RDBMS-type-dependency involved.
>  
>
Hi Erik & Sendu,

yes, the same kind of thing, probably no DBMS-type dependency; in case 
it could be useful, I uploaded my output as a second attachment to the 
bugzilla report cited above.

Bank


From heikki at sanbi.ac.za  Tue Apr  8 08:32:12 2008
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Tue, 8 Apr 2008 10:32:12 +0200
Subject: [Bioperl-l] Blast database sequence retrieval perl script
In-Reply-To: <6BEABCD5CA640A44A848448A42A03B73079E48C9@ilrikeadx1.ILRI.CGIARAD.ORG>
References: <6BEABCD5CA640A44A848448A42A03B73079E48C9@ilrikeadx1.ILRI.CGIARAD.ORG>
Message-ID: <200804081032.12312.heikki@sanbi.ac.za>


Dear Nelson,

I am cc:ing the bioperl mailing list, where all queries of this kind should 
go. More people can help you that way.


Since you have your own local data set, you need to create an index that 
catalogues your sequences for easy retrieval.

You need to install bioperl-live first. See for example: 	
	http://www.bioperl.org/wiki/Using_Subversion

Then you can follow this HOWTO:
	http://www.bioperl.org/wiki/HOWTO:Flat_databases

The other HOWTOs will help you deal with the BioPerl sequence objects that 
are retrieved: http://www.bioperl.org/wiki/HOWTOs. 


Yours,

	-Heikki


On Monday 07 April 2008 14:50:23 Ndegwa, Nelson (IITA-Nairobi) wrote:
> Dear Prof. Heikki,
>
> Hi. We met at the Pathogen Bioinformatics Conference held in Nairobi
> Kenya in May 2007 at ICIPE. I recall you are a developer of Bioperl and
> Perl. I have managed to install a local Blast, having just cowpea Contig
> sequences, about 50,000 in total. This runs fine, as I can perform
> various queries and get results. However, any good match/hit on the
> local Blast database is hard to retrieve and the only option seems to go
> back to that database and search manually for the top hit sequence - an
> exceedingly manual task. Might you perhaps be having a Perl script I
> could adopt to my database to help with this task Such that the hits
> have a hyperlink which can be used to retrieve that specific entry? I
> have limited knowledge of Perl. Thank you.
>
> With Kind Regards,
>
> Nelson.


-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________


From David.Messina at sbc.su.se  Tue Apr  8 11:29:12 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Tue, 8 Apr 2008 13:29:12 +0200
Subject: [Bioperl-l] How to analysis the relationship of my interesting KEGG
	pathways?
In-Reply-To: <628aabb70804080053g1fd9120ex9d5fd12f65f216f9@mail.gmail.com>
References: <fb5dae380804062357ka7de019kb3451a5e169c0bf4@mail.gmail.com>
	<628aabb70804080053g1fd9120ex9d5fd12f65f216f9@mail.gmail.com>
Message-ID: <628aabb70804080429k2aa17a6eu12197709d4cc1af0@mail.gmail.com>

Hi Jinyan,

You asked a similar question last week and received a couple of suggestions
-- did you take a look at those?

I'm not an expert on this topic, but I believe that since regulatory
information is much harder to obtain experimentally and therefore much less
well known, there isn't a lot of it in pathway databases like KEGG. You may
have to look through the literature and start trying to put together
possible regulatory links on your own.

Dave


From hrh at sanger.ac.uk  Tue Apr  8 12:48:32 2008
From: hrh at sanger.ac.uk (Hans Rudolf Hotz)
Date: Tue, 8 Apr 2008 13:48:32 +0100 (BST)
Subject: [Bioperl-l] Blast database sequence retrieval perl script
In-Reply-To: <200804081032.12312.heikki@sanbi.ac.za>
References: <6BEABCD5CA640A44A848448A42A03B73079E48C9@ilrikeadx1.ILRI.CGIARAD.ORG>
	<200804081032.12312.heikki@sanbi.ac.za>
Message-ID: <Pine.LNX.4.64.0804081340180.7147@deskpro50.dynamic.sanger.ac.uk>

Nelson

or simply use the BLAST indices for the sequence retrieval as well.

All you need to do is add the "-o" option to the 'formatdb' command for 
the BLAST index creation (this will create some extra files). Then you can 
use 'fastacmd' (which is also part of the NCBI BLAST package) to retrieve 
the sequences.
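
Since a Perl script was asked for, the two steps can be wrapped roughly 
like this. It is only a sketch under assumptions: the file name and the 
sequence ID are made up, the database is taken to be nucleotide (-p F), 
and the FASTA headers are assumed to carry IDs that formatdb can parse:

```perl
use strict;
use warnings;

# Hypothetical file and sequence names; adjust to your own data.
my $fasta = 'cowpea_contigs.fasta';
my $id    = 'contig00042';

# formatdb: -i input FASTA, -p F for nucleotide, -o T to parse and index IDs
sub formatdb_cmd {
    my ($file) = @_;
    return ('formatdb', '-i', $file, '-p', 'F', '-o', 'T');
}

# fastacmd: -d database name, -s identifier of the sequence to retrieve
sub fastacmd_cmd {
    my ($db, $seq_id) = @_;
    return ('fastacmd', '-d', $db, '-s', $seq_id);
}

# Show the command lines that would be run.
print join(' ', formatdb_cmd($fasta)), "\n";
print join(' ', fastacmd_cmd($fasta, $id)), "\n";

# To actually run them (needs the NCBI BLAST tools on the PATH):
#   system(formatdb_cmd($fasta)) == 0 or die "formatdb failed: $?";
#   system(fastacmd_cmd($fasta, $id)) == 0 or die "fastacmd failed: $?";
```

Passing the commands to system() as lists avoids shell quoting problems 
if a file name contains spaces.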


Hans

On Tue, 8 Apr 2008, Heikki Lehvaslaiho wrote:

>
> Dear Nelson,
>
> I am cc:ing the bioperl mailing list where all these kind of queries should
> go. More people can help you that way.
>
>
> Since you have your own local data set, you need to create an index that
> catalogues you sequences for easy retrieval.
>
> You need to install bioperl-live first. See for example:
> 	http://www.bioperl.org/wiki/Using_Subversion
>
> Then you can follow this HOWTO:
> 	http://www.bioperl.org/wiki/HOWTO:Flat_databases
>
> The other HOWTOs will help you dealing with BioPerl sequence objects that are
> retrieved: http://www.bioperl.org/wiki/HOWTOs.
>
>
> Yours,
>
> 	-Heikki
>
>
> On Monday 07 April 2008 14:50:23 Ndegwa, Nelson (IITA-Nairobi) wrote:
>> Dear Prof. Heikki,
>>
>> Hi. We met at the Pathogen Bioinformatics Conference held in Nairobi
>> Kenya in May 2007 at ICIPE. I recall you are a developer of Bioperl and
>> Perl. I have managed to install a local Blast, having just cowpea Contig
>> sequences, about 50,000 in total. This runs fine, as I can perform
>> various queries and get results. However, any good match/hit on the
>> local Blast database is hard to retrieve and the only option seems to go
>> back to that database and search manually for the top hit sequence - an
>> exceedingly manual task. Might you perhaps be having a Perl script I
>> could adopt to my database to help with this task Such that the hits
>> have a hyperlink which can be used to retrieve that specific entry? I
>> have limited knowledge of Perl. Thank you.
>>
>> With Kind Regards,
>>
>> Nelson.
>
>
>
> -- 
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


From robert.citek at gmail.com  Tue Apr  8 14:09:27 2008
From: robert.citek at gmail.com (Robert Citek)
Date: Tue, 8 Apr 2008 09:09:27 -0500
Subject: [Bioperl-l] module for pubchem queries
In-Reply-To: <15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>
References: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
	<15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>
Message-ID: <4145b6790804080709l20f1e56erf4b7af04b0a52870@mail.gmail.com>

Wrapping bioperl around eutils will work just fine.  Thanks for the pointer.

http://search.cpan.org/~sendu/bioperl-1.5.2_102/Bio/DB/EUtilities.pm

Regards,
- Robert

On Fri, Apr 4, 2008 at 4:25 PM, Chris Fields <cjfields at uiuc.edu> wrote:
> Do you need something to access eutils via BioPerl, or are you looking for a
> specific set of classes?  I wrote an interface to eutils
> (Bio::DB::EUtilities), you could do something like this:
>
>  #!/usr/bin/perl -w
>
>  use strict;
>  use warnings;
>  use Bio::DB::EUtilities;
>
>  my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
>                                      -term => 'dihydroorotate',
>                                      -db => 'pcsubstance',
>                                      -retmax => 1000);
>
>  print join(',',$eutil->get_ids)."\n";
>
>  chris


From cjfields at uiuc.edu  Tue Apr  8 15:10:26 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 8 Apr 2008 10:10:26 -0500
Subject: [Bioperl-l] module for pubchem queries
In-Reply-To: <4145b6790804080709l20f1e56erf4b7af04b0a52870@mail.gmail.com>
References: <4145b6790804020524g33672578q535b287e93792bdd@mail.gmail.com>
	<15B44EC6-3660-4925-BA7A-6763D51E6837@uiuc.edu>
	<4145b6790804080709l20f1e56erf4b7af04b0a52870@mail.gmail.com>
Message-ID: <32D210FC-575E-4D95-95DA-FC6F5BE1FC24@uiuc.edu>

Just to note, the API has changed significantly from the interface  
in the 1.5.2 release.  The up-to-date (supported) interface is in  

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

I'm working on a full HOWTO, just haven't had time to get it up on the  
wiki yet.

chris

On Apr 8, 2008, at 9:09 AM, Robert Citek wrote:

> Wrapping bioperl around eutils will work just fine.  Thanks for the  
> pointer.
>
> http://search.cpan.org/~sendu/bioperl-1.5.2_102/Bio/DB/EUtilities.pm
>
> Regards,
> - Robert
>
> On Fri, Apr 4, 2008 at 4:25 PM, Chris Fields <cjfields at uiuc.edu>  
> wrote:
>> Do you need something to access eutils via BioPerl, or are you  
>> looking for a
>> specific set of classes?  I wrote an interface to eutils
>> (Bio::DB::EUtilities), you could do something like this:
>>
>> #!/usr/bin/perl -w
>>
>> use strict;
>> use warnings;
>> use Bio::DB::EUtilities;
>>
>> my $eutil = Bio::DB::EUtilities->new(-eutil => 'esearch',
>>                                     -term => 'dihydroorotate',
>>                                     -db => 'pcsubstance',
>>                                     -retmax => 1000);
>>
>> print join(',',$eutil->get_ids)."\n";
>>
>> chris
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cuiw at ncbi.nlm.nih.gov  Tue Apr  8 20:41:58 2008
From: cuiw at ncbi.nlm.nih.gov (Cui, Wenwu (NIH/NLM/NCBI) [C])
Date: Tue, 8 Apr 2008 16:41:58 -0400
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47F9F3AA.2090003@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com><264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
Message-ID: <6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>

Hi, Miguel:

id1_fetch can do it. Detailed instructions can be found at:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id1_fetch.html

Here is an example:

>id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
GI        Loaded      DB    Retrieval No.
--        ------      --    -------------
74311105  12/07/2007  NCBI  19766263
74311105  01/23/2007  NCBI  16325656
74311105  03/30/2006  NCBI  13131204
74311105  03/03/2006  NCBI  12915541
74311105  03/02/2006  NCBI  12885275
74311105  12/03/2005  NCBI  12259793
74311105  09/09/2005  NCBI  11257262
74311105  09/09/2005  NCBI  11242667
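
For use in a script, the earliest "Loaded" date in such a table can be 
picked out with a few lines of Perl. This is a minimal sketch, assuming 
the table text has already been captured in a string, that dates are 
MM/DD/YYYY as above, and that the earliest load date is an acceptable 
stand-in for the creation date; earliest_loaded() is a hypothetical 
helper, not part of id1_fetch or BioPerl:

```perl
use strict;
use warnings;

# Hypothetical helper: return a hash ref mapping each GI to its earliest
# "Loaded" date (as YYYY-MM-DD), given text laid out like the table above.
sub earliest_loaded {
    my ($text) = @_;
    my %first;
    for my $line (split /\n/, $text) {
        # Data rows: GI, MM/DD/YYYY, database, retrieval number.
        # Header and separator rows do not match and are skipped.
        next unless $line =~ m{^\s*(\d+)\s+(\d\d)/(\d\d)/(\d{4})\s+\S+\s+\d+\s*$};
        my ($gi, $iso) = ($1, "$4-$2-$3");
        # ISO-style dates compare correctly as plain strings
        $first{$gi} = $iso if !exists $first{$gi} || $iso lt $first{$gi};
    }
    return \%first;
}

# A shortened copy of the table above, used here as sample input.
my $table = <<'END';
GI        Loaded      DB    Retrieval No.
--        ------      --    -------------
74311105  12/07/2007  NCBI  19766263
74311105  01/23/2007  NCBI  16325656
74311105  09/09/2005  NCBI  11257262
74311105  09/09/2005  NCBI  11242667
END

my $first = earliest_loaded($table);
print "$_ first loaded $first->{$_}\n" for sort keys %$first;
```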

Wenwu Cui PhD
NCBI/NLM/NIH

> -----Original Message-----
> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
> Sent: Monday, April 07, 2008 6:13 AM
> Cc: bioperl-l at bioperl.org
> Subject: [Bioperl-l] GenBank entries creation dates
> 
> Hi all,
> 
> Is there any way to obtain the date of creation of individual GenBank
> entries? I don't mean the "last revision" date that can be found in the
> first line of a GenBank file.
> 
> I can access this creation date by looking at the "revision history" of
> any GenBank entry (for example, see
> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
> but I need a systematic (and local=fast) way to access this
> information.
> 
> Any help would be very appreciated,
> Thank you very much in advance,
> 
> M;
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From miguel.pignatelli at uv.es  Wed Apr  9 11:32:39 2008
From: miguel.pignatelli at uv.es (Miguel Pignatelli)
Date: Wed, 09 Apr 2008 13:32:39 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com><264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
	<6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>
Message-ID: <47FCA957.5040409@uv.es>

Wow, impressive. Thanks, Wenwu, for the information; I have never used 
this tool before. The problem is that I need the full revision 
history (or at least the creation date) for *all* the GIs present in nr 
(or at least a significant portion of them), and this tool queries 
via the web.

The existence of this tool confirms that this information is 
available somewhere. Is it possible to download the data that contains 
it?

Thanks again,

M;


Cui, Wenwu (NIH/NLM/NCBI) [C] wrote:
> Hi, Miguel:
> 
> id1_fetch can do it. Detailed instruction can be found at:  
> 
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id1_fetch.html
> 
> Here is an example:
> 
>> id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
> GI        Loaded      DB    Retrieval No.
> --        ------      --    -------------
> 74311105  12/07/2007  NCBI  19766263
> 74311105  01/23/2007  NCBI  16325656
> 74311105  03/30/2006  NCBI  13131204
> 74311105  03/03/2006  NCBI  12915541
> 74311105  03/02/2006  NCBI  12885275
> 74311105  12/03/2005  NCBI  12259793
> 74311105  09/09/2005  NCBI  11257262
> 74311105  09/09/2005  NCBI  11242667
> 
> Wenwu Cui PhD
> NCBI/NLM/NIH
> 
>> -----Original Message-----
>> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
>> Sent: Monday, April 07, 2008 6:13 AM
>> Cc: bioperl-l at bioperl.org
>> Subject: [Bioperl-l] GenBank entries creation dates
>>
>> Hi all,
>>
>> Is there any way to obtain the date of creation of individual GenBank
>> entries? I don't mean the "last revision" date that can be found in the
>> first line of a GenBank file.
>>
>> I can access this creation date by looking at the "revision history" of
>> any GenBank entry (for example, see
>> http://www.ncbi.nlm.nih.gov/entrez/sutils/girevhist.cgi?val=74311105),
>> but I need a systematic (and local=fast) way to access this
>> information.
>>
>> Any help would be very appreciated,
>> Thank you very much in advance,
>>
>> M;
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 


From cuiw at ncbi.nlm.nih.gov  Wed Apr  9 13:25:16 2008
From: cuiw at ncbi.nlm.nih.gov (Cui, Wenwu (NIH/NLM/NCBI) [C])
Date: Wed, 9 Apr 2008 09:25:16 -0400
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47FCA957.5040409@uv.es>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com><264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>
	<47F9F3AA.2090003@uv.es>
	<6F230E9769AA8D4EB4BC401DF133EDB7180BE0@NIHCESMLBX15.nih.gov>
	<47FCA957.5040409@uv.es>
Message-ID: <6F230E9769AA8D4EB4BC401DF133EDB7180BE1@NIHCESMLBX15.nih.gov>

Hi, Miguel,

I do not know whether the data file is publicly available. However,
you can run a 'real-time' query via id1_fetch:

####step 1: generate GI file #####
id1_fetch -query 'YOUR-GENBANK-QUERY-STRING' -lt none -db Nucleotide \
    -out qfile

####step 2: retrieve revisions for GIs stored in qfile #####

id1_fetch -lt revisions -qf qfile  -fmt fasta -db Nucleotide

Good luck!

Wenwu Cui

> -----Original Message-----
> From: Miguel Pignatelli [mailto:miguel.pignatelli at uv.es]
> Sent: Wednesday, April 09, 2008 7:33 AM
> To: Cui, Wenwu (NIH/NLM/NCBI) [C]
> Cc: bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] GenBank entries creation dates
> 
> Wow, impressive, thanks Wenwu for the information, I have never used
> this tool before. The problem is that I need to know all the revision
> history (or at least the creation date) for *all* the GIs present in
> nr (well, or at least a significant portion of it) and this tool
> queries via web.
> 
> The existence of this tool confirms me that this information is
> available somewhere, is it possible to download the data that contains
> this information?
> 
> Thanks again,
> 
> M;


From CALLEY_JOHN_N at LILLY.COM  Wed Apr  9 13:45:23 2008
From: CALLEY_JOHN_N at LILLY.COM (John N Calley)
Date: Wed, 9 Apr 2008 09:45:23 -0400
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <47FCA957.5040409@uv.es>
Message-ID: <OF73E5AA49.8E1EF918-ON85257426.004AF961-85257426.004B915C@EliLilly.lilly.com>

You might want to keep in mind that the creation date is not always 
reliable. I am aware of one example where the recorded creation date 
precedes the sequencing date by several months (as determined by the trace 
file date). NCBI was not able to explain exactly what happened but (as I 
recall) hypothesized that some dates had been scrambled in a database 
rebuild. If there was interest I could probably pull up more details.

John Calley


Miguel Pignatelli <miguel.pignatelli at uv.es> 
Sent by: bioperl-l-bounces at lists.open-bio.org
04/09/2008 07:32 AM
Please respond to
miguel.pignatelli at uv.es


To
"Cui, Wenwu (NIH/NLM/NCBI) [C]" <cuiw at ncbi.nlm.nih.gov>
cc
bioperl-l at bioperl.org
Subject
Re: [Bioperl-l] GenBank entries creation dates


Wow, impressive, thanks Wenwu for the information, I have never used 
this tool before. The problem is that I need to know all the revision 
history (or at least the creation date) for *all* the GIs present in nr 
(well, or at least a significant portion of it) and this tool queries 
via web.

The existence of this tool confirms me that this information is 
available somewhere, is it possible to download the data that contains 
this information?

Thanks again,

M;




From frederic.romagne at gmail.com  Wed Apr  9 20:45:50 2008
From: frederic.romagne at gmail.com (=?ISO-8859-1?Q?Fr=E9d=E9ric_Romagn=E9?=)
Date: Wed, 09 Apr 2008 15:45:50 -0500
Subject: [Bioperl-l] question about clustalw module.
Message-ID: <1207773950.483.13.camel@kiss-laptop>

Hello,

I have a problem when using Bio::Tools::Run::Alignment::Clustalw:

I give it an array reference (the array contains some FASTA sequences)
and all the right parameters, and I write the result via Bio::SeqIO.

The problem is that my result file contains only the accession number in
the header. An example:

The initial stream is:

>NM_052854 Homo sapiens cAMP responsive element binding protein 3-like 1 (CREB3L1), mRNA.
AGAAGACGTGCGGAGGGAGACGCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGG
GGGAGCACTTAGCTCCCCCGCCCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTC
AGCCCCAACCCCGGGCTCCCCATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGT
GGAGTCGGCTGAATGCCCACGGTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCG
CTGCCCTAAGGCCCCCGCGCGCCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCG
CCCCTCCCCCGGGGCTTCGCCCCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAG
GAGCTCTGGACTGGGCGCGCCGCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCC
CGGGAGCCGGCTGCGATGGACGCCGTCTTGGAACCCTTCCCGGCCGACAGGCTGTTCCCC
GGATCCAGCTTCCTGGACTTGGGGGATCTGAACGAGTCGGACTTCCTCAACAATGCGCAC

...

the result file is :

>NM_052854
---------------------------------------AGAAGACGTGCGGAGGGAGAC
GCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGGGGGAGCACTTAGCTCCCCCGC
CCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTCAGCCCCAACCCCGGGCTCCCC
ATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGTGGAGTCGGCTGAATGCCCACG
GTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCGCTGCCCTAAGGCCCCCGCGCG
CCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCGCCCCTCCCCCGGGGCTTCGCC
CCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAGGAGCTCTGGACTGGGCGCGCC
GCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCCCGGGAGCCGGCTGCGATGGAC

...

So I lost the other information provided by the header.

Is there any option to keep this information?

Here is part of my code with my options:


 my $seq_ref=\@seq;
 my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM', 'quiet' => 1,
		'output' => 'FASTA');
 my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
 my $aln = $factory->align($seq_ref);


Thank you.


From jason at bioperl.org  Wed Apr  9 20:55:13 2008
From: jason at bioperl.org (Jason Stajich)
Date: Wed, 9 Apr 2008 13:55:13 -0700
Subject: [Bioperl-l] question about clustalw module.
In-Reply-To: <1207773950.483.13.camel@kiss-laptop>
References: <1207773950.483.13.camel@kiss-laptop>
Message-ID: <C126E560-1A36-461E-ADAD-774446B9DB9E@bioperl.org>

The Clustal alignment format has no place for the description. If you
want to preserve it you'll have to add it back yourself: build a hash
keyed by sequence ID that stores each description, then, when you get
your alignment back, update the description field of each sequence
before writing it out with AlignIO.
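
The hash-by-ID idea above can be sketched in plain Perl, working on FASTA
text rather than on BioPerl objects. The helper names descriptions_by_id
and restore_descriptions are made up for this illustration, and the sample
header is shortened; with real BioPerl sequence objects the same pattern
would presumably go through each sequence's ID and description accessors
instead of regexes.

```perl
use strict;
use warnings;

# Hypothetical helper: map each FASTA ID (first word of the header line)
# to the rest of that header line, i.e. the description.
sub descriptions_by_id {
    my ($fasta_text) = @_;
    my %desc_for;
    for my $line (split /\n/, $fasta_text) {
        next unless $line =~ /^>(\S+)\s*(.*)$/;
        $desc_for{$1} = $2;
    }
    return \%desc_for;
}

# Hypothetical helper: put the stored descriptions back onto the headers
# of aligned FASTA text whose headers carry only the ID.
sub restore_descriptions {
    my ($aligned_text, $desc_for) = @_;
    my @out;
    for my $line (split /\n/, $aligned_text) {
        my $new = $line;
        if ($line =~ /^>(\S+)/) {
            my $id   = $1;
            my $desc = $desc_for->{$id};
            $new = ">$id $desc" if defined $desc && length $desc;
        }
        push @out, $new;
    }
    return join("\n", @out) . "\n";
}

# Shortened sample data, not the full record from the original post.
my $input   = ">NM_052854 Homo sapiens CREB3L1, mRNA.\nAGAAGACGTG\n";
my $aligned = ">NM_052854\n---AGAAGACGTG\n";

print restore_descriptions($aligned, descriptions_by_id($input));
```

Sequence lines pass through untouched; only header lines whose ID has a
stored, non-empty description are rewritten.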

-jason
On Apr 9, 2008, at 1:45 PM, Frédéric Romagné wrote:

> Hello,
>
> i have a problem when using Bio::Tools::Run::Alignment::Clustalw :
>
> I give it an array_ref scalar (the array contains some fasta  
> sequences)
> and all the good parameters and i write the result via  Bio::SeqIO.
>
> The fact is that my result file only contains the Accession number in
> the header... An example :
>
> the initial stream is :
>
>> NM_052854 Homo sapiens cAMP responsive element binding protein 3- 
>> like 1
> (CREB3L1), mRNA.
> AGAAGACGTGCGGAGGGAGACGCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGG
> GGGAGCACTTAGCTCCCCCGCCCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTC
> AGCCCCAACCCCGGGCTCCCCATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGT
> GGAGTCGGCTGAATGCCCACGGTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCG
> CTGCCCTAAGGCCCCCGCGCGCCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCG
> CCCCTCCCCCGGGGCTTCGCCCCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAG
> GAGCTCTGGACTGGGCGCGCCGCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCC
> CGGGAGCCGGCTGCGATGGACGCCGTCTTGGAACCCTTCCCGGCCGACAGGCTGTTCCCC
> GGATCCAGCTTCCTGGACTTGGGGGATCTGAACGAGTCGGACTTCCTCAACAATGCGCAC
>
> ...
>
> the result file is :
>
>> NM_052854
> ---------------------------------------AGAAGACGTGCGGAGGGAGAC
> GCAGAGACAGAGGAGAGGCCGGCAGCCACCCAGTCTCGGGGGAGCACTTAGCTCCCCCGC
> CCCGGCTCCCACCCTGTCCGGGGGGCTCCTGAAGCCCTCAGCCCCAACCCCGGGCTCCCC
> ATGGAAGCCAGCTGTGCCCCAGGAGGAGCAGGAGGAGGTGGAGTCGGCTGAATGCCCACG
> GTGCGCCCGGGGCCCCTGAGCCCATCCCGCTCCTAGCCGCTGCCCTAAGGCCCCCGCGCG
> CCCCGCGCCCCCCACCCGGGGCCGCGCCGCCTCCGTCCGCCCCTCCCCCGGGGCTTCGCC
> CCGGACCTGCCCCCCGCCCGTTTGCCAGCGCTCAGGCAGGAGCTCTGGACTGGGCGCGCC
> GCCGCCCTGGAGTGAGGGAAGCCCAGTGGAAGGGGGTCCCGGGAGCCGGCTGCGATGGAC
>
> ...
>
> So i lost the other informations provided by the header...
>
> Is there any option to keep these informations?
>
> Here is a part of my code with my options :
>
>
>  my $seq_ref=\@seq;
>  my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM', 'quiet' => 1,
> 		'output' => 'FASTA');
>  my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
>  my $aln = $factory->align($seq_ref);
>
>
> Thank you.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From lamq at usal.es  Thu Apr 10 15:52:24 2008
From: lamq at usal.es (Luis A. M. Quintales)
Date: Thu, 10 Apr 2008 17:52:24 +0200
Subject: [Bioperl-l] xyplot glyph problem with previous aggregation
Message-ID: <47FE37B8.9090404@usal.es>

I am not able to add xyplot glyphs to a panel because I am having
problems with aggregation.

Using this GFF file:

##sequence-region chr1 1 5578650
chr1  atfreq  atpc    1  50   58.8000   .  .  atpc 1
chr1  atfreq  atpc   51 100   58.4000   .  .  atpc 1
chr1  atfreq  atpc  101 150   57.6000   .  .  atpc 1
chr1  atfreq  atpc  151 200   57.8000   .  .  atpc 1
. . .


And this source code for preparing the aggregated features necessary for
the xyplot glyph:

my $filin  = $ARGV[0];
my $db = Bio::DB::GFF->new( -dsn => $filin,
                            -adaptor => 'memory',
                            -aggregator => 'at{atpc:atfreq}'
                           );
my $segment  = $db->segment('chr1');
my @features1 = $db->features('atpc');
print "$#features1 \n";
my @features2 = $segment->features('atpc');
print "$#features2 \n";
my @features3 = $db->features('at');
print "$#features3 \n";
my @features4 = $segment->features('at');
print "$#features4 \n";

I obtain:

111572
111572
0
0

What am I doing wrong with the aggregator?

Many thanks.


From lincoln.stein at gmail.com  Thu Apr 10 17:55:06 2008
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Thu, 10 Apr 2008 13:55:06 -0400
Subject: [Bioperl-l] xyplot glyph problem with previous aggregation
In-Reply-To: <47FE37B8.9090404@usal.es>
References: <47FE37B8.9090404@usal.es>
Message-ID: <6dce9a0b0804101055w65e22abfgaa4f155751fef40f@mail.gmail.com>

Hi Luis,

When you aggregate the atpc 1 features together, you end up with one
feature. Thus @features3 is an array of size 1. The $# operator returns the
index of the last element, which is 0. If @features3 were empty, $#features3
would return -1.
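
A small stand-alone sketch of the difference between the last index and
the element count (the helper name last_index_and_count is made up for
this illustration, and the string 'at' merely stands in for the single
aggregated feature):

```perl
use strict;
use warnings;

# Hypothetical helper: return both the last index ($#items) and the
# number of elements (scalar @items) of the list passed in.
sub last_index_and_count {
    my @items = @_;
    return ($#items, scalar @items);
}

my @aggregated = ('at');    # stand-in for the one aggregated feature
my @empty      = ();

# One element: last index 0, count 1.
printf "last index %d, count %d\n", last_index_and_count(@aggregated);

# No elements: last index -1, count 0.
printf "last index %d, count %d\n", last_index_and_count(@empty);
```

To print how many features came back, use scalar(@features3) rather than
$#features3.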

Lincoln

On Thu, Apr 10, 2008 at 11:52 AM, Luis A. M. Quintales <lamq at usal.es> wrote:

> I am not able to add xyplot glyphs to one panel because I have some
> problems with the aggregations.
>
> Using that GFF file:
>
> ##sequence-region chr1 1 5578650
> chr1  atfreq  atpc    1  50   58.8000   .  .  atpc 1
> chr1  atfreq  atpc   51 100   58.4000   .  .  atpc 1
> chr1  atfreq  atpc  101 150   57.6000   .  .  atpc 1
> chr1  atfreq  atpc  151 200   57.8000   .  .  atpc 1
> . . .
>
>
> And this source code for preparing the aggregated features necessary for
> the xyplot glyph:
>
> my $filin  = $ARGV[0];
> my $db = Bio::DB::GFF->new( -dsn => $filin,
>                           -adaptor => 'memory',
>                           -aggregator => 'at{atpc:atfreq}'
>                          );
> my $segment  = $db->segment('chr1');
> my @features1 = $db->features('atpc');
> print "$#features1 \n";
> my @features2 = $segment->features('atpc');
> print "$#features2 \n";
> my @features3 = $db->features('at');
> print "$#features3 \n";
> my @features4 = $segment->features('at');
> print "$#features4 \n";
>
> I obtain:
>
> 111572
> 111572
> 0
> 0
>
> What I am doing wrong with the aggregator?
>
> Many thanks.
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From adsj at novozymes.com  Fri Apr 11 08:53:23 2008
From: adsj at novozymes.com (Adam =?iso-8859-1?Q?Sj=F8gren?=)
Date: Fri, 11 Apr 2008 10:53:23 +0200
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
Message-ID: <87d4owixh8.fsf@topper.koldfront.dk>

  Hi.

I am trying to make Bio::SeqIO return objects of my own type (a small
extension of Bio::Seq::RichSeq), by setting -seqfactory. I am having a
little trouble creating the correct object to pass with -seqfactory:

Following the example given in SYNOPSIS of Bio::Factory::SequenceFactoryI,
I get this error:

 $ perl -e '
 >            use Bio::Seq::SeqFactory;
 >            my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq');
 > 
 >            my $seq = $seqbuilder->create(-seq => 'ACTGAT',
 >                                          -display_id => 'exampleseq');
 > 
 >            print "seq is a ", ref($seq), "\n";
 > '

 ------------- EXCEPTION: Bio::Root::Exception -------------
 MSG: Can't locate type.pm in @INC (@INC contains: /z/bio/biotools/bioinfperlmodules/ /z/bio/adm/modules /etc/perl /usr/local/lib/perl/5.8.8 /usr/local/share/perl/5.8.8 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.8 /usr/share/perl/5.8 /usr/local/lib/site_perl .) at (eval 13) line 3.
 : Unrecognized Sequence type for SeqFactory 'type'
 STACK: Error::throw
 STACK: Bio::Root::Root::throw /usr/share/perl/5.8/Bio/Root/Root.pm:357
 STACK: Bio::Seq::SeqFactory::type /usr/share/perl/5.8/Bio/Seq/SeqFactory.pm:134
 STACK: Bio::Seq::SeqFactory::new /usr/share/perl/5.8/Bio/Seq/SeqFactory.pm:93
 STACK: -e:3
 -----------------------------------------------------------
 $ 

If I use "Bio::Seq::SeqFactory->new('Bio::PrimarySeq'=>1)" instead, for
instance, it seems to work:

 $ perl -e '
 >            use Bio::Seq::SeqFactory;
 >            my $seqbuilder = Bio::Seq::SeqFactory->new('Bio::PrimarySeq'=>1);
 > 
 >            my $seq = $seqbuilder->create(-seq => 'ACTGAT',
 >                                          -display_id => 'exampleseq');
 > 
 >            print "seq is a ", ref($seq), "\n";
 > '
 seq is a Bio::PrimarySeq
 $ 

I was about to write a patch for the pod, when I realized that I'd
better start by asking: Is this a buglet in the pod or the code?

  Best regards,

    Adam

-- 
                                                          Adam Sjøgren
                                                    adsj at novozymes.com


From hlapp at gmx.net  Fri Apr 11 15:35:54 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 11 Apr 2008 11:35:54 -0400
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
In-Reply-To: <87d4owixh8.fsf@topper.koldfront.dk>
References: <87d4owixh8.fsf@topper.koldfront.dk>
Message-ID: <0037240B-F469-4388-972A-324101B11621@gmx.net>


On Apr 11, 2008, at 4:53 AM, Adam Sjøgren wrote:
>  $ perl -e '
>>            use Bio::Seq::SeqFactory;
>>            my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq');


You need to prefix the argument with a dash: '-type', not 'type'.  
Otherwise, it assumes that the class you want instantiated is 'type.pm'.

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From 1zoujing at 163.com  Thu Apr 10 05:08:52 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 9 Apr 2008 22:08:52 -0700 (PDT)
Subject: [Bioperl-l]  Bio::ASN1::EntrezGene parse so slowly?
Message-ID: <16602210.post@talk.nabble.com>


  I want to parse the file "gene_info" from NCBI. The format of Gene data at
NCBI is ASN.1, right? So I used Bio::ASN1::EntrezGene, but it either does not
work properly or is too slow. The file is about 500 MB.
  The code is as follows:
  use Bio::ASN1::EntrezGene;
  my $parser = Bio::ASN1::EntrezGene->new('file' => $ARGV[0]);
  my $i = 0;
  while (my $result = $parser->next_seq) {
      last;  # real work would go here; 'last' is used only for testing
  }

  When execution reaches the "while" loop, it keeps processing and never
exits, even though I put "last" inside the loop.
   So I wonder whether the module is too slow, is not suited to this job,
or whether I did something wrong.

  Thank you!
-- 
View this message in context: http://www.nabble.com/Bio%3A%3AASN1%3A%3AEntrezGene-parse-so-slowly--tp16602210p16602210.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From 1zoujing at 163.com  Thu Apr 10 06:17:41 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 9 Apr 2008 23:17:41 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl Sus_scrofa.ags"
Message-ID: <16602770.post@talk.nabble.com>


   I am new to BioPerl. When I run
"parse_entrez_gene_example.pl Sus_scrofa.ags", I get this error message:
     Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
  
   But Sus_scrofa.ags was downloaded from NCBI in ASN.1 format, so it should
be the same as the Homo_sapiens file in the example. There should be no
error, since the code is Mingyi's example.
   I wonder why this happens. Should I change something about the file?
    
-- 
View this message in context: http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16602770.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From 1zoujing at 163.com  Thu Apr 10 07:10:26 2008
From: 1zoujing at 163.com (zoujing)
Date: Thu, 10 Apr 2008 00:10:26 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
Message-ID: <16603225.post@talk.nabble.com>


I searched the web and found the answer; I quote it here:
   The error was thrown by my Bio::ASN1::EntrezGene module because it 
expects a text file, while you fed it with a binary file.  To use 
gzipped ASN binary file from NCBI, download the NCBI gene2xml 
(ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml), 
then use this syntax to run my parser on the binary files: 

# Homo_sapiens.ags.gz is the gzipped binary file directly downloaded from NCBI
my $parser = Bio::ASN1::EntrezGene->new(
    'file' => "gene2xml -i Homo_sapiens.ags.gz -c -x -b | ");

Same syntax should be used when you're using SeqIO (thus SeqIO::entrezgene). 
Mingyi

   But there is still one thing: I want to parse "gene_info.gz" from NCBI
Gene (ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz), and it doesn't
work. Does that mean "gene_info.gz" (tab-delimited, one line per GeneID,
with the column header as the first line of the file) is not the right
format for Bio::ASN1::EntrezGene?
 
      
zoujing wrote:
> 
>    I am a geen hand in Bioperl. When I run perl with
> "parse_entrez_gene_example.pl Sus_scrofa.ags", it turned out the error
> information:
>      Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>   
>    But the Sus_scrofa.ags is download from NCBI, with the format of ASN1,
> should be the same as Homo_sapiens in the example. So it should be no
> error as the code is the example from Mingyi.
>    I wonder why this happen, and should I change something about the file? 
>     
> 

-- 
View this message in context: http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16603225.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From stefan.kirov at bms.com  Fri Apr 11 19:59:29 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Fri, 11 Apr 2008 15:59:29 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <16602770.post@talk.nabble.com>
References: <16602770.post@talk.nabble.com>
Message-ID: <Pine.WNT.4.64.0804111557210.2384@A161887.one.ads.bms.com>

AGS is a binary ASN.1 format and WILL NOT be parsed! You have to use 
gene2xml (weird, but this is NCBI) with these flags: -c -x -b -i. This 
will produce text ASN.1, which can be parsed.
Stefan

On Wed, 9 Apr 2008, zoujing wrote:

>
>   I am a green hand in Bioperl. When I run
> "parse_entrez_gene_example.pl Sus_scrofa.ags", I get this error
> message:
>     Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>
>   But Sus_scrofa.ags was downloaded from NCBI in ASN.1 format, so it
> should be the same as Homo_sapiens in the example. There should be no error,
> since the code is the example from Mingyi.
>   I wonder why this happens. Should I change something about the file?
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From stefan.kirov at bms.com  Fri Apr 11 20:01:30 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Fri, 11 Apr 2008 16:01:30 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <16603225.post@talk.nabble.com>
References: <16603225.post@talk.nabble.com>
Message-ID: <Pine.WNT.4.64.0804111600310.2384@A161887.one.ads.bms.com>

It is not. If you use this file, why would you need a parser for it 
anyway? Just split on \t or read with OpenOffice or equiv.
Stefan

On Thu, 10 Apr 2008, zoujing wrote:

>
> Searched the web and found the answer now; I quote it below:
>   The error was thrown by my Bio::ASN1::EntrezGene module because it
> expects a text file, while you fed it with a binary file.  To use
> gzipped ASN binary file from NCBI, download the NCBI gene2xml
> (ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml),
> then use this syntax to run my parser on the binary files:
>
> my $parser = Bio::ASN1::EntrezGene->new('file' => "gene2xml -i
> Homo_sapiens.ags.gz -c -x -b | "); # Homo_sapiens.ags.gz is the gzipped
> binary file directly downloaded from NCBI
>
> Same syntax should be used when you're using SeqIO (thus SeqIO::entrezgene).
> Mingyi
>
>   But there is still one thing: I want to parse "gene_info.gz" from NCBI
> Gene. It doesn't work. Does that mean "gene_info.gz" (tab-delimited, one line
> per GeneID, with the column header line as the first line in the file)
> is not the right format for Bio::ASN1::EntrezGene?
>


From asjo at koldfront.dk  Fri Apr 11 19:39:59 2008
From: asjo at koldfront.dk (Adam =?iso-8859-1?Q?Sj=F8gren?=)
Date: Fri, 11 Apr 2008 21:39:59 +0200
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
In-Reply-To: <0037240B-F469-4388-972A-324101B11621@gmx.net> (Hilmar Lapp's
	message of "Fri, 11 Apr 2008 11:35:54 -0400")
References: <87d4owixh8.fsf@topper.koldfront.dk>
	<0037240B-F469-4388-972A-324101B11621@gmx.net>
Message-ID: <877if4i3jk.fsf@topper.koldfront.dk>

On Fri, 11 Apr 2008 11:35:54 -0400, Hilmar wrote:

> On Apr 11, 2008, at 4:53 AM, Adam Sjøgren wrote:

>>> my $seqbuilder = Bio::Seq::SeqFactory->new('type' =>
>>> 'Bio::PrimarySeq');

> You need to prefix the argument with a dash: '-type', not 'type'. 
> Otherwise, it assumes that the class you want instantiated is
> 'type.pm'.

I guess that means I should submit a patch for the SYNOPSIS. Attached.


   Thanks,

    Adam


Index: Bio/Factory/SequenceFactoryI.pm
===================================================================
--- Bio/Factory/SequenceFactoryI.pm	(revision 14654)
+++ Bio/Factory/SequenceFactoryI.pm	(working copy)
@@ -20,7 +20,7 @@
 # get a Bio::Factory::SequenceFactoryI object like
 
     use Bio::Seq::SeqFactory;
-    my $seqbuilder = Bio::Seq::SeqFactory->new('type' => 'Bio::PrimarySeq');
+    my $seqbuilder = Bio::Seq::SeqFactory->new('-type' => 'Bio::PrimarySeq');
 
     my $seq = $seqbuilder->create(-seq => 'ACTGAT',
 				  -display_id => 'exampleseq');

-- 
 "Well, I'm a moon around you"                                Adam Sjøgren
                                                         asjo at koldfront.dk


From bamboowarrior at gmail.com  Fri Apr 11 23:10:35 2008
From: bamboowarrior at gmail.com (Arkady)
Date: Fri, 11 Apr 2008 18:10:35 -0500
Subject: [Bioperl-l] Nucleotide Links in Gene DB (GenBank)
Message-ID: <91656c3f0804111610r24c8fa5es5bcb56b7a59e0208@mail.gmail.com>

Hi everyone, I'm a bioperl n00b. Actually, kind of a genbank n00b,
too, as I'm from a CS background and just started bio things last
June.

I'm trying to set up an analysis pipeline of primate protein CDSs (the
nucleotide seqs). I've written a script which does a pretty decent job
of downloading these from GenBank--but it's inconsistent, because a
lot of sequences in nucleotide are 'predicted' and named LOCthisorthat
instead of by gene name.

So what I was thinking was this (assume ANKRD43 is the gene for this example):

1. Search 'gene' database for ANKRD43 AND (PRI*[ORGN])
On NCBI, there's an option to show all nucleotide links. How do I get
a list of those in bioperl? Can bioperl even search 'gene', or just
'nucleotide'?

2. Search 'nucleotide' for the referenced items from #1, and also for
ANKRD43[TITL] AND (PRI*[ORGN]), save CDSes.

3. BLAST mRNA for one of those CDSes, see if we pick up any other matches.

4. BLAT other primates for CDSes, see if we find anything not in GenBank.


On the other hand, I always get the feeling I'm doing things the hard
way--especially here, with #1 and #2. Is there a much more obvious,
simple way to do this?

Thanks, folks.


Cheers,
John Woods

Institute for Cellular and Molecular Biology
The University of Texas at Austin


From hlapp at gmx.net  Fri Apr 11 23:19:44 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Fri, 11 Apr 2008 19:19:44 -0400
Subject: [Bioperl-l] Bio::Factory::SequenceFactoryI SYNOPSIS example
In-Reply-To: <877if4i3jk.fsf@topper.koldfront.dk>
References: <87d4owixh8.fsf@topper.koldfront.dk>
	<0037240B-F469-4388-972A-324101B11621@gmx.net>
	<877if4i3jk.fsf@topper.koldfront.dk>
Message-ID: <B4B3CAD0-C346-470C-98D7-D6CBFE116109@gmx.net>

Thanks, applied. -hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From mmokrejs at ribosome.natur.cuni.cz  Sat Apr 12 01:32:14 2008
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Sat, 12 Apr 2008 03:32:14 +0200
Subject: [Bioperl-l] [BioSQL-l] Loading sequences with novel NCBI
	taxon_id
In-Reply-To: <CE3675B2-2AFD-46AA-A348-16C9FEA51E0E@uiuc.edu>
References: <320fb6e00803130806w46148bacm54c3ead9a50b038f@mail.gmail.com>	<32EB5B0C-4CC8-4C33-9F41-5D4465B6AC48@gmx.net>	<320fb6e00803131613o20eae2b7y325814ef26d2738f@mail.gmail.com>	<CEA4F4E7-A66B-4C62-AE32-511E177BC485@gmx.net>	<93b45ca50803140648s5098a7d0sec621f448ef03040@mail.gmail.com>
	<CE3675B2-2AFD-46AA-A348-16C9FEA51E0E@uiuc.edu>
Message-ID: <4800111E.3030802@ribosome.natur.cuni.cz>

Chris Fields wrote:
> The counter to that perspective (using new sequences with old tax info) 
> would be to regularly update NCBI taxonomy, particularly in 
> circumstances prior to adding new sequences.  Hilmar mentioned that once 
> tax is loaded it doesn't take as long to update, so you could set up a 
> cron job to update regularly.
> 
> I remember someone mentioning weekly or monthly updates on the list 
> quite a while ago, but I'm unsure how often NCBI updates tax information 
> (i.e. with every release, monthly, weekly, etc).  I can see instances 
> popping up where you used an up-to-date taxonomy but a new sequence 
> contains a tax ID not present.  I think bioperl-db handles these but I'm 
> not sure what other Bio* do.
> 

I spent some time benchmarking this and inspecting the mysql log files.
The current load_ncbi_taxonomy.pl script, with a minor modification to
show timestamps, behaves as follows on an initial import into mysql and
then on an update of the database using exactly the same dataset (it
still has to walk through all the data):

$ ./load_ncbi_taxonomy.pl --dbname=biosqldb --driver=mysql --host=127.0.01 \
  --port=3306 --directory=/home/mmokrejs/bioinformatics/databases/ncbitax/dump \
  --chunksize=0 --verbose=2 --mycnf=~/.my.cnf
Sat Apr 12 01:58:43 MEST 2008
Loading NCBI taxon database in /home/mmokrejs/bioinformatics/databases/ncbitax/dump:
       ... retrieving all taxon nodes in the database
Sat Apr 12 01:58:43 MEST 2008
       ... reading in taxon nodes from nodes.dmp
Sat Apr 12 01:58:58 MEST 2008
       ... insert / update / delete taxon nodes
                10000/421098 done (in 5 secs, 2000.0 rows/s)
                20000/421098 done (in 4 secs, 2500.0 rows/s)
...
                420000/421098 done (in 4 secs, 2500.0 rows/s)
Sat Apr 12 02:02:21 MEST 2008
       ... (committing nodes)
Sat Apr 12 02:02:21 MEST 2008
       ... rebuilding nested set left/right values
                10000 done (in 24 secs, 416.7 rows/s)
                20000 done (in 26 secs, 384.6 rows/s)
                30000 done (in 24 secs, 416.7 rows/s)
...
                420004 done (in 23 secs, 434.8 rows/s)
Sat Apr 12 02:19:25 MEST 2008
       ... reading in taxon names from names.dmp
Sat Apr 12 02:19:25 MEST 2008
       ... deleting old taxon names
Sat Apr 12 02:19:25 MEST 2008
       ... inserting new taxon names
                10000 done (in 8 secs, 1250.0 rows/s)
                20000 done (in 8 secs, 1250.0 rows/s)
...
                580000 done (in 5 secs, 2000.0 rows/s)
Sat Apr 12 02:24:48 MEST 2008
       ... cleaning up
Sat Apr 12 02:24:49 MEST 2008
Done.
$


I decided to re-import the same data to mimic at least somehow
the future updates, although no record should be UPDATEd,
except zapping left and right values with NULL. :((

$ ./load_ncbi_taxonomy.pl --dbname=biosqldb --driver=mysql --host=127.0.01 \
  --port=3306 --directory=/home/mmokrejs/bioinformatics/databases/ncbitax/dump \
  --chunksize=0 --verbose=2 --mycnf=~/.my.cnf
Sat Apr 12 02:35:20 MEST 2008
Loading NCBI taxon database in /home/mmokrejs/bioinformatics/databases/ncbitax/dump:
        ... retrieving all taxon nodes in the database
Sat Apr 12 02:35:26 MEST 2008
       ... reading in taxon nodes from nodes.dmp
Sat Apr 12 02:35:46 MEST 2008
       ... insert / update / delete taxon nodes
                10000/421098 done (in 0 secs, 10000.0 rows/s)
                20000/421098 done (in 0 secs, 10000.0 rows/s)
...
                410000/421098 done (in 0 secs, 10000.0 rows/s)
                420000/421098 done (in 0 secs, 10000.0 rows/s)
Sat Apr 12 02:35:55 MEST 2008
       ... (committing nodes)
Sat Apr 12 02:35:55 MEST 2008
       ... rebuilding nested set left/right values
                10000 done (in 9 secs, 1111.1 rows/s)
                20000 done (in 9 secs, 1111.1 rows/s)
...
                410004 done (in 8 secs, 1250.0 rows/s)
                420004 done (in 9 secs, 1111.1 rows/s)
Sat Apr 12 02:41:54 MEST 2008
       ... reading in taxon names from names.dmp
Sat Apr 12 02:41:54 MEST 2008
       ... deleting old taxon names
Sat Apr 12 02:41:55 MEST 2008
       ... inserting new taxon names
                10000 done (in 5 secs, 2000.0 rows/s)
                20000 done (in 5 secs, 2000.0 rows/s)
...
                570000 done (in 6 secs, 1666.7 rows/s)
                580000 done (in 5 secs, 2000.0 rows/s)
Sat Apr 12 02:47:27 MEST 2008
       ... cleaning up
Sat Apr 12 02:47:27 MEST 2008
Done.
$ ls -la /var/log/mysql/mysql.log 
-rw-rw---- 1 mysql mysql 483443314 Apr 12 03:15 /var/log/mysql/mysql.log
$

Pentium4 M laptop, 1.8GHz, 1 GB RAM, mysql-5.0.56 with SQL text
logging enabled (logging all SQL commands as text is the slow variant
compared to binary logging). The log was cleared before the tests.
I could provide some bits from the log or upload it somewhere
if anybody else would like to dig into the details.


I believe the recalculation step could be made faster. See what
happens:

                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '1' ORDER BY ncbi_taxon_id
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '10239' ORDER BY ncbi_taxon_id
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '12333' ORDER BY ncbi_taxon_id
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '12335' ORDER BY ncbi_taxon_id
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE left_value = '4'
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE right_value = '5'
                     31 Query       UPDATE taxon SET left_value = '4', right_value = '5' WHERE taxon_id = '12335'
                     31 Query       SELECT taxon_id, left_value, right_value FROM taxon WHERE parent_taxon_id = '12340' ORDER BY ncbi_taxon_id
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE left_value = '6'
                     31 Query       UPDATE taxon SET left_value = NULL, right_value = NULL WHERE right_value = '7'
                     31 Query       UPDATE taxon SET left_value = '6', right_value = '7' WHERE taxon_id = '12340'


The columns left_value and right_value are NULL when
the table is created, so there is no need to write NULL into
them again. Avoiding that would mean writing a wrapper function that
mimics update() but first does a 'SELECT * FROM',
compares the stored values with those to be written, and includes in the
final UPDATE statement only those columns whose values have
changed. We use such a smart wrapper for our code in python.
;-)
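
A minimal sketch of such a wrapper, in Perl to match the script under
discussion. It only covers the compare-and-build step: the function names
changed_columns and build_update are hypothetical, and the current row is
assumed to have been fetched already (for example with a SELECT on
taxon_id). Table and column names are the ones seen in the log.

```perl
use strict;
use warnings;

# Return a hash ref of the columns whose new value differs from the
# current one. undef stands for SQL NULL.
sub changed_columns {
    my ($current, $new) = @_;
    my %changed;
    for my $col (sort keys %$new) {
        my ($old, $val) = ($current->{$col}, $new->{$col});
        next if !defined $old && !defined $val;                 # NULL stays NULL
        next if defined $old && defined $val && $old eq $val;   # unchanged
        $changed{$col} = $val;
    }
    return \%changed;
}

# Build an UPDATE with placeholders for the changed columns only.
# Returns (sql, bind values...) or an empty list if nothing changed.
sub build_update {
    my ($table, $key_col, $key_val, $changed) = @_;
    my @cols = sort keys %$changed;
    return unless @cols;
    my $sql = "UPDATE $table SET "
            . join(', ', map { "$_ = ?" } @cols)
            . " WHERE $key_col = ?";
    return ($sql, @{$changed}{@cols}, $key_val);
}

# Demo: only left_value and right_value differ, so only they are written.
my $current = { left_value => undef, right_value => undef, parent_taxon_id => 12333 };
my $new     = { left_value => 4,     right_value => 5,     parent_taxon_id => 12333 };
my ($sql, @bind) = build_update('taxon', 'taxon_id', 12335,
                                changed_columns($current, $new));
print "$sql  [@bind]\n" if defined $sql;
```

With DBI, the result could be passed on as $dbh->do($sql, undef, @bind),
and the call skipped entirely when build_update returns nothing. Whether
the extra SELECT per row pays off depends on how many rows really change.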

When the columns for left and right are to be made NULL during
update of an existing database, I think it would be much faster
to drop the columns and re-create them again with NULL values.


I think it would be worth investigating whether the taxon and
taxon_name tables could be created empty as MyISAM tables and
converted into InnoDB tables only after all the imports and updates.
One would probably have to think a bit more about the foreign
keys, but it might be that they are not even lost during the conversion
back and forth.

Actually, this is easy to check. Dump your current taxon and taxon_name
tables (maybe even without the row data, using mysqldump's --no-data), run
'ALTER TABLE taxon ... type=MyISAM'
followed by
'ALTER TABLE taxon ... type=InnoDB',
dump the database structure again and compare it by diff with
the original.

But, time for sleep here.
Martin


From sdavis2 at mail.nih.gov  Sat Apr 12 03:50:44 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 11 Apr 2008 23:50:44 -0400
Subject: [Bioperl-l] Bio::ASN1::EntrezGene parse so slowly?
In-Reply-To: <16602210.post@talk.nabble.com>
References: <16602210.post@talk.nabble.com>
Message-ID: <264855a00804112050gf785c2ei66d9c7463597eccd@mail.gmail.com>

gene_info is a tab-delimited text file, if I recall correctly.  Have
you looked at it?  If it is, you should be able to parse it in a few
seconds with just a couple lines of code.
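
A minimal sketch of what those couple of lines might look like, using only
core Perl. It assumes the header line is itself tab-delimited, which should
be checked against the actual file; the function name each_tab_row and the
sample column names are made up for the example. Rows are handed to a
callback one at a time so that a file of several hundred MB never has to
sit in memory.

```perl
use strict;
use warnings;

# Call $callback once per data line of a tab-delimited table.
# The first line is taken as the column header; a leading '#' on it is
# stripped (an assumption about the file, not something stated here).
# Returns the number of data lines processed.
sub each_tab_row {
    my ($fh, $callback) = @_;
    my $header = <$fh>;
    return 0 unless defined $header;
    chomp $header;
    $header =~ s/^#\s*//;
    my @cols = split /\t/, $header;

    my $count = 0;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';
        my %row;
        @row{@cols} = split /\t/, $line, -1;   # -1 keeps trailing empty fields
        $callback->(\%row);
        $count++;
    }
    return $count;
}

# Demo on an in-memory sample; column names and values are made up.
my $sample = "GeneID\tSymbol\tdescription\n1\tAAA\tfirst example\n2\tBBB\t\n";
open(my $fh, '<', \$sample) or die "cannot open in-memory sample: $!";
my $n = each_tab_row($fh, sub {
    my ($row) = @_;
    print "$row->{GeneID} => $row->{Symbol}\n";
});
close $fh;
print "$n rows\n";
```

For the real file, the handle could come from a pipe such as
open(my $fh, '-|', 'gzip', '-dc', 'gene_info.gz'), so the download does not
need to be unpacked first.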

Sean


On Thu, Apr 10, 2008 at 1:08 AM, zoujing <1zoujing at 163.com> wrote:
>
>   I want to parse a file "gene_info" from NCBI. The format of Gene in NCBI is
>  ASN1, right? So I used Bio::ASN1::EntrezGene. But it didn't work
>  properly, or is too slow. The file is about 500M.
>   The code is as follows:
>   use Bio::ASN1::EntrezGene;
>   my $parser = Bio::ASN1::EntrezGene->new('file' => $ARGV[0]);
>   my $i = 0;
>   while (my $result = $parser->next_seq)
>   {
>       last; # real processing would go here; 'last' is used for the test
>   }
>
>   When it reaches the "while" part, it keeps processing on and on and never
>  exits, even though I used "last" inside the loop.
>    So I wonder whether it is too slow, or the module is not fit for this job,
>  or I did something wrong?
>   Thank you!


From david at burt7259.freeserve.co.uk  Sat Apr 12 17:01:57 2008
From: david at burt7259.freeserve.co.uk (David Burt)
Date: Sat, 12 Apr 2008 18:01:57 +0100
Subject: [Bioperl-l] bioperl-db
Message-ID: <BFCB174E-B59E-4249-BDF8-4B0F2E2273C9@burt7259.freeserve.co.uk>

Hi Hilmar,

Hope you can help - I am using bioperl-db to create a biosql database

I have used scripts load_seqdatabase.pl and load_ontology.pl to  
install human swissprot entries, gene ontology, sequence ontology and  
now want to load interpro

Here's the command line I have tried

perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser  
root --dbpass chicken --driver mysql \
--namespace "InterPro" --format InterPro interpro.xml

But I get this message

Can't call method "identifier" on an undefined value at  /cygdrive/c/ 
Bioinformatics/Ensembl/src/bioperl-live/Bio/Ontology/ 
SimpleOntologyEngine.pm line 395

Any ideas?

Dave

PS: here's the top of the interpro.xml file

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE interprodb SYSTEM "interpro.dtd">


<interprodb>
     <release>
       <dbinfo dbname="PANTHER" version="6.1" entry_count="30128"  
file_date="04-OCT-2006 00:00:00" />
       <dbinfo dbname="PFAM" version="21.0" entry_count="8957"  
file_date="22-NOV-2006 00:00:00" />
       <dbinfo dbname="PIRSF" version="2.70" entry_count="2877"  
file_date="12-JUN-2007 00:00:00" />
       <dbinfo dbname="PRINTS" version="38.0" entry_count="1900"  
file_date="22-SEP-2005 00:00:00" />
       <dbinfo dbname="PRODOM" version="2005.1" entry_count="1522"  
file_date="23-APR-2004 00:00:00" />
       <dbinfo dbname="PROSITE" version="20.0" entry_count="2006"  
file_date="14-NOV-2006 00:00:00" />
       <dbinfo dbname="SMART" version="5.1" entry_count="724"  
file_date="27-JUL-2007 00:00:00" />
       <dbinfo dbname="TIGRFAMs" version="7.0" entry_count="3423"  
file_date="28-SEP-2007 00:00:00" />
       <dbinfo dbname="GENE3D" version="3.0.0" entry_count="2147"  
file_date="11-SEP-2006 00:00:00" />
       <dbinfo dbname="SSF" version="1.69" entry_count="1538"  
file_date="30-NOV-2006 00:00:00" />
       <dbinfo dbname="SWISSPROT" version="55.1" entry_count="359942"  
file_date="18-MAR-2008 00:00:00" />
       <dbinfo dbname="TREMBL" version="38.1" entry_count="5443281"  
file_date="18-MAR-2008 00:00:00" />
       <dbinfo dbname="INTERPRO" version="17.0" entry_count="16175"  
file_date="19-MAR-2008 00:00:00" />
       <dbinfo dbname="GO" version="N/A" entry_count="23937"  
file_date="27-MAR-2007 00:00:00" />
       <dbinfo dbname="MEROPS" version="7.8" entry_count="2831"  
file_date="12-JUL-2007 16:56:17" />
     </release>
   <interpro id="IPR000001" type="Domain" short_name="Kringle"  
protein_count="352">
     <name>Kringle</name>
     <abstract>

  
From hlapp at gmx.net  Sat Apr 12 18:10:44 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 14:10:44 -0400
Subject: [Bioperl-l] personal vs list email
Message-ID: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>

I'm not sure why but I have received several Bioperl or BioSQL- 
related email inquiries directed to me *personally* over the past few  
weeks.

I have been responding as I get to them, but I feel that I am doing  
both the senders and this community a poor service, because sometimes  
someone else on the list could have responded much faster, and when I  
respond, others on the list who happen to be interested in the same  
question don't get to see the answer.

So from now on as a policy I will redirect *every* email sent to me  
personally that asks a question related to one of the projects to  
the respective mailing list. If you don't want this, please  
conspicuously say so at the top of your email, and in that case, if  
you do ask a project-related question, be prepared to wait and  
possibly to have to follow up.

As an aside, it's a pretty safe assumption to make that all other  
core developers, and quite possibly *all* developers are following a  
similar policy, whether expressly or not.

Isn't this somewhere in the FAQ too?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Sat Apr 12 18:16:13 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 14:16:13 -0400
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
Message-ID: <C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>

Hi Burt,

can you try format interprosax instead of interpro? That variant is  
also much more graceful regarding required space.

	-hilmar

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From cjfields at uiuc.edu  Sat Apr 12 20:17:43 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Sat, 12 Apr 2008 15:17:43 -0500
Subject: [Bioperl-l] [BioSQL-l] personal vs list email
In-Reply-To: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>
References: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>
Message-ID: <E7962E90-8309-4ADA-B002-950793B61D74@uiuc.edu>


On Apr 12, 2008, at 1:10 PM, Hilmar Lapp wrote:

> I'm not sure why but I have received several Bioperl or BioSQL- 
> related email inquiries directed to me *personally* over the past  
> few weeks.
>
> I have been responding as I get to them, but I feel that I am doing  
> both the senders and this community a poor service, because  
> sometimes someone else on the list could have responded much faster,  
> and when I respond, others on the list who happen to be interested  
> in the same question don't get to see the answer.
>
> So from now on as a policy I will redirect *every* email sent to me  
> personally that asks a question related to one of the projects  
> to the respective mailing list. If you don't want this, please  
> conspicuously say so at the top of your email, and in that case, if  
> you do ask a project-related question, be prepared to wait and  
> possibly to have to follow up.
>
> As an aside, it's a pretty safe assumption to make that all other  
> core developers, and quite possibly *all* developers are following a  
> similar policy, whether expressly or not.

I agree; I'm sure several other core devs feel the same way.  I always  
try to forward these to the list if I feel it is more relevant there.

> Isn't this somewhere in the FAQ too?
>
> 	-hilmar

No, but I've added it to the bioperl FAQ; might be worth checking over  
and editing.

chris


From hlapp at gmx.net  Sat Apr 12 22:40:53 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 18:40:53 -0400
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <000001c89ce2$5400a710$0202a8c0@STUDYPC>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce2$5400a710$0202a8c0@STUDYPC>
Message-ID: <3F77F49A-9C9E-4450-AE28-46F00CADBC8B@gmx.net>

Burt - please keep your replies on the list. Others may have input  
too, or benefit from the answer too.

As there is no name() method call on line 914 in the current version  
let's check first that you run a current version of BioPerl. It will  
need to be at least 1.5.2.

However, I do suspect a problem in either the InterPro file itself  
(wouldn't be the first time), or the InterPro parser.

	-hilmar

On Apr 12, 2008, at 5:15 PM, David Burt wrote:

> Hilmar
>
> Many thanks seems to be working
>
> But got this output - any comments/ideas what it means?
>
> Dave
>
>
> perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser  
> root --dbpass chicken --driver mysql \
> > --namespace "InterPro" --format interprosax interpro.xml
>         ...deleting all relationships for InterPro
>         ...parsing and loading InterPro
> Can't call method "name" on an undefined value at load_ontology.pl  
> line 914.
>
> HERE'S the name and definition in the ontology table
>
> Name = InterPro
>
> Definition =
>
> PANTHER version 6.1, 30128 entries, 04-OCT-2006
> PFAM version 21.0, 8957 entries, 22-NOV-2006
> PIRSF version 2.70, 2877 entries, 12-JUN-2007
> PRINTS version 38.0, 1900 entries, 22-SEP-2005
> PRODOM version 2005.1, 1522 entries, 23-APR-2004
> PROSITE version 20.0, 2006 entries, 14-NOV-2006
> SMART version 5.1, 724 entries, 27-JUL-2007
> TIGRFAMs version 7.0, 3423 entries, 28-SEP-2007
> GENE3D version 3.0.0, 2147 entries, 11-SEP-2006
> SSF version 1.69, 1538 entries, 30-NOV-2006
> SWISSPROT version 55.1, 359942 entries, 18-MAR-2008
> TREMBL version 38.1, 5443281 entries, 18-MAR-2008
> INTERPRO version 17.0, 16175 entries, 19-MAR-2008
> GO version N/A, 23937 entries, 27-MAR-2007
> MEROPS version 7.8, 2831 entries, 12-JUL-2007 |

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Sat Apr 12 22:43:25 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Sat, 12 Apr 2008 18:43:25 -0400
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
Message-ID: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>

I'm not sure what you mean by 'Check interpro.xml', but you can use  
the --safe command-line option to keep going if an individual term  
fails to load for whatever reason.

Can you post the data for the seemingly offending record? (and please  
cc the list)

	-hilmar

On Apr 12, 2008, at 5:39 PM, David Burt wrote:

> Hi Hilmar
>
> Just checked mysql database and only have 39 entries under interpro  
> and loaded up to IPR000035
>
> Check interpro.xml looks OK from IPR000036 and onwards
>
> So seems to have crashed at IPR000035 ?
>
> dave
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From Russell.Smithies at agresearch.co.nz  Mon Apr 14 02:51:41 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 14 Apr 2008 14:51:41 +1200
Subject: [Bioperl-l] Tandem Repeats Finder?
In-Reply-To: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC><C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net><000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
	<FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06BEA87E@imail.agresearch.co.nz>

Has anyone tried TRF? 
I notice UCSC is using it for all their simple repeat annotations and thought it might be better than what we're currently using (Sputnik).

And is there a BioPerl parser for its output or am I going to have to write my own?

Thanx,


Russell Smithies 

Bioinformatics Applications Developer 
T +64 3 489 9085 
E  russell.smithies at agresearch.co.nz

Invermay Research Centre
Puddle Alley,
Mosgiel,
New Zealand
T  +64 3 489 3809
F  +64 3 489 9174
www.agresearch.co.nz 


=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From Russell.Smithies at agresearch.co.nz  Mon Apr 14 02:53:46 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 14 Apr 2008 14:53:46 +1200
Subject: [Bioperl-l] Tandem Repeats Finder?
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C03B09DE9@imail.agresearch.co.nz>
References: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
	<D5DBA313349A4B458528BE63B387F36C03B09DE9@imail.agresearch.co.nz>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06BEA881@imail.agresearch.co.nz>

Scratch the need for a parser.
I turned off html output and it's all nice white-space separated text  :-)

Russell
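
Since the plain-text output is whitespace-separated, a few lines of Perl
are enough to read it. The sketch below is only an illustration: the 15
column names are an assumption based on the usual layout of TRF .dat
files and should be checked against the output of your own TRF version,
the sub name is invented, and the sample line is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Column names assumed from the TRF .dat layout; verify against your TRF version.
my @TRF_COLUMNS = qw(
    start end period_size copy_number consensus_size
    percent_matches percent_indels score
    a_percent c_percent g_percent t_percent entropy
    consensus repeat_sequence
);

# Turn one whitespace-separated data line into a hash reference.
# Returns undef (in scalar context) for headers, blank lines and anything
# that does not start with two integer coordinates.
sub parse_trf_line {
    my ($line) = @_;
    my @fields = split ' ', $line;
    return unless @fields >= @TRF_COLUMNS;
    return unless $fields[0] =~ /^\d+$/ && $fields[1] =~ /^\d+$/;
    my %repeat;
    @repeat{@TRF_COLUMNS} = @fields[0 .. $#TRF_COLUMNS];
    return \%repeat;
}

# Made-up example line, for illustration only.
my $example = '1201 1240 4 10.0 4 100 0 80 25 25 25 25 2.00 ACGT '
            . ('ACGT' x 10);
my $repeat = parse_trf_line($example);
printf "%d-%d period=%d copies=%s consensus=%s\n",
    @{$repeat}{qw(start end period_size copy_number consensus)};
```

Lines that do not look like repeat records are skipped rather than
treated as errors, so the same sub can be run over a whole file that
mixes header text and data.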




From csaba.ortutay at gmail.com  Mon Apr 14 04:15:22 2008
From: csaba.ortutay at gmail.com (Ortutay Csaba =?iso-8859-1?q?P=E9ter?=)
Date: Mon, 14 Apr 2008 07:15:22 +0300
Subject: [Bioperl-l] Tandem Repeats Finder?
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06BEA87E@imail.agresearch.co.nz>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
	<D5DBA313349A4B458528BE63B387F36C06BEA87E@imail.agresearch.co.nz>
Message-ID: <200804140715.22702.csaba.ortutay@gmail.com>

Hello, I have used TRF in my earlier projects. It is a nice, quick tool.

There were no ready-made parsers at the time (5-6 years ago), so we
wrote our own.

Csaba

> Has anyone tried TRF?
> I notice UCSC is using it for all their simple repeat annotations and
> thought it might be better than what we're currently using (Sputnik)
>
> And is there a BioPerl parser for its output or am I going to have to
> write my own?
>
> Thanx,


-- 
Csaba Ortutay PhD
IMT Bioinformatics
University of Tampere
Finland


From avilella at gmail.com  Mon Apr 14 11:13:26 2008
From: avilella at gmail.com (Albert Vilella)
Date: Mon, 14 Apr 2008 12:13:26 +0100
Subject: [Bioperl-l] how can I print a Bio::Tree newick sortby given list?
Message-ID: <358f4d650804140413x4271f18bx40af1b9054306df8@mail.gmail.com>

Hi,

I have a newick file that I want to sort by a given order and print again as
newick.
For example, if I have

(((ENSPTRG00000013811:0.0011,ENSG00000142192:0.0021):0.0033,ENSPPYG00000003902:0.0326):0.0000,ENSMMUG00000014384:0.0366):0.3638;

I want to sort it by "ENSG:ENSPTRG:ENSPPYG:ENSMMUG".

Any suggestions on how to do this in bioperl?

Cheers,

    Albert.
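
One possible approach, sketched here in plain Perl rather than with
Bio::TreeIO: parse the Newick string into nested hashes, reorder the
children of every node by the best-ranked leaf beneath them, and print
the tree again. This is an illustration only; the sub names are
invented, the order list is treated as a list of name prefixes, and the
tiny parser does not handle quoted labels or comments.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one node: an optional "(child,child,...)" group followed by an
# optional "name:length" label, which is kept verbatim.
sub parse_node {
    my ($tokens) = @_;
    my $node = { label => '', children => [] };
    if (@$tokens && $tokens->[0] eq '(') {
        shift @$tokens;                          # consume '('
        while (1) {
            push @{ $node->{children} }, parse_node($tokens);
            my $sep = shift @$tokens;
            die "Malformed Newick string\n" unless defined $sep;
            last if $sep eq ')';
            die "Malformed Newick string\n" unless $sep eq ',';
        }
    }
    if (@$tokens && $tokens->[0] !~ /^[(),;]$/) {
        $node->{label} = shift @$tokens;
    }
    return $node;
}

sub parse_newick {
    my ($text) = @_;
    my @tokens = $text =~ /([(),;]|[^(),;\s]+)/g;
    return parse_node(\@tokens);
}

# Sort the children of every node in place. A leaf's rank is the index of
# the first prefix in @$order that its label starts with; an internal
# node's rank is the lowest rank found among its children.
sub sort_tree {
    my ($node, $order) = @_;
    my @kids = @{ $node->{children} };
    if (!@kids) {
        for my $i (0 .. $#$order) {
            return $i if index($node->{label}, $order->[$i]) == 0;
        }
        return scalar @$order;                   # unknown names sort last
    }
    my %rank;
    $rank{$_} = sort_tree($_, $order) for @kids;
    $node->{children} = [ sort { $rank{$a} <=> $rank{$b} } @kids ];
    my ($best) = sort { $a <=> $b } values %rank;
    return $best;
}

sub to_newick {
    my ($node) = @_;
    my @kids  = @{ $node->{children} };
    my $inner = @kids ? '(' . join(',', map { to_newick($_) } @kids) . ')' : '';
    return $inner . $node->{label};
}

my $newick = '(((ENSPTRG00000013811:0.0011,ENSG00000142192:0.0021):0.0033,'
           . 'ENSPPYG00000003902:0.0326):0.0000,ENSMMUG00000014384:0.0366):0.3638;';
my $tree = parse_newick($newick);
sort_tree($tree, [qw(ENSG ENSPTRG ENSPPYG ENSMMUG)]);
print to_newick($tree), ";\n";
```

For the tree above this swaps the two innermost leaves, so that
ENSG00000142192 comes before ENSPTRG00000013811; the other clades are
already in the requested order. Only the order of siblings changes, so
the topology and branch lengths are untouched.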


From lamq at usal.es  Mon Apr 14 15:01:51 2008
From: lamq at usal.es (Luis A. M. Quintales)
Date: Mon, 14 Apr 2008 17:01:51 +0200
Subject: [Bioperl-l] xyplot glyph: scale problems
Message-ID: <480371DF.7040900@usal.es>

I have a problem with the scale numbers that the xyplot glyph calculates.

The shape of the graph looks fine, but the scale number 10 and its
position in the output are not correct.

I am sending the source code, a simplified input file and the PNG output.

Thank you


Source code

ex1.pl  (also in http://avellano.usal.es/~luis/bioperl-l/ex1.pl)
============================
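#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::GFF;
use Bio::Graphics::Panel;

# GFF file to plot, e.g. in1.gff
my $filin = $ARGV[0] or die "Usage: $0 in.gff\n";

# Load the GFF file into the in-memory adaptor; the aggregator makes the
# atpc:atfreq features available as 'at'.
my $db = Bio::DB::GFF->new( -dsn        => $filin,
                            -adaptor    => 'memory',
                            -aggregator => 'at{atpc:atfreq}' );
my $segment  = $db->segment('chr1');
my @features = $segment->features('at');

my $panel = Bio::Graphics::Panel->new(
       -offset   => 0,   -grid      => 100,
       -length   => 500, -width     => 800,
       -pad_left => 50,  -pad_right => 50 );

# Track for the segment itself
$panel->add_track($segment, -glyph   => 'generic',
                            -bgcolor => 'blue', -label => 1);

# The xyplot track whose left-hand scale is in question
$panel->add_track(\@features,
                    -glyph      => 'xyplot',
                    -graph_type => 'boxes',
                    -scale      => 'left',
                    -height     => 200,
 );

# The listing as posted stops after the open(); writing the PNG is the
# presumed remainder.
open(my $fh, '>', 'sal.png') or die "Cannot write sal.png: $!";
binmode $fh;
print {$fh} $panel->png;
close $fh;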
============================

in1.gff file (also in http://avellano.usal.es/~luis/bioperl-l/in1.gff)
============================
##sequence-region chr1 1 5578650
chr1    atfreq    atpc    1    10       64.0000    .    .    atpc 1
chr1    atfreq    atpc    11    20       63.0000    .    .    atpc 1
chr1    atfreq    atpc    21    30       62.0000    .    .    atpc 1
chr1    atfreq    atpc    31    40       59.0000    .    .    atpc 1
chr1    atfreq    atpc    41    50       59.0000    .    .    atpc 1
chr1    atfreq    atpc    51    60       59.0000    .    .    atpc 1
chr1    atfreq    atpc    61    70       59.0000    .    .    atpc 1
chr1    atfreq    atpc    71    80       59.0000    .    .    atpc 1
chr1    atfreq    atpc    81    90       61.0000    .    .    atpc 1
chr1    atfreq    atpc    91    100       60.0000    .    .    atpc 1
chr1    atfreq    atpc    101    110       60.0000    .    .    atpc 1
chr1    atfreq    atpc    111    120       64.0000    .    .    atpc 1
chr1    atfreq    atpc    121    130       64.0000    .    .    atpc 1
chr1    atfreq    atpc    131    140       60.0000    .    .    atpc 1
chr1    atfreq    atpc    141    150       60.0000    .    .    atpc 1
chr1    atfreq    atpc    151    160       63.0000    .    .    atpc 1
chr1    atfreq    atpc    161    170       62.0000    .    .    atpc 1
chr1    atfreq    atpc    171    180       59.0000    .    .    atpc 1
chr1    atfreq    atpc    181    190       54.0000    .    .    atpc 1
chr1    atfreq    atpc    191    200       53.0000    .    .    atpc 1
chr1    atfreq    atpc    201    210       54.0000    .    .    atpc 1
chr1    atfreq    atpc    211    220       50.0000    .    .    atpc 1
chr1    atfreq    atpc    221    230       51.0000    .    .    atpc 1
chr1    atfreq    atpc    231    240       56.0000    .    .    atpc 1
chr1    atfreq    atpc    241    250       58.0000    .    .    atpc 1
chr1    atfreq    atpc    251    260       55.0000    .    .    atpc 1
chr1    atfreq    atpc    261    270       54.0000    .    .    atpc 1
chr1    atfreq    atpc    271    280       56.0000    .    .    atpc 1
chr1    atfreq    atpc    281    290       59.0000    .    .    atpc 1
chr1    atfreq    atpc    291    300       58.0000    .    .    atpc 1
chr1    atfreq    atpc    301    310       60.0000    .    .    atpc 1
chr1    atfreq    atpc    311    320       59.0000    .    .    atpc 1
chr1    atfreq    atpc    321    330       59.0000    .    .    atpc 1
chr1    atfreq    atpc    331    340       57.0000    .    .    atpc 1
chr1    atfreq    atpc    341    350       56.0000    .    .    atpc 1
chr1    atfreq    atpc    351    360       57.0000    .    .    atpc 1
chr1    atfreq    atpc    361    370       57.0000    .    .    atpc 1
chr1    atfreq    atpc    371    380       58.0000    .    .    atpc 1
chr1    atfreq    atpc    381    390       56.0000    .    .    atpc 1
chr1    atfreq    atpc    391    400       58.0000    .    .    atpc 1
chr1    atfreq    atpc    401    410       56.0000    .    .    atpc 1
chr1    atfreq    atpc    411    420       59.0000    .    .    atpc 1
chr1    atfreq    atpc    421    430       58.0000    .    .    atpc 1
chr1    atfreq    atpc    431    440       59.0000    .    .    atpc 1
chr1    atfreq    atpc    441    450       58.0000    .    .    atpc 1
chr1    atfreq    atpc    451    460       58.0000    .    .    atpc 1
chr1    atfreq    atpc    461    470       56.0000    .    .    atpc 1
chr1    atfreq    atpc    471    480       57.0000    .    .    atpc 1
chr1    atfreq    atpc    481    490       59.0000    .    .    atpc 1
============================


The sal.png :
http://avellano.usal.es/~luis/bioperl-l/sal.png

Thank you.


-- 
==================================================
 Luis Antonio Miguel Quintales
 Departamento de Informática y Automática
 Facultad de Ciencias
 Universidad de Salamanca
 Plaza de la Merced s/n
 37008-SALAMANCA
 SPAIN
==================================================
 Tel.: +34-923-294400(ext.1513)
 Fax.: +34-923-294584
 E-mail: lamq at usal.es
==================================================


From aaron.j.mackey at gsk.com  Mon Apr 14 13:00:52 2008
From: aaron.j.mackey at gsk.com (aaron.j.mackey at gsk.com)
Date: Mon, 14 Apr 2008 09:00:52 -0400
Subject: [Bioperl-l] personal vs list email
In-Reply-To: <E5D49A1A-24F0-4224-9980-30F418EED978@gmx.net>
Message-ID: <OF3ED0BD19.1CBA005A-ON8525742B.00473A95-8525742B.00477DEC@gsk.com>

I try to take it even one step further: I require the person to re-ask 
their question on the mailing list (and then try to answer it there). This 
has the added benefit of causing the person to pause a moment to reflect 
on their question, and (sometimes) to spend a bit more time preparing the 
question for broader public consumption.

-Aaron


From sutripa at vbi.vt.edu  Mon Apr 14 16:54:47 2008
From: sutripa at vbi.vt.edu (Sucheta Tripathy)
Date: Mon, 14 Apr 2008 12:54:47 -0400 (EDT)
Subject: [Bioperl-l] Error installing XML::Parser
Message-ID: <1285.99.152.150.87.1208192087.squirrel@webmail.vbi.vt.edu>


Hello List,

I have recently installed bioperl using the following command. The
installation was successful. Now I am trying to install XML::Parser but it
returns with  error messages. Any clue what I may be doing wrong?

Thanks

Sucheta

Following is the last part of the error message:

### Error Message #######

Expat.c: In function 'XS_XML__Parser__Expat_SkipUntil':
Expat.c:2664: error: 'XML_Parser' undeclared (first use in this function)
Expat.c:2664: error: expected ';' before 'parser'
Expat.c:2665: warning: ISO C90 forbids mixed declarations and code
Expat.xs:2179: error: 'parser' undeclared (first use in this function)
Expat.xs:2179: warning: cast to pointer from integer of different size
Expat.xs:2180: error: 'CallbackVector' has no member named 'st_serial'
Expat.xs:2182: error: 'CallbackVector' has no member named 'skip_until'
Expat.c: In function 'XS_XML__Parser__Expat_Do_External_Parse':
Expat.c:2687: error: 'XML_Parser' undeclared (first use in this function)
Expat.c:2687: error: expected ';' before 'parser'
Expat.c:2688: warning: ISO C90 forbids mixed declarations and code
Expat.xs:2194: error: 'parser' undeclared (first use in this function)
Expat.xs:2194: warning: cast to pointer from integer of different size
Expat.xs:2205: warning: unused variable 'pret'
Expat.xs:2194: warning: unused variable 'cbv'
Expat.xs:2192: warning: unused variable 'type'
make[1]: *** [Expat.o] Error 1
make[1]: Leaving directory `/root/.cpan/build/XML-Parser-2.36/Expat'
make: *** [subdirs] Error 2
  /usr/bin/make  -- NOT OK
Running make test
  Can't test without successful make
Running make install
  make had returned bad status, install seems impossible

#####

-- 
Sucheta Tripathy, Ph.D.
Virginia Bioinformatics Institute Phase-I
Washington street.
Virginia Tech.
Blacksburg,VA 24061-0447
phone:(540)231-8138
Fax:  (540) 231-2606


From mmokrejs at ribosome.natur.cuni.cz  Tue Apr 15 10:45:48 2008
From: mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGluIE1PS1JFSsWg?=)
Date: Tue, 15 Apr 2008 12:45:48 +0200
Subject: [Bioperl-l] GenBank entries creation dates
In-Reply-To: <CA410982-12F9-4289-8B54-87BE33A38085@uiuc.edu>
References: <fb5dae380804011914u69e94bbbrca469d24a094157d@mail.gmail.com>	<264855a00804020433i2a260561x883189cc4fa0c58f@mail.gmail.com>	<47F9F3AA.2090003@uv.es>
	<200804071448.34769.heikki@sanbi.ac.za>	<2BA9950D-F106-4420-B128-A2AE2F46A020@uiuc.edu>	<47FA4AD2.5030206@uv.es>
	<CA410982-12F9-4289-8B54-87BE33A38085@uiuc.edu>
Message-ID: <4804875C.80506@ribosome.natur.cuni.cz>

Chris Fields wrote:
> Note in the example I gave that, during the revision history, the 
> DBSOURCE changed at the point of the creation date (the original nuc.
>  record was a M. tuberculosis contig sequence, which later changed to
> an updated full M. tuberculosis genome record at the time of the
> 'create date').
> 
> Couldn't find anything specific in the GenBank docs on this, but it 
> appears (at least for a protein record) the creation date reflects
> the date in which the sequence was either originally deposited or
> originally derived from the nucleotide source record present in the
> record.  In other words, it may not reflect the original date of
> deposition (which could have come from a different record, as in this
> case).
> 
> chris

Hi,
I have a few past answers from NCBI staff to similar questions of mine
about DATE issues and VERSION numbers not being increased upon
"changes" in a record.
Below I have tried to put that correspondence into a more readable form.
I hope this helps everybody understand what happens inside the black box. ;)
Martin


Date: Thu, 17 Jan 2002 15:40:07 -0500 (EST)
From: David Wheeler
Subject: Brucella_melitensis on ftp site

> Hi, I'd like to point you to the fact, that the descriptions of 
> Brucella_melitensis differ in 
> ftp.ncbi.nih.nlm.gov/genomes/Bacteria/Brucella_melitensis and 
> ftp.ncbi.nih.nlm.gov/genbank/genomes/Bacteria/Brucella_melitensis
> 
> Namely, the description of the strain is retained in *.gbk files
> under /genomes/Bacteria/Brucella_melitensis only under the strain
> description field, but not in the DEFINITION line, where it is
> present in *.gbk files under
> /genbank/genomes/Bacteria/Brucella_melitensis.
> 
> LOCUS       NC_003318 1177787 bp    DNA   circular  BCT       13-NOV-2001
> DEFINITION  Brucella melitensis chromosome II, complete sequence.
> ACCESSION   NC_003318
> VERSION     NC_003318.1  GI:17988344
> 
> compared to
> 
> LOCUS       AE008918  1177787 bp    DNA   circular  BCT       27-DEC-2001
> DEFINITION  Brucella melitensis strain 16M chromosome II, complete sequence.
> ACCESSION   AE008918
> VERSION     AE008918
> 
> This makes me worried about the data. Why is the release date of
> NON-curated files (AE008918) newer than the release date of CURATED
> data (NC_003318)? Is this expected? Could someone explain to me the
> difference between them (i.e. CURATED vs. NONCURATED)?

The curated record is initially a copy of the non-curated record with certain 
changes in documentation made in order to comply with the NCBI standard for 
reference genomes. One change which you have noticed is the difference in 
Definition line format.  Curated genomic records are created in order to 
standardize annotation for genomes in the Entrez Genomes database while leaving 
editorial control for the parent GenBank records in the hands of the original 
submitters.

Regardless of the date you see on the record, the curated version is derived from 
the non-curated one.  In this case, it appears that the processing of the 
non-curated version lagged a little bit relative to that of the curated version. 
Normally, however, the non-curated version will have the earlier date.


Date: Sun, 27 Jan 2002 00:16:55 -0500 (EST)
From: David Wheeler
Subject: Re: CONSULT: Brucella_melitensis on ftp site

> Are the raw sequence data always same in non-curated and curated 
> flatfiles?
> 
> Is the annotation of orf's/proteins different between them?
> 
> Are there any new or withdrawn orf's or proteins in the curated
> flatfiles compared to non-curated ones?
> 
> My feeling is that no-one except original submitters can modify
> submitted data, so you cannot modify non-curated files, i.e. cannot
> modify them and increase the version number.
> 
> Because of that, you've introduced curated versions, which are just
> copies of original but public data so you are free to modify it. So
> once again, are the differences between non-curated and curated
> flatfiles only in structure of the file? I don't think so. Examples
> would be Listeria genomes or the 2 Agrobacterium's, if I remember
> right.

Initially, there should be no or very few differences, however, as time
goes by, differences in the annotation will materialize.  There may also
be differences in the sequence, if errors in the original sequence come to
light, but these differences should be very rare.

So, practically speaking, you will probably find few differences but,
since the purpose of the Refseq is to curate, there may well be some
differences.


Date: Mon, 17 Dec 2001 11:57:06 -0500 (EST)
From: Dawn Lipshultz
Subject: Re: Buggy date in Staphylococcus aureus N315

>>>> Hi, I've found that Staphylococcus aureus N315 is listed as released
>>>> on 01-JAN-1900, which is nonsense. I guess you had a Y2K bug.
>>>> 
>>>> Please see
>>>> 
>>>> ftp://ncbi.nlm.nih.gov/genbank/genomes/Bacteria/Staphylococcus_aureus_N315/BA000018.gbk
>>>> 
>>>> Can you please tell me the real release date?
>>>> 
>>>> Also, which is newer: the NC_xxxx record for Staphylococcus aureus N315 under
>>>> ftp://ncbi.nlm.nih.gov/genomes/Bacteria/Staphylococcus_aureus_N315/
>>>> or this BA000018 non-curated version?
>>>> 
>>>> LOCUS       BA000018  2814816 bp    DNA   circular  BCT       01-JAN-1900
>>>> DEFINITION  Staphylococcus aureus strain N315, complete genome.

>> I cannot find the record to which you refer in your message. When I
>>  did a search for accession number BA000018, I received results for
>>  accession numbers AP003129-AP003138. They are all dated June 2001.
>> 
>> 
>> The date for the record in the ftp file is April 2001. The record
>> in GenBank (NC_002745) is dated October 2001. This version is
>> apparently more updated than the one on the ftp site. Therefore,
>> you may want to download the sequence from GenBank rather than the
>> ftp site. Regards, Dawn S. Lipshultz

> 
> Hmm, but I do get: 
> http://www.ncbi.nlm.nih.gov:80/cgi-bin/Entrez/framik?db=genome&gi=179
> 
> 
> look at the "GenBank: NC_002745" text in left upper part of the
> window, it points to that OLD ftp file. The "RefSeq: NC_002745"
> points to the April 2001 version. So what is the right way to get the
> October 2001 release?
> 
> Where can I find the difference between NC_002745 from April compared
>  to NC_002745 from October?
> 
> What do you mean by "you may want to download the sequence from
> GenBank rather than the ftp site"?
> 
> BOTH ftp directories at ftp://ncbi.nlm.nih.gov are outdated. I mean 
> the genomes/Bacteria/Staphylococcus_aureus_N315/NC_002745.* version 
> and also the 
> genbank/genomes/Bacteria/Staphylococcus_aureus_N315/BA000018.* 
> version.
> 
> The web links from www.ncbi.nlm.nih.gov:80/cgi-bin/Entrez/ point
> to the ftp site anyway. Are you saying that the ftp versions
> aren't updated anymore?

The genome was originally released into the database on 4/20/2001
as 10 pieces with secondary accession number BA000018.  You can 
find these pieces in Entrez nucleotides by querying with BA000018.

The Genomes group here will fix the date on the record that is available
from Entrez genomes.

Regards,
Dawn


Date: Fri, 16 Nov 2001 16:09:59 -0500 (EST)
From: Susan Dombrowski
Subject: Re: Agrobacterium tumefaciens C58

> Dear colleague, I've noticed that the genomic flatfiles of
> Agrobacterium tumefaciens C58 at
> ftp://ncbi.nlm.nih.gov/genbank/genomes/Bacteria/Agrobacterium_tumefaciens/
> were updated on Oct 17. However, AE007869.gbs, for example, does NOT
> explain what has been changed, and the VERSION number has not been
> increased. Would you please explain what the change is, and where I
> can find such information on the web next time?
> 
> I used the sequence published on your ftp site on 2001-08-29 under
> the same ID and would like to know what differs.
> 
> LOCUS       AE007869  2841581 bp    DNA   circular  CON       17-OCT-2001
> DEFINITION  Agrobacterium tumefaciens strain C58 circular chromosome,
>             complete sequence.
> ACCESSION   AE007869
> VERSION     AE007869

Dear Colleague,
The version number of a sequence will *only* change if the content of the actual 
sequence has changed in any way since it was first made available. Although the 
date has changed, this date refers to the last time the actual record was 
manipulated by an NCBI staff member. Even if there is something simple, like 
adding a reference, changing a spelling mistake, etc., this will cause a change 
in the date field of the record. 

Thus, since the version has not changed, there are no differences to report.
Best Regards,
Susan
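
The rule described above (VERSION changes only with the sequence, while
the date changes with any edit) can be checked mechanically. A minimal
sketch, assuming the date is the last field of the LOCUS line and that
the VERSION line is present; the sub names are hypothetical, and the two
headers are the NC_002678 examples quoted elsewhere in this thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pull the modification date (end of the LOCUS line) and the VERSION
# string out of the header of a GenBank flat file.
sub header_info {
    my ($text) = @_;
    my ($date)    = $text =~ /^LOCUS\s.*?(\d{2}-[A-Z]{3}-\d{4})[ \t]*$/m;
    my ($version) = $text =~ /^VERSION\s+(\S+)/m;
    return { date => $date // '', version => $version // '' };
}

# Compare two copies of the same record.
sub classify_change {
    my ($old, $new) = map { header_info($_) } @_;
    return 'sequence changed'
        if $old->{version} ne $new->{version};
    return 'record edited, sequence unchanged'
        if $old->{date} ne $new->{date};
    return 'unchanged';
}

my $march = <<'END';
LOCUS       NC_002678 7036074 bp    DNA   circular  BCT       28-MAR-2001
DEFINITION  Mesorhizobium loti, complete genome.
ACCESSION   NC_002678
VERSION     NC_002678.1  GI:13470324
END

my $september = <<'END';
LOCUS       NC_002678            7036074 bp    DNA     circular BCT 10-SEP-2001
DEFINITION  Mesorhizobium loti, complete genome.
ACCESSION   NC_002678
VERSION     NC_002678.1  GI:13470324
END

print classify_change($march, $september), "\n";
```

For these two headers the VERSION strings are identical and only the
dates differ, so the record is reported as edited with the sequence
unchanged.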


Date: Wed, 26 Jun 2002 11:04:48 -0400 (EDT)
From: Eric Sayers
Subject: Re: Mesorhizobium_loti flatfiles

>>>>> Hi,
>>>>>   I've found that some time ago you again silently changed flatfiles on your
>>>>> ftp site without changing the revision number. Please excuse me,
>>>>> but this really causes trouble for other people working in this so-called
>>>>> bioinformatics. :(
>>>>> 
>>>>> A week ago there was:
>>>>> 
>>>>> LOCUS       NC_002678            7036074 bp    DNA     circular BCT 10-SEP-2001
>>>>> DEFINITION  Mesorhizobium loti, complete genome.
>>>>> ACCESSION   NC_002678
>>>>> VERSION     NC_002678.1  GI:13470324
>>>>> 
>>>>> 
>>>>> and two other plasmid sequences. This yields 7275 proteins.
>>>>> 
>>>>> But, last autumn there was:
>>>>> 
>>>>> LOCUS       NC_002678 7036074 bp    DNA   circular  BCT       28-MAR-2001
>>>>> DEFINITION  Mesorhizobium loti, complete genome.
>>>>> ACCESSION   NC_002678
>>>>> VERSION     NC_002678.1  GI:13470324
>>>>> 
>>>>> 
>>>>> That version had 7281 proteins in total.
>>>>> I have a simple question: why was the VERSION number NOT changed?
>>>>>
>>>>> Am I wrong to think that it should be updated whenever a single
>>>>> character in the file contents is changed?
>>> 
>>>> The version number of a sequence only changes if the sequence itself is
>>>> modified. If anything else in the flat file is changed (ie spelling, authors,
>>>> annotations, etc) the version will not change. However, the modification date in
>>> 
>>> Sorry, by annotation do you also mean the number of predicted genes, their
>>> coordinates (positions), etc.?
>>> 
>>>> the top line of the flat file will change for any of these modifications. (Note
>>>> that the dates are different in the file you display: Mar 28, 2001 vs Sept 10,
>>>> 2001.) I would track the modification date rather than or as well as the version
>>>> number to catch all changes in the files.
>>>> Regards,
>>>> Eric W. Sayers, Ph.D.
>>> 
>>> OK, but unless some of our programs were buggy before or are buggy now
>>> (and in either case failed to extract genes from the flatfiles), I do
>>> not have an explanation for the differences in the number of
>>> predicted/annotated genes.
>>> 
>>> I no longer have the old flatfiles from Mar 28 available, but it
>>> seems to me that these were newly introduced in the Sept. 10 version:
>>> gi_15600768, gi_15600770, gi_15600769, gi_15600766, gi_15600767
>> 
>> Dear Colleague,
>> Again, the only reason the version number will change is if the sequence itself 
>> changes. The number of annotated/predicted genes is merely an annotation on the 
>> sequence, and does not change the sequence itself. Therefore, the version will 
>> not change when the number of annotations changes. The modification date on the 
>> flat file will (and did) change, of course.
>> 
>> Regards,
>> Eric W. Sayers, Ph.D.
> 
> Finally I've heard that from someone, thanks!
> Now just tell me, how can I figure out what changed between those
> different "date" releases? Is there a changelog available?
> I consider annotation changes very important.

We do not provide the details of flat file changes on our public websites, 
except for changes in the version number (ie actual sequence changes). In that 
particular case, all of the previous versions are linked to the current one. My 
advice to you if you want to chronicle non-sequence changes would be to check 
the flat files of interest periodically (by a script, for example) and look for 
changes in the modification dates. You could then simply compare the before and 
after flat files.
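
The periodic check described above could be sketched roughly as follows. This is a minimal sketch in core Perl, not an NCBI tool: the helper names `locus_date_and_version` and `compare_records` are made up for illustration, and the September record is simply the March header with the date swapped, not the real file. It reads the modification date from the LOCUS line and the VERSION string, then reports whether the sequence itself or only the rest of the flat file changed.

```perl
use strict;
use warnings;

# Extract the modification date (last field of the LOCUS line) and the
# VERSION string from the text of a GenBank flat file.
sub locus_date_and_version {
    my ($text) = @_;
    my ($date)    = $text =~ /^LOCUS\s.*?(\d{2}-[A-Z]{3}-\d{4})\s*$/m;
    my ($version) = $text =~ /^VERSION\s+(\S+)/m;
    return ($date, $version);
}

# Classify the difference between an old and a new copy of a record.
sub compare_records {
    my ($old, $new) = @_;
    my ($old_date, $old_ver) = locus_date_and_version($old);
    my ($new_date, $new_ver) = locus_date_and_version($new);
    return 'sequence changed'   if $old_ver  ne $new_ver;
    return 'annotation changed' if $old_date ne $new_date;
    return 'unchanged';
}

my $march = <<'END';
LOCUS       NC_002678 7036074 bp    DNA   circular  BCT       28-MAR-2001
DEFINITION  Mesorhizobium loti, complete genome.
ACCESSION   NC_002678
VERSION     NC_002678.1  GI:13470324
END

# Same header with only the date changed (illustrative, not the real file).
(my $september = $march) =~ s/28-MAR-2001/10-SEP-2001/;

print compare_records($march, $september), "\n";   # annotation changed
```

When the result is "annotation changed", the two saved flat files can then be compared with an ordinary diff tool to see which features were added or moved.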

Regards,
Eric W. Sayers, Ph.D.


> Hi, Miguel:
> 
> id1_fetch can do it. Detailed instruction can be found at:  
> 
> http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=toolkit.section.ch_demo.id1_fetch.html
> 
> Here is an example:
> 
>> >id1_fetch -lt revisions -flat '12:74311105' -fmt fasta
> GI        Loaded      DB    Retrieval No.
> --        ------      --    -------------
> 74311105  12/07/2007  NCBI  19766263
> 74311105  01/23/2007  NCBI  16325656
> 74311105  03/30/2006  NCBI  13131204
> 74311105  03/03/2006  NCBI  12915541
> 74311105  03/02/2006  NCBI  12885275
> 74311105  12/03/2005  NCBI  12259793
> 74311105  09/09/2005  NCBI  11257262
> 74311105  09/09/2005  NCBI  11242667
> 
> Wenwu Cui PhD


From david at burt7259.freeserve.co.uk  Sun Apr 13 14:32:31 2008
From: david at burt7259.freeserve.co.uk (David Burt)
Date: Sun, 13 Apr 2008 15:32:31 +0100
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <3F77F49A-9C9E-4450-AE28-46F00CADBC8B@gmx.net>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce2$5400a710$0202a8c0@STUDYPC>
	<3F77F49A-9C9E-4450-AE28-46F00CADBC8B@gmx.net>
Message-ID: <000001c89d73$3b49eec0$0202a8c0@STUDYPC>

Hi Hilmar,

Many thanks for the info. I tried a few things:

1. First I tried the --safe flag:

perl load_ontology.pl --host 127.0.0.1 --dbname bioseqdb --dbuser root \
  --dbpass chicken --driver mysql --safe \
  --namespace "InterPro" --format interprosax interpro.xml

I still got the same output as before:

        ...deleting all relationships for InterPro
        ...parsing and loading InterPro

Can't call method "name" on an undefined value at load_ontology.pl line 914

Only 35 InterPro entries were entered into the database.

2. I am using bioperl 1.5.2.

3. I downloaded Release 17.0 (20 March 2008) of the interpro.xml file from
ftp://ftp.ebi.ac.uk/pub/databases/interpro/

I did not send this file, since it was ~10 MB gzipped.

Dave

 
From david at burt7259.freeserve.co.uk  Sun Apr 13 14:53:43 2008
From: david at burt7259.freeserve.co.uk (David Burt)
Date: Sun, 13 Apr 2008 15:53:43 +0100
Subject: [Bioperl-l] bioperl-db
In-Reply-To: <FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
References: <000001c89cbe$f2b92b80$0202a8c0@STUDYPC>
	<C20F53B0-7519-4ED4-993F-4E5B5B39AA73@gmx.net>
	<000001c89ce5$a5df2e50$0202a8c0@STUDYPC>
	<FA154FA7-76E2-47D5-8F4A-5BA55E566F0B@gmx.net>
Message-ID: <000001c89d76$319be060$0202a8c0@STUDYPC>

Hilmar,

I also updated my copy of bioperl; see the output below.

root@STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src
$ perl -MBio::Perl -le 'print Bio::Perl->VERSION;'
1.005002101

root@STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src
$ cvs -d :pserver:cvs@cvs.bioperl.org:/home/repository/bioperl login
Logging in to :pserver:cvs@cvs.bioperl.org:2401/home/repository/bioperl
CVS password:

root@STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src
$ cd bioperl-live

root@STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src/bioperl-live
$ cvs -q update -d -P -r bioperl-release-1-5-2
P Build.PL
P ModuleBuildBioperl.pm
P Bio/Root/Version.pm
cvs update: warning: t/data/taxdump/names.dmp was lost
U t/data/taxdump/names.dmp
cvs update: warning: t/data/taxdump/nodes.dmp was lost
U t/data/taxdump/nodes.dmp

root@STUDY_PC /cygdrive/c/Bioinformatics/Ensembl/src/bioperl-live
$ perl -MBio::Perl -le 'print Bio::Perl->VERSION;'
1.0050021

Why is the VERSION 1.0050021 rather than 1.5.2?

Dave
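
One way to read the two numbers printed above is with Perl's core `version` module, which treats the digits after the decimal point as groups of three. The sketch below only shows how Perl interprets such decimal versions. That BioPerl intended exactly this mapping (1.5.2 plus a sub-release counter) is an assumption, and `dotted` is a helper name invented for the example.

```perl
use strict;
use warnings;
use version;

# Convert a decimal Perl version number into its dotted form.
# The argument is passed as a string to avoid floating-point rounding.
sub dotted {
    my ($decimal) = @_;
    return version->parse($decimal)->normal;
}

print dotted('1.0050021'), "\n";     # v1.5.2.100
print dotted('1.005002101'), "\n";   # v1.5.2.101
```

Read this way, both values are 1.5.2 followed by a fourth component (100 for the release tag, 101 for the newer checkout).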


From heikki at sanbi.ac.za  Wed Apr 16 11:36:16 2008
From: heikki at sanbi.ac.za (Heikki Lehvaslaiho)
Date: Wed, 16 Apr 2008 13:36:16 +0200
Subject: [Bioperl-l] bioperl-microarray: status?
In-Reply-To: <AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
Message-ID: <200804161336.16879.heikki@sanbi.ac.za>

FYI,

Christopher Jones has just published 
[http://bioinformatics.oxfordjournals.org/cgi/content/short/24/8/1102 an 
article in Bioinformatics] about his 
[http://search.cpan.org/perldoc?Microarray Microarray perl module] in CPAN.

(This text has been added to the BioPerl wiki.)

	-Heikki


On Friday 26 January 2007 16:05:01 Chris Fields wrote:
> Don't know if it's worth it, but could the microarray package be
> modified so that it deals with data generated from or interacts
> directly with Bioconductor (i.e. maybe including some specialized
> bioperl-run set of classes to run Bioconductor tasks, return
> lightweight bioperl microarray classes)?  Allen pointed out in a
> previous post that Bioconductor is the best pick for certain tasks,
> while Perl excels at others:
>
> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>
> Might be nice if we could merge both strengths together in some way.
>
> chris
>
> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
> > On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
> >> Eh, there is some discussion activity on the list, but not much.  You
> >> are really better off moving to Bioconductor.
> >
> > Ok, thanks. I added that to the wiki page:
> >
> >     http://www.bioperl.org/wiki/Microarray_package
> >
> > j
> > seqlab.net
> > http://www.bioperl.org/wiki/User:Jhannah
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


-- 
______ _/      _/_____________________________________________________
      _/      _/
     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
   _/  _/  _/  SANBI, South African National Bioinformatics Institute
  _/  _/  _/  University of Western Cape, South Africa
     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
___ _/_/_/_/_/________________________________________________________


From pan.mueller at yahoo.de  Wed Apr 16 12:34:51 2008
From: pan.mueller at yahoo.de (=?iso-8859-1?Q?Peter_M=FCller?=)
Date: Wed, 16 Apr 2008 12:34:51 +0000 (GMT)
Subject: [Bioperl-l] load_seqdatabase.pl --pipeline
Message-ID: <297809.47580.qm@web28203.mail.ukl.yahoo.com>

Dear list,

I want to add gene symbols to UniGene clusters that are stored in a BioSQL database but lack this information.

One way is to write a post-update script:
my $adp = $db->get_object_adaptor('Bio::ClusterI');
my $pseq = $adp->find_by_primary_key(n);   # n: primary key of the cluster
$adp->remove($pseq);
$pseq->gene('symbol');
$adp->store($pseq);
$adp->commit();

OK, this works (though I wonder why the cluster has to be removed first: bug or feature?).

A second way, perhaps:
use the --pipeline option, but it looks like it is usable only for sequence objects (Bio::Factory::SequenceProcessorI), right?

regards
pan



From cjfields at uiuc.edu  Wed Apr 16 15:00:51 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 16 Apr 2008 10:00:51 -0500
Subject: [Bioperl-l] bioperl-microarray: status?
In-Reply-To: <200804161336.16879.heikki@sanbi.ac.za>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
Message-ID: <479BD5A4-9C9A-4733-889D-65942F24A7F3@uiuc.edu>

That would be worth looking into at some point, if anyone's interested  
(though it may be best to build a 'bridging' module).  Wonder if it  
uses BioConductor and, if not, how performance is vs BioConductor?

chris

On Apr 16, 2008, at 6:36 AM, Heikki Lehvaslaiho wrote:

> FYI,
>
> Christoper Jones has just published
> [http://bioinformatics.oxfordjournals.org/cgi/content/short/ 
> 24/8/1102 an
> article in Bioinformatics] about his
> [http://search.cpan.org/perldoc?Microarray Microarray perl module]  
> in CPAN.
>
> (The text added into BioPerl wiki.)
>
> 	-Heikki
>
>
> On Friday 26 January 2007 16:05:01 Chris Fields wrote:
>> Don't know if it's worth it, but could the microarray package be
>> modified so that it deals with data generated from or interacts
>> directly with Bioconductor (i.e. maybe including some specialized
>> bioperl-run set of classes to run Bioconductor tasks, return
>> lightweight bioperl microarray classes)?  Allen pointed out in a
>> previous post that Bioconductor is the best pick for certain tasks,
>> while Perl excels at others:
>>
>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>>
>> Might be nice if we could merge both strengths together in some way.
>>
>> chris
>>
>> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
>>> On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
>>>> Eh, there is some discussion activity on the list, but not much.   
>>>> You
>>>> are really better off moving to Bioconductor.
>>>
>>> Ok, thanks. I added that to the wiki page:
>>>
>>>    http://www.bioperl.org/wiki/Microarray_package
>>>
>>> j
>>> seqlab.net
>>> http://www.bioperl.org/wiki/User:Jhannah
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> -- 
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From j-keller2 at md.northwestern.edu  Wed Apr 16 16:12:27 2008
From: j-keller2 at md.northwestern.edu (Jacob Keller)
Date: Wed, 16 Apr 2008 11:12:27 -0500
Subject: [Bioperl-l] Finding seqs of given domain architecture
In-Reply-To: <200804161336.16879.heikki@sanbi.ac.za>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net><D6030075-C999-464B-A998-3C69346C7FB0@jays.net><AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
Message-ID: <B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>

Hello All,

I am new to this list and not totally sure this is the right forum, so 
please forgive me if this is not the right place to ask the following question: 
I am seeking to get all sequences that have a given domain architecture, or 
at least that contain two given domains. I have thought of a few ways to do 
this.

1. Blast/Psi-blast for each domain, then compare the results for common 
sequences between the two lists, and fetch those. I would need to write a 
(simple) script to do this, but would prefer not to re-invent the wheel.

2. Search with a paradigm sequence of desired architecture/domain 
composition, somehow tweaking the psiblast parameters to find only matches 
over the whole search sequence, thereby finding both desired domains. I am 
not sure how to tweak blast to do this, though.

3. Pfam has this capability, i.e. to show all domains with a given 
architecture, but it is difficult to get at the actual sequences or even a 
list of accession numbers.

Does anybody have any suggestions as to how optimally to get these seq's?

Thanks for your consideration,

Jacob

*******************************************
Jacob Pearson Keller
Northwestern University
Medical Scientist Training Program
Dallos Laboratory
F. Searle 1-240
2240 Campus Drive
Evanston IL 60208
lab: 847.491.2438
cel: 773.608.9185
email: j-keller2 at northwestern.edu
*******************************************

----- Original Message ----- 
From: "Heikki Lehvaslaiho" <heikki at sanbi.ac.za>
To: <bioperl-l at lists.open-bio.org>
Cc: <allenday at ucla.edu>; "Chris Fields" <cjfields at uiuc.edu>; "Jay Hannah" 
<jay at jays.net>; <bioperl-l at bioperl.org>
Sent: Wednesday, April 16, 2008 6:36 AM
Subject: Re: [Bioperl-l] bioperl-microarray: status?


> FYI,
>
> Christoper Jones has just published
> [http://bioinformatics.oxfordjournals.org/cgi/content/short/24/8/1102 an
> article in Bioinformatics] about his
> [http://search.cpan.org/perldoc?Microarray Microarray perl module] in 
> CPAN.
>
> (The text added into BioPerl wiki.)
>
> -Heikki
>
>
> On Friday 26 January 2007 16:05:01 Chris Fields wrote:
>> Don't know if it's worth it, but could the microarray package be
>> modified so that it deals with data generated from or interacts
>> directly with Bioconductor (i.e. maybe including some specialized
>> bioperl-run set of classes to run Bioconductor tasks, return
>> lightweight bioperl microarray classes)?  Allen pointed out in a
>> previous post that Bioconductor is the best pick for certain tasks,
>> while Perl excels at others:
>>
>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>>
>> Might be nice if we could merge both strengths together in some way.
>>
>> chris
>>
>> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
>> > On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
>> >> Eh, there is some discussion activity on the list, but not much.  You
>> >> are really better off moving to Bioconductor.
>> >
>> > Ok, thanks. I added that to the wiki page:
>> >
>> >     http://www.bioperl.org/wiki/Microarray_package
>> >
>> > j
>> > seqlab.net
>> > http://www.bioperl.org/wiki/User:Jhannah
>> >
>> > _______________________________________________
>> > Bioperl-l mailing list
>> > Bioperl-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
>
> -- 
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 


From frederic.romagne at gmail.com  Wed Apr 16 17:25:18 2008
From: frederic.romagne at gmail.com (=?ISO-8859-1?Q?Fr=E9d=E9ric_Romagn=E9?=)
Date: Wed, 16 Apr 2008 12:25:18 -0500
Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
Message-ID: <1208366718.19084.15.camel@kiss-laptop>

Hello,
I wrote a program that uses Bio::Index::GenBank and tested it under
Unix, where it worked well.

But I have to run it under Windows, and there it does not seem to work.

Here is the problem:

my $dbobj = Bio::Index::Abstract->new("Data/$db");
my $seq = $dbobj->get_Seq_by_acc($id);
print $seq->display_id."\n";

does not print the same identifier as $id! So I am not working on the
expected sequence.

I use the SVN sources on Unix and the Perl Package Manager on
Windows.

Thanks.


From cjfields at uiuc.edu  Wed Apr 16 17:52:59 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 16 Apr 2008 12:52:59 -0500
Subject: [Bioperl-l] Finding seqs of given domain architecture
In-Reply-To: <B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net><D6030075-C999-464B-A998-3C69346C7FB0@jays.net><AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
	<B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
Message-ID: <BAA878A0-94B4-481F-B01C-A12086FD41E3@uiuc.edu>

You can try CDART:

http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi?cmd=rps

There are probably other tools out there as well.

If you want to roll your own, you can use bioperl wrappers for all of  
these (Bio::Tools::Run::StandAloneBlast is in bioperl-live,  
Bio::Tools::Run::Hmmer in bioperl-run), tweaking the parameters as you  
see fit, and either parse while running them or store the file for  
parsing later using Bio::SearchIO.  Personally, I wouldn't go with (2)  
unless you are absolutely sure the domains are found only once per  
sequence, are spatially conserved, and don't overlap.  For instance,  
with many proteins you could have a domain structure like dom1-dom2,  
dom2-dom1, dom1-dom1-dom2, etc.
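
For approach (1), the comparison of the two hit lists might look something like the sketch below. It uses only core Perl and assumes tabular BLAST output with the subject ID in the second column. The accessions, scores, and the helper names `hit_ids` and `common_hits` are hypothetical, and in practice the IDs could just as well come from Bio::SearchIO.

```perl
use strict;
use warnings;

# Collect the distinct subject IDs (column 2) from tabular BLAST output.
sub hit_ids {
    my ($tabular) = @_;
    my %ids;
    for my $line (split /\n/, $tabular) {
        next if $line =~ /^\s*(#|$)/;          # skip comments and blank lines
        my (undef, $subject) = split /\t/, $line;
        $ids{$subject} = 1 if defined $subject;
    }
    return \%ids;
}

# Return the sorted IDs present in both hit sets.
sub common_hits {
    my ($first, $second) = @_;
    my @common = grep { exists $second->{$_} } keys %$first;
    return sort @common;
}

# Hypothetical hits for two domain queries (query, subject, % identity).
my $dom1_hits = hit_ids("dom1\tP11111\t45.0\ndom1\tP22222\t38.2\ndom1\tP33333\t30.1\n");
my $dom2_hits = hit_ids("dom2\tP22222\t51.7\ndom2\tP44444\t33.3\ndom2\tP33333\t29.9\n");

print join(',', common_hits($dom1_hits, $dom2_hits)), "\n";   # P22222,P33333
```

This only establishes that both domains occur in a sequence. It says nothing about their order or copy number.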

If you just want accessions from Pfam's Stockholm format (which are  
UniProt, I believe) you can get at accessions using  
Bio::AlignIO::stockholm (using perl 5.10):

use strict;
use warnings;
use feature 'say';
use Bio::AlignIO;

my $file = shift || die "Must pass file as argument\n";

my $in = Bio::AlignIO->new(-format => 'stockholm',
                           -file   => $file);

# Print the accessions of each alignment as one comma-separated line
while (my $aln = $in->next_aln) {
    my @accs;
    for my $seq ($aln->each_seq) {
        push @accs, $seq->accession_number;
    }
    say join(',', @accs);
}

chris

On Apr 16, 2008, at 11:12 AM, Jacob Keller wrote:

> Hello All,
>
> I am new to this list, so am not totally sure this is the right  
> forum, so please forgive if this is not the right place to ask the  
> following question: I am seeking to get all sequences that have a  
> given domain architecture, or at least that contain two given  
> domains. I have thought of a few ways to do this.
>
> 1. Blast/Psi-blast for each domain, then compare the results for  
> common sequences between the two lists, and fetch those. I would  
> need to write a (simple) script to do this, but would prefer not to  
> re-invent the wheel.
>
> 2. Search with a paradigm sequence of desired architecture/domain  
> composition, somehow tweaking the psiblast parameters to find only  
> matches over the whole search sequence, thereby finding both desired  
> domains. I am not sure how to tweak blast to do this, though.
>
> 3. Pfam has this capability, i.e. to show all domains with a given  
> architecture, but it is difficult to get at the actual sequences or  
> even a list of accession numbers.
>
> Does anybody have any suggestions as to how optimally to get these  
> seq's?
>
> Thanks for your consideration,
>
> Jacob
>
> *******************************************
> Jacob Pearson Keller
> Northwestern University
> Medical Scientist Training Program
> Dallos Laboratory
> F. Searle 1-240
> 2240 Campus Drive
> Evanston IL 60208
> lab: 847.491.2438
> cel: 773.608.9185
> email: j-keller2 at northwestern.edu
> *******************************************
>
> ----- Original Message ----- From: "Heikki Lehvaslaiho" <heikki at sanbi.ac.za 
> >
> To: <bioperl-l at lists.open-bio.org>
> Cc: <allenday at ucla.edu>; "Chris Fields" <cjfields at uiuc.edu>; "Jay  
> Hannah" <jay at jays.net>; <bioperl-l at bioperl.org>
> Sent: Wednesday, April 16, 2008 6:36 AM
> Subject: Re: [Bioperl-l] bioperl-microarray: status?
>
>
>> FYI,
>>
>> Christoper Jones has just published
>> [http://bioinformatics.oxfordjournals.org/cgi/content/short/ 
>> 24/8/1102 an
>> article in Bioinformatics] about his
>> [http://search.cpan.org/perldoc?Microarray Microarray perl module]  
>> in CPAN.
>>
>> (The text added into BioPerl wiki.)
>>
>> -Heikki
>>
>>
>> On Friday 26 January 2007 16:05:01 Chris Fields wrote:
>>> Don't know if it's worth it, but could the microarray package be
>>> modified so that it deals with data generated from or interacts
>>> directly with Bioconductor (i.e. maybe including some specialized
>>> bioperl-run set of classes to run Bioconductor tasks, return
>>> lightweight bioperl microarray classes)?  Allen pointed out in a
>>> previous post that Bioconductor is the best pick for certain tasks,
>>> while Perl excels at others:
>>>
>>> http://article.gmane.org/gmane.comp.lang.perl.bio.general/13993
>>>
>>> Might be nice if we could merge both strengths together in some way.
>>>
>>> chris
>>>
>>> On Jan 26, 2007, at 7:26 AM, Jay Hannah wrote:
>>> > On Jan 25, 2007, at 2:30 AM, Allen Day wrote:
>>> >> Eh, there is some discussion activity on the list, but not  
>>> much.  You
>>> >> are really better off moving to Bioconductor.
>>> >
>>> > Ok, thanks. I added that to the wiki page:
>>> >
>>> >     http://www.bioperl.org/wiki/Microarray_package
>>> >
>>> > j
>>> > seqlab.net
>>> > http://www.bioperl.org/wiki/User:Jhannah
>>> >
>>> > _______________________________________________
>>> > Bioperl-l mailing list
>>> > Bioperl-l at lists.open-bio.org
>>> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>> Christopher Fields
>>> Postdoctoral Researcher
>>> Lab of Dr. Robert Switzer
>>> Dept of Biochemistry
>>> University of Illinois Urbana-Champaign
>>>
>>>
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>
>> -- 
>> ______ _/      _/ 
>> _____________________________________________________
>>     _/      _/
>>    _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>>   _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>>  _/  _/  _/  SANBI, South African National Bioinformatics Institute
>> _/  _/  _/  University of Western Cape, South Africa
>>    _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
>> ___ _/_/_/_/_/ 
>> ________________________________________________________
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From David.Messina at sbc.su.se  Wed Apr 16 18:23:27 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Wed, 16 Apr 2008 20:23:27 +0200
Subject: [Bioperl-l] Finding seqs of given domain architecture
In-Reply-To: <B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
	<B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
Message-ID: <628aabb70804161123s453bd96bqd2213b938dfdb3a2@mail.gmail.com>

Hey Jacob,

This forum is mostly geared toward the BioPerl software package rather than
general bioinformatics assistance.

That being said, I would recommend using Pfam's Sequence Search to determine
the domain content of your sequences and then simply looking at those which
have the same two domains of interest.

If there are more sequences matching this criterion than can be examined
manually, you could write up something (potentially using BioPerl) to then
look at the relative order and number of those domains in your sequences.
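
A minimal sketch of that second step, in plain Perl with no BioPerl
dependency. It assumes the domain hits have already been parsed into
records; the field names (seq, domain, start), the sequence names, and the
hit coordinates below are all made up for illustration, and the Pfam
accessions are only placeholders.

```perl
use strict;
use warnings;

# Hypothetical, hand-made domain hits (not real Pfam output):
# one record per domain occurrence, with its start coordinate.
my @hits = (
    { seq => 'seqA', domain => 'PF00069', start => 120 },
    { seq => 'seqA', domain => 'PF00017', start => 10  },
    { seq => 'seqB', domain => 'PF00069', start => 5   },
    { seq => 'seqB', domain => 'PF00017', start => 300 },
    { seq => 'seqC', domain => 'PF00017', start => 15  },
    { seq => 'seqC', domain => 'PF00069', start => 140 },
);

# Return a hash ref mapping each sequence to its architecture string:
# domain names joined in N- to C-terminal order.
sub architectures {
    my @records = @_;
    my %by_seq;
    push @{ $by_seq{ $_->{seq} } }, $_ for @records;

    my %arch;
    for my $seq (keys %by_seq) {
        my @ordered = sort { $a->{start} <=> $b->{start} } @{ $by_seq{$seq} };
        $arch{$seq} = join '-', map { $_->{domain} } @ordered;
    }
    return \%arch;
}

# Return the sorted names of sequences whose architecture equals $wanted.
sub seqs_with_architecture {
    my ($wanted, @records) = @_;
    my $arch  = architectures(@records);
    my @names = sort(grep { $arch->{$_} eq $wanted } keys %$arch);
    return @names;
}

print join(',', seqs_with_architecture('PF00017-PF00069', @hits)), "\n";
```

Because the architecture string lists every occurrence in order, repeated
domains show up as repeated names, so both order and copy number are
compared at once.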

However, if these sequences have UniProt IDs, you can start with the domains
and Pfam will hand you a list of all the UniProt seqs having those domains.
On the Pfam website's main page, click on "Help" (right side of menu at the
top of the page) and then "Tools and Services" (left side menu).


Dave


From Russell.Smithies at agresearch.co.nz  Wed Apr 16 20:49:49 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Thu, 17 Apr 2008 08:49:49 +1200
Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
In-Reply-To: <1208366718.19084.15.camel@kiss-laptop>
References: <1208366718.19084.15.camel@kiss-laptop>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>

Did you check the format of your input file?
i.e. DOS or UNIX line endings?
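
A quick way to check, sketched in plain Perl. The sub names are made up for
this example. The reasoning behind it is an assumption on my part: a
flat-file index that records byte offsets would no longer match a file
whose line endings were converted after indexing.

```perl
use strict;
use warnings;

# Classify the line endings found in a file: returns 'dos', 'unix',
# 'mixed', or 'none' (no line terminator at all).
sub line_ending_style {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;    # stop Windows perl from translating CRLF to LF on read
    my ($crlf, $lf) = (0, 0);
    while (my $line = <$fh>) {
        if    ($line =~ /\r\n\z/) { $crlf++ }
        elsif ($line =~ /\n\z/)   { $lf++ }
    }
    close $fh;
    return 'mixed' if $crlf && $lf;
    return 'dos'   if $crlf;
    return 'unix'  if $lf;
    return 'none';
}

# chomp removes only "\n" by default, so an ID read from a DOS-format
# file on Unix keeps a trailing "\r"; this strips either terminator.
sub clean_id {
    my ($id) = @_;
    $id =~ s/[\r\n]+\z//;
    return $id;
}

print line_ending_style($ARGV[0]), "\n" if @ARGV;
```

The same check is worth running on the file the accession numbers are read
from, since a stray "\r" on $id would also make the lookup miss.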

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Frédéric Romagné
> Sent: Thursday, 17 April 2008 5:25 a.m.
> To: bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
> 
> Hello,
> I wrote a program that uses Bio::Index::GenBank and tested it under
> Unix, where it worked well.
> 
> But I have to run it under Windows, and there it does not seem to work.
> 
> Here is the problem:
> 
> my $dbobj = Bio::Index::Abstract->new("Data/$db");
> my $seq = $dbobj->get_Seq_by_acc($id);
> print $seq->display_id."\n";
> 
> does not print the same identifier as $id! So I am not working on the
> expected sequence...
> 
> I use the SVN sources on Unix and the Perl Package Manager on
> Windows...
> 
> Thanks.
> 

=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From frederic.romagne at gmail.com  Wed Apr 16 21:39:07 2008
From: frederic.romagne at gmail.com (=?ISO-8859-1?Q?Fr=E9d=E9ric_Romagn=E9?=)
Date: Wed, 16 Apr 2008 16:39:07 -0500
Subject: [Bioperl-l] index::abstract on win and unix
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
References: <1208366718.19084.15.camel@kiss-laptop>
	<D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
Message-ID: <1208381947.16620.6.camel@kiss-laptop>

Well, if by input file you mean the database used, it was created
with Bio::Index::GenBank from a GenBank file downloaded from the NCBI FTP site.

$id is an accession number read from a file, but I chomp the line...

I am trying to install the SVN version of BioPerl under Windows to see
if there is an improvement.

On Thursday, 17 April 2008 at 08:49 +1200, Smithies, Russell wrote:
> Did you check the format of your input file?
> i.e. DOS or UNIX line endings?
> 


From hubert.gaynor at yahoo.com  Thu Apr 17 06:19:11 2008
From: hubert.gaynor at yahoo.com (Hubert Gaynor)
Date: Wed, 16 Apr 2008 23:19:11 -0700 (PDT)
Subject: [Bioperl-l] Can I use BLAST against a database like MySQL
Message-ID: <657734.41592.qm@web46008.mail.sp1.yahoo.com>

Hi,

As far as I know, before using BLAST to do an alignment, the first thing to do is run formatdb to construct a database. But I was wondering whether it is possible to construct the database with MySQL instead, which might make the BLAST search faster and the database management much easier?

Thanks!

Hubert.




From sdavis2 at mail.nih.gov  Thu Apr 17 10:36:32 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Thu, 17 Apr 2008 06:36:32 -0400
Subject: [Bioperl-l] Can I use BLAST against a database like MySQL
In-Reply-To: <657734.41592.qm@web46008.mail.sp1.yahoo.com>
References: <657734.41592.qm@web46008.mail.sp1.yahoo.com>
Message-ID: <264855a00804170336o2a2bcff9xfcb05a33bac4c8dc@mail.gmail.com>

On Thu, Apr 17, 2008 at 2:19 AM, Hubert Gaynor <hubert.gaynor at yahoo.com> wrote:
> Hi,
>
>  As far as I know, before using BLAST to do an alignment, the first thing to do is run formatdb to construct a database. But I was wondering whether it is possible to construct the database with MySQL instead, which might make the BLAST search faster and the database management much easier?
>

formatdb builds a representation of the sequences that blast can
search efficiently.  That representation is already what makes blast
fast, and MySQL can't be used in its place.  As for speeding up blast,
if you have a multiprocessor machine, you can take advantage of it by
increasing the number of processors blast uses (the -a option of
blastall).  Also, while blast is a very versatile program, it is not
the only alignment program available.  Depending on your needs, you
could look at other programs such as blat or gmap, which can be 2-3
orders of magnitude faster than blast.

Sean


From stefan.kirov at bms.com  Thu Apr 17 13:40:29 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 09:40:29 -0400
Subject: [Bioperl-l] bioperl-db woes
Message-ID: <4807534D.80105@bms.com>

I'm having problems passing all the tests for bioperl-db. There are two
distinct errors. The first one:
Can't locate Bio/DB/BioSQL/RichSeqAdaptor.pm
   ***This, by the way, is buried under several layers of eval, so the
error I actually get from the test is:
    ***t/04swiss.........ok 3/52Can't locate object method "get_dbxrefs"
via package "Bio::Ontology::Term" at
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm
line 552, <GEN0> line 78.
       or
       ------------- EXCEPTION: Bio::Root::Exception -------------

    MSG: Annotation of class Bio::Annotation::Collection not
    type-mapped. Internal error?
    STACK: Error::throw
    STACK: Bio::Root::Root::throw
    /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
    STACK:
    Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
    STACK: Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
    STACK: Bio::DB::Persistent::PersistentObject::store
    Bio/DB/Persistent/PersistentObject.pm:271
    STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
    Bio/DB/BioSQL/SeqAdaptor.pm:224
    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
    STACK: Bio::DB::Persistent::PersistentObject::create
    Bio/DB/Persistent/PersistentObject.pm:244
    STACK: t/04swiss.t:36
    -----------------------------------------------------------

It turns out the adaptor is really not there???
My bioperl-db is from
dev.open-bio.org/home/svn-repositories/bioperl/bioperl-db/trunk
bioperl-db (revision 14661)
Is this module being deprecated (I am sure it is not), or is my download
incomplete?
The other problem was:
DBD::Oracle::st execute failed: ORA-02292: integrity constraint
(BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
ParamValues: :p1=9606] at
/home/kirovs/bioperl-db/blib/lib/Bio/DB/BioSQL/BasePersistenceAdaptor.pm
line 320.
not ok 76
# Test 76 got: <UNDEF> (t/02species.t at line 71)
I have not tried to debug this one....
Thanks!
Stefan


From stefan.kirov at bms.com  Thu Apr 17 14:18:30 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 10:18:30 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
Message-ID: <Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>


On Thu, 17 Apr 2008, Chris Fields wrote:

> The 'get_dbxrefs' problem looks related to recent changes I made when rolling 
> back the significant feature/annotation changes introduced just prior to the 
> 1.5 release, none of which were fully implemented.  I can check that one out. 
> Odd though; these passed for me, but I'm using MySQL, not Oracle.
get_dbxrefs is not the problem; I think the error message is misleading:
kirovs at horta:~/bioperl-db> grep get_dbxrefs 
/home/kirovs/bioperl-live/Bio/Ontology/Term.pm
            get_dbxrefs() instead, which handles both strings and DBLink
                       "Use get_dbxrefs() instead");
     $self->get_dbxrefs($context);
=head2 get_dbxrefs
  Title   : get_dbxrefs()
  Usage   : @ds = $term->get_dbxrefs();
sub get_dbxrefs {
} # get_dbxrefs
     my @old = $self->get_dbxrefs($context);
sub each_dblink {shift->throw("use of each_dblink() is deprecated; use 
get_dbxrefs() instead")}

So it is there.
In any case I debugged and tracked that down to the RichSeq adaptor module 
missing. It is not in the distro I downloaded, so I think this is my 
problem. It is a different question why...
I looked at different repos (SVN, CVS, trunk, different tags) and I did 
not see RichSeq.pm. I am not sure what is going on. Perhaps Hilmar will be 
able to help when he is around.
Thanks for the help Chris.... 
Stefan

>
> You may want to make sure you are using bioperl-live and that there isn't an 
> older bioperl installation getting into the mix.
>
> chris
>


From cjfields at uiuc.edu  Thu Apr 17 13:59:57 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 17 Apr 2008 08:59:57 -0500
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <4807534D.80105@bms.com>
References: <4807534D.80105@bms.com>
Message-ID: <82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>

The 'get_dbxrefs' problem looks related to recent changes I made when  
rolling back the significant feature/annotation changes introduced  
just prior to the 1.5 release, none of which were fully implemented.  I  
can check that one out.  Odd though; these passed for me, but I'm  
using MySQL, not Oracle.

You may want to make sure you are using bioperl-live and that there  
isn't an older bioperl installation getting into the mix.
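
One way to check for that, sketched in plain Perl: list every copy of a
module that is visible on @INC. The sub name is made up for this example,
and Bio::Root::Root is used as the default only because it appears in the
stack traces in this thread.

```perl
use strict;
use warnings;
use File::Spec;

# Return every file in the given directories that provides $module
# (e.g. 'Bio::Root::Root'), in search order. The first one listed is
# the copy perl would actually load.
sub find_module_copies {
    my ($module, @dirs) = @_;
    my @parts = split /::/, $module;
    $parts[-1] .= '.pm';
    my @found;
    for my $dir (@dirs) {
        next if ref $dir;    # skip code-ref hooks that may sit in @INC
        my $path = File::Spec->catfile($dir, @parts);
        push @found, $path if -f $path;
    }
    return @found;
}

my $module = shift(@ARGV) || 'Bio::Root::Root';
my @copies = find_module_copies($module, @INC);
print scalar(@copies), " copy/copies of $module found\n";
print "  $_\n" for @copies;
```

Run it with the same PERL5LIB and -I settings as the test suite. More than
one line of output means an older installation may be shadowing, or being
shadowed by, the checkout.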

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From stefan.kirov at bms.com  Thu Apr 17 14:52:32 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 10:52:32 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <9ECDEB39-95F3-4A94-9AF7-FFEBBDEFF0FA@gmx.net>
References: <4807534D.80105@bms.com>
	<9ECDEB39-95F3-4A94-9AF7-FFEBBDEFF0FA@gmx.net>
Message-ID: <Pine.WNT.4.64.0804171052070.2732@A161887.one.ads.bms.com>

That is correct and I assumed I should not be concerned with this error.
Thanks
Stefan

On Thu, 17 Apr 2008, Hilmar Lapp wrote:

>
> On Apr 17, 2008, at 9:40 AM, Stefan Kirov wrote:
>> The other problem was:
>> DBD::Oracle::st execute failed: ORA-02292: integrity constraint
>> (BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
>> OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
>> ParamValues: :p1=9606] at
>
>
> This sounds like you are running the tests against a non-empty database?
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
>


From hlapp at gmx.net  Thu Apr 17 14:47:58 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 17 Apr 2008 10:47:58 -0400
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
Message-ID: <2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>


On Apr 17, 2008, at 10:18 AM, Stefan Kirov wrote:
> In any case I debugged and tracked that down to the RichSeq adaptor  
> module missing.


That almost can't be the problem. Every Bio::Seq::RichSeq is-a  
Bio::Seq and a SeqAdaptor is present.

I'm afraid it gets stuck somewhere else and frankly I didn't see the  
RichSeqAdaptor failing to load in your stack trace:

>        ------------- EXCEPTION: Bio::Root::Exception -------------
>
>     MSG: Annotation of class Bio::Annotation::Collection not
>     type-mapped. Internal error?
>     STACK: Error::throw
>     STACK: Bio::Root::Root::throw
>     /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>     STACK:
>     Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>     Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>     STACK:  
> Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>     Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>     STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>     Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>     STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>     Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>     STACK: Bio::DB::Persistent::PersistentObject::store
>     Bio/DB/Persistent/PersistentObject.pm:271
>     STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>     Bio/DB/BioSQL/SeqAdaptor.pm:224
>     STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>     Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>     STACK: Bio::DB::Persistent::PersistentObject::create
>     Bio/DB/Persistent/PersistentObject.pm:244
>     STACK: t/04swiss.t:36
>     -----------------------------------------------------------

What that tells me is that when bioperl-db tries to store the  
annotation bundle of the (SwissProt) sequence, one of the annotations  
that it encounters is of type Bio::Annotation::Collection. At present  
bioperl-db doesn't know what to do with it; i.e., bioperl-db can't  
yet handle hierarchical annotation collections (collections within  
collections).

I believe this is due to recent changes in how the GN line is parsed  
in BioPerl - Chris, does this ring the right bell? I thought, though,  
that you had built in a method that would allow flattening it out?

It's worth noting that BioSQL itself can't really represent nested  
annotation collections other than by using ontology terms and their  
hierarchy, which at present I think isn't really appropriate, but I  
have to think through the issue more. In other words, in BioSQL you  
can't directly tie together a bunch of qualifier value pairs into a  
"bag" and then nest this bag within another. The way to make this  
work with the current schema is to flatten out the nesting.
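
To illustrate what flattening means here, a minimal sketch using plain Perl
hashes and arrays as a stand-in for nested collections. This is not what
bioperl-db or BioSQL does. The dotted-path key naming and the sample gene
names are made up for the example.

```perl
use strict;
use warnings;

# Flatten a nested hash/array structure into a list of [key, value]
# pairs. Nesting is encoded in the key as a dotted path; this naming
# scheme is an illustration, not what bioperl-db or BioSQL does.
sub flatten {
    my ($node, $prefix) = @_;
    my @pairs;
    if (ref $node eq 'HASH') {
        for my $key (sort keys %$node) {
            my $path = defined $prefix ? "$prefix.$key" : $key;
            push @pairs, flatten($node->{$key}, $path);
        }
    }
    elsif (ref $node eq 'ARRAY') {
        for my $i (0 .. $#$node) {
            my $path = defined $prefix ? "$prefix.$i" : $i;
            push @pairs, flatten($node->[$i], $path);
        }
    }
    else {
        # Leaf value: emit one flat qualifier/value pair.
        push @pairs, [ $prefix, $node ];
    }
    return @pairs;
}

# Made-up gene name annotation with one level of nesting per gene.
my $gene_names = {
    gene_name => [
        { Name => 'abcA', Synonyms => [ 'abc1', 'abc2' ] },
        { Name => 'xyzB' },
    ],
};

print "$_->[0]\t$_->[1]\n" for flatten($gene_names);
```

Each resulting pair could be stored as an ordinary qualifier/value row,
with the path in the key carrying the grouping that the nesting used to
express.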

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From hlapp at gmx.net  Thu Apr 17 14:48:52 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Thu, 17 Apr 2008 10:48:52 -0400
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <4807534D.80105@bms.com>
References: <4807534D.80105@bms.com>
Message-ID: <9ECDEB39-95F3-4A94-9AF7-FFEBBDEFF0FA@gmx.net>


On Apr 17, 2008, at 9:40 AM, Stefan Kirov wrote:
> The other problem was:
> DBD::Oracle::st execute failed: ORA-02292: integrity constraint
> (BIOSQL.FKTAX_ENT) violated - child record found (DBD ERROR:
> OCIStmtExecute) [for Statement "DELETE FROM taxon WHERE oid = ?" with
> ParamValues: :p1=9606] at


This sounds like you are running the tests against a non-empty database?

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From stefan.kirov at bms.com  Thu Apr 17 15:28:42 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 11:28:42 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
Message-ID: <Pine.WNT.4.64.0804171052430.2732@A161887.one.ads.bms.com>

Hilmar,
I think I see what happens with this adaptor:
Bio::DB::BioSQL::DBAdaptor::_load_object_adaptor (called from 
create_persistent) requests that this module be loaded:
Bio/DB/BioSQL/RichSeqAdaptor.pm
There is no such module... The load always fails, but since it happens 
inside an eval, no actual error is raised. Perhaps this is a leftover...?
This got me fooled...
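
For anyone else who gets fooled by this, a minimal sketch of the general
Perl behaviour. It is not the actual bioperl-db code; the sub name and the
module name are made up. A require that fails inside eval only sets $@, so
nothing is reported unless the caller looks at it.

```perl
use strict;
use warnings;

# Try to load a module by name. Returns (1, '') on success and
# (0, $error) on failure, so the caller can decide whether to report it.
sub try_load {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    my $ok = eval { require $file; 1 };
    return $ok ? (1, '') : (0, $@);
}

# Hypothetical module name, chosen so that the load fails.
my ($ok, $err) = try_load('My::Missing::Adaptor');
print "a caller that ignores the error would silently move on here\n"
    unless $ok;
print "captured error: $err" unless $ok;
```

A loader that tries several candidate class names in turn can reasonably
ignore such failures, which would explain why the "Can't locate" message
only turns up when stepping through in the debugger.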

I guess Chris could be right-
  Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key is 
being passed Bio::Annotation::Collection as a value for $obj->obj(). Or 
recursing too far?
Anyway, I am just guessing here- I do not know the architecture of 
bioperl-db...
Thanks again for the help...
Stefan



From cjfields at uiuc.edu  Thu Apr 17 16:26:41 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 17 Apr 2008 11:26:41 -0500
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
Message-ID: <AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>


On Apr 17, 2008, at 9:47 AM, Hilmar Lapp wrote:

>
> On Apr 17, 2008, at 10:18 AM, Stefan Kirov wrote:
>> In any case I debugged and tracked that down to the RichSeq adaptor  
>> module missing.
>
>
> That almost can't be the problem. Every Bio::Seq::RichSeq is-a  
> Bio::Seq and a SeqAdaptor is present.
>
> I'm afraid it gets stuck somewhere else and frankly I didn't see the  
> RichSeqAdaptor failing to load in your stack trace:
>
>>       ------------- EXCEPTION: Bio::Root::Exception -------------
>>
>>    MSG: Annotation of class Bio::Annotation::Collection not
>>    type-mapped. Internal error?
>>    STACK: Error::throw
>>    STACK: Bio::Root::Root::throw
>>    /home/kirovs/bioperl-live/Bio/Root/Root.pm:357
>>    STACK:
>>    Bio::DB::BioSQL::AnnotationCollectionAdaptor::_annotation_map_key
>>    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:695
>>    STACK:  
>> Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children
>>    Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:204
>>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store
>>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
>>    STACK: Bio::DB::Persistent::PersistentObject::store
>>    Bio/DB/Persistent/PersistentObject.pm:271
>>    STACK: Bio::DB::BioSQL::SeqAdaptor::store_children
>>    Bio/DB/BioSQL/SeqAdaptor.pm:224
>>    STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create
>>    Bio/DB/BioSQL/BasePersistenceAdaptor.pm:214
>>    STACK: Bio::DB::Persistent::PersistentObject::create
>>    Bio/DB/Persistent/PersistentObject.pm:244
>>    STACK: t/04swiss.t:36
>>    -----------------------------------------------------------
>
> What that tells me is that when bioperl-db tries to store the  
> annotation bundle of the (SwissProt) sequence, one of the  
> annotations that it encounters is of type  
> Bio::Annotation::Collection. At present bioperl-db doesn't know what  
> to do with it; i.e., bioperl-db can't yet handle hierarchical  
> annotation collections (collections within collections).
>
> I believe this is due to recent changes in how the GN line is parsed  
> in BioPerl - Chris, does this ring the right bell? I thought, though,  
> that you had built in a method that would allow flattening it out?

This appears to be using an older bioperl-live checkout, one where  
Heikki changed GN parsing to use a nested Annotation::Collection.  I  
reverted that in a later commit to svn, specifically because of  
bioperl-db problems.  bioperl-live's swiss.pm now uses a new subclass  
of Bio::Annotation::SimpleValue (Bio::Annotation::TagTree) that  
represents nested values via Data::Stag's itext output (we can change  
that to an alternative if needed).

Here are the last few relevant revisions in bioperl-live's main trunk  
(mine is the latest):

------------------------------------------------------------------------
r14562 | cjfields | 2008-02-28 08:30:05 -0600 (Thu, 28 Feb 2008) | 1 line

bug 1825: updating swiss.pm/tests to try out TagTree (passes all  
tests).  Need to update Handler.t and related modules still...
------------------------------------------------------------------------
r14541 | heikki | 2008-02-25 00:10:48 -0600 (Mon, 25 Feb 2008) | 1 line

documentation for the GN line parsing and management
------------------------------------------------------------------------
r14538 | heikki | 2008-02-23 08:48:23 -0600 (Sat, 23 Feb 2008) | 1 line

GN (Gene Name) line parsing rewrite. Breaks backward compatibility.  
Can now deal with >1 gene per entry and four categories of names per  
gene. Parses old style syntax (...OR ... OR ... ) into one gene name  
and synonyms for each gene. Docs to follow.

....

I just updated all code from dev and reran bioperl-db tests w/o  
problems.  Maybe someone else could do the same to see what happens?

> It's worth noting that BioSQL itself can't really represent nested  
> annotation collections other than by using ontology terms and their  
> hierarchy, which at present I think isn't really appropriate, but I  
> have to think through the issue more. In other words, in BioSQL you  
> can't directly tie together a bunch of qualifier value pairs into a  
> "bag" and then nest this bag within another. The way to make this  
> work with the current schema is to flatten out the nesting.
>
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================

Might be worth looking into for a future BioSQL release, but we have a  
decent workaround in place for now, as long as it works cross-platform  
and cross-RDB.

chris


From stefan.kirov at bms.com  Thu Apr 17 16:40:14 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 12:40:14 -0400 (Eastern Daylight Time)
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
	<AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
Message-ID: <Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>

Hilmar,
sorry, I missed the part after the stack trace... In any case this is
still a problem for me after I updated bioperl-live.
I see this with a number of other tests:
t/04swiss.........ok 3/52Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 78.
t/04swiss.........dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 6-52
         Failed 47/52 tests, 9.62% okay
t/05seqfeature....ok 4/48Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 72.
t/05seqfeature....FAILED tests 9-48
         Failed 40/48 tests, 16.67% okay
t/06comment.......ok
t/07dblink........ok
t/08genbank.......ok
t/09fuzzy2........ok
t/10ensembl.......ok 1/15Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 1420.
t/10ensembl.......dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 3-15
         Failed 13/15 tests, 13.33% okay
t/11locuslink.....ok 4/110Can't locate object method "get_dbxrefs" via 
package "Bio::Annotation::OntologyTerm" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 1.
t/11locuslink.....dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 5-110
         Failed 106/110 tests, 3.64% okay
t/12ontology......ok 1/738Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::GOterm" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 98.
t/12ontology......dubious
         Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 5-738
         Failed 734/738 tests, 0.54% okay
t/13remove........ok 2/59Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 145.
t/13remove........FAILED tests 11-59
         Failed 49/59 tests, 16.95% okay
t/14query.........ok
t/15cluster.......ok 3/160Can't locate object method "get_dbxrefs" via 
package "Bio::Ontology::Term" at 
/home/kirovs/bioperl-db/blib/lib/Bio/DB/Persistent/PersistentObject.pm 
line 552, <GEN0> line 1.
t/15cluster.......dubious
         Test returned status 2 (wstat 512, 0x200)
DIED. FAILED tests 6-160
         Failed 155/160 tests, 3.12% okay
t/16obda..........ok



From cjfields at uiuc.edu  Thu Apr 17 17:06:39 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Thu, 17 Apr 2008 12:06:39 -0500
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
	<AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
	<Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>
Message-ID: <C7A53063-2126-40E2-8A79-BED49D7FE98A@uiuc.edu>

Stefan,

'get_dbxrefs' was introduced in bioperl-live a while back during the  
feature/annotation rollback detailed here:

http://www.bioperl.org/wiki/Feature_Annotation_rollback

I still think an old bioperl (and maybe bioperl-db) installation is
interfering and causing the problems; I had similar issues at one
point and had to find and remove the old installation.  It might be
worth (1) checking 'perldoc -l Bio::Root::Root', which prints the
location of the Bio::Root::Root that is actually picked up from the
lib path, and (2) using './Build install uninst=1' to remove any old
bioperl/bioperl-db installations.

chris


Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From stefan.kirov at bms.com  Thu Apr 17 17:52:22 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Thu, 17 Apr 2008 13:52:22 -0400
Subject: [Bioperl-l] bioperl-db woes
In-Reply-To: <C7A53063-2126-40E2-8A79-BED49D7FE98A@uiuc.edu>
References: <4807534D.80105@bms.com>
	<82B3844B-A133-4AF3-9F08-774730F9B44C@uiuc.edu>
	<Pine.WNT.4.64.0804171008391.2732@A161887.one.ads.bms.com>
	<2D6AEAD9-286C-4F3F-8992-1778847708A8@gmx.net>
	<AD4E6AA1-454F-4AFE-A1D9-4DC5AEF820FB@uiuc.edu>
	<Pine.WNT.4.64.0804171236290.2732@A161887.one.ads.bms.com>
	<C7A53063-2126-40E2-8A79-BED49D7FE98A@uiuc.edu>
Message-ID: <48078E56.9000404@bms.com>

Chris Fields wrote:
> Stefan,
>
> 'get_dbxrefs' was introduced in bioperl-live a while back during the
> feature/annotation rollback detailed here:
>
> http://www.bioperl.org/wiki/Feature_Annotation_rollback
>
Chris was right!
> I still think this is an interfering old bioperl (and maybe
> bioperl-db) installation causing the problems; I had similar issues at
> one point and had to find and remove the old installation.  It might
> be worth (1) checking 'perldoc -l Bio::Root::Root',
This is the first thing I did and it seemed fine from the command line.
So I checked out a fresh copy (rather than updating) and set PERL5LIB to
the minimum necessary (Build changes @INC), which is
/home/kirovs/bioperl-db/bplive:/stf/sysdev/perl/newlib/perl/lib/5.8/ia64-linux-multi/
(/home/kirovs/bioperl-db/bplive being the fresh copy and the other
having Module::Build, etc., but definitely no bioperl).
This fixed the problem. I still do not see where the old module came
from, but that was a really good guess.
Thanks
Stefan
> which will give the location of the Bio::Root::Root in lib path being
> used, and (2) using './Build install uninst=1' to remove any old
> bioperl/bioperl-db installations.
Unfortunately this is not an option for me.
>
> chris


From hubert.gaynor at yahoo.com  Fri Apr 18 00:53:16 2008
From: hubert.gaynor at yahoo.com (Hubert Gaynor)
Date: Thu, 17 Apr 2008 17:53:16 -0700 (PDT)
Subject: [Bioperl-l] Can I use BLAST against a database like MySQL
Message-ID: <130971.67684.qm@web46007.mail.sp1.yahoo.com>

Hi Sean,

I got it. Thank you so much!

Hubert

----- Original Message ----
From: Sean Davis <sdavis2 at mail.nih.gov>
To: Hubert Gaynor <hubert.gaynor at yahoo.com>
Sent: Thursday, April 17, 2008 6:36:02 PM
Subject: Re: [Bioperl-l] Can I use BLAST against a database like MySQL

On Thu, Apr 17, 2008 at 2:19 AM, Hubert Gaynor <hubert.gaynor at yahoo.com> wrote:
> Hi,
>
>  As far as I know, before using BLAST to do an alignment, the first step is to run formatdb to construct a database. But I was wondering whether it is possible to construct the database with MySQL instead, which would probably make the BLAST search faster and the database management much easier?
>

formatdb is used to build a representation of the sequences that blast
can search efficiently.  That representation already makes blast faster.
MySQL can't be used for such things.  As for speeding up blast, if you
have a multiprocessor machine, you can take advantage of it by
increasing the number of processors blast uses.  Also, while blast is a
very versatile program, it is not the only alignment program available.
Depending on your needs, you could look at other programs such as blat
or gmap that can be 2-3 orders of magnitude faster than blast.
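
A rough sketch of that workflow, assuming the legacy NCBI formatdb and
blastall programs are installed. The file names and the processor count
are placeholders, not anything from this thread. If the tools or the
input files are missing, the script only prints the commands it would
have run.

```shell
#!/bin/sh
# Placeholder inputs (hypothetical names); adjust to your own data.
db=proteins.fasta
query=query.fasta
cpus=4

# Build both commands as strings so they can be inspected before running.
# formatdb: -i input FASTA, -p T marks it as a protein database.
format_cmd="formatdb -i $db -p T"
# blastall: -p program, -d database, -i query, -a number of processors,
# -o output file.
blast_cmd="blastall -p blastp -d $db -i $query -a $cpus -o hits.txt"

if command -v formatdb >/dev/null 2>&1 \
   && command -v blastall >/dev/null 2>&1 \
   && [ -f "$db" ] && [ -f "$query" ]; then
    # Index the database once, then search it using several processors.
    $format_cmd && $blast_cmd
else
    # Tools or input files not available: show the commands instead.
    echo "$format_cmd"
    echo "$blast_cmd"
fi
```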

Sean




From Russell.Smithies at agresearch.co.nz  Fri Apr 18 01:39:23 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Fri, 18 Apr 2008 13:39:23 +1200
Subject: [Bioperl-l] accessing params for custom glyphs?
In-Reply-To: <130971.67684.qm@web46007.mail.sp1.yahoo.com>
References: <130971.67684.qm@web46007.mail.sp1.yahoo.com>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06C75E14@imail.agresearch.co.nz>

This is probably more of a Perl OO problem I'm having, but can anyone
tell me how to access a parameter when I create a custom glyph?

I've created a panel in the usual way then I add a feature with
'my_glyph' and want to pass the value of -new_parameter to the glyph
drawing code.

    $panel->add_track( $feature,
                       -font          => gdSmallFont,
                       -glyph         => 'my_glyph',
                       -height        => 10,
                       -label         => 1,
                       -strand        => "forward",
                       -new_parameter => "test",
    );


In my_glyph.pm, I have the usual draw_component sub:

sub draw_component {
  my $self = shift;
  my $gd = shift;
  my ($x1,$y1,$x2,$y2) = $self->bounds(@_);
  my $fg = $self->fgcolor;
  # <<--- how do I access the value of "new_parameter" set in the
  # panel drawing code?
  my $params = $self->??????????;

  $gd->line($x1,$y1,$x2,$y2,$fg);
  $gd->line($x1,$y2,$x2,$y1,$fg);

}

Any ideas?

Thanx,

Russell	Smithies			


From David.Messina at sbc.su.se  Fri Apr 18 09:31:59 2008
From: David.Messina at sbc.su.se (Dave Messina)
Date: Fri, 18 Apr 2008 11:31:59 +0200
Subject: [Bioperl-l]  Finding seqs of given domain architecture
In-Reply-To: <628aabb70804170155n4e5dfd81r7020c3e9e11094ff@mail.gmail.com>
References: <829F02EC-F827-485E-82F8-9EFEA0332C77@jays.net>
	<D6030075-C999-464B-A998-3C69346C7FB0@jays.net>
	<AAF485B9-3E64-4C3E-A43E-880F738C9E19@uiuc.edu>
	<200804161336.16879.heikki@sanbi.ac.za>
	<B0DB3E97861C46AFBE4E76CC8FCBF89F@leah>
	<628aabb70804161112o6610ee1fkfb4b08e74730237d@mail.gmail.com>
	<1208420674.23342.15.camel@razor.sbc.su.se>
	<628aabb70804170155n4e5dfd81r7020c3e9e11094ff@mail.gmail.com>
Message-ID: <628aabb70804180231p2b9cef9dwd5441e85c31531fd@mail.gmail.com>

Jacob,

I talked about your question with a colleague of mine who has been working
in this area. Below is his reply.

[I'm reposting this *without* the attachment mentioned since the mailing
list wouldn't accept it otherwise. If anyone wants a copy of the code, just
email me.]

Dave

-------

> 3. Pfam has this capability, i.e. to show all domains with a given
> architecture, but it is difficult to get at the actual sequences or
> even a list of accession numbers.

First, this should be available right away in PfamAlyzer:

http://pfamalyzer.sbc.su.se/pfamalyzer/index.html

although you might need to upgrade your browser to Java 1.6 to get it to
work.

If this does not work as suggested (an upgraded version is coming
eventually), have a look at the file:

ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/swisspfam.gz

which contains the Pfam architectures for all UniProt sequences. You can
parse that to get a file of <accession number>-<list of domains>
correspondences and just filter that to get the accession numbers.
(Please find attached a Perl script to do just that.)

Under UNIX, you can then just grep this for the domain IDs,

(like grep PF00008 domainArchitectureFile.txt | grep PF00456 >
resultFile.txt)

but I am sure there are solutions under other operating systems as well.
You could then write a script to parse out the corresponding sequences
from the UniProt fasta flatfile, if you wanted, or (again under UNIX) a
script to wget them from the web page.
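
For readers without grep, the same filter can be sketched in a few
lines of Perl. This is only an illustration, not the attached script:
the line layout "<accession>-<dom1>,<dom2>,..." and the sub name
accessions_with_domains are assumptions made for the example, so adjust
the split patterns to whatever your parsed file really looks like
(isoform accessions that contain a hyphen themselves would need a
different separator).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the accessions whose architecture contains every wanted domain.
# Assumed (hypothetical) line format: "<accession>-<dom1>,<dom2>,..."
sub accessions_with_domains {
    my ($lines, @wanted) = @_;
    my @hits;
    for my $line (@$lines) {
        chomp(my $copy = $line);
        my ($acc, $list) = split /-/, $copy, 2;
        next unless defined $list;                 # skip malformed lines
        my %have = map { $_ => 1 } split /,/, $list;
        # keep the accession only if no wanted domain is missing
        push @hits, $acc unless grep { !$have{$_} } @wanted;
    }
    return @hits;
}

# Made-up demo data, not real UniProt/Pfam assignments
my @demo = (
    "P00001-PF00008,PF00456\n",
    "P00002-PF00008\n",
    "P00003-PF00456,PF00008,PF00069\n",
);
print "$_\n" for accessions_with_domains(\@demo, 'PF00008', 'PF00456');
```

Unlike the chained grep, this matches whole domain IDs rather than
substrings anywhere on the line.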

In case your sequences are not in UniProt, consider using HMMER and the
Pfam HMM files to assign domains to all sequences in your dataset. I
would then parse the HMMER output into the same format as the above, and
use the same approach following that.

Hope this helps,

Yours sincerely,

Kristoffer Forslund
krifo at sbc.su.se


From lincoln.stein at gmail.com  Fri Apr 18 19:16:19 2008
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Fri, 18 Apr 2008 15:16:19 -0400
Subject: [Bioperl-l] [Gmod-gbrowse] accessing params for custom glyphs?
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06C75E14@imail.agresearch.co.nz>
References: <130971.67684.qm@web46007.mail.sp1.yahoo.com>
	<D5DBA313349A4B458528BE63B387F36C06C75E14@imail.agresearch.co.nz>
Message-ID: <6dce9a0b0804181216q6564e580u8a805ae96c78df2e@mail.gmail.com>

Hi Russell,

It's very simple:

   my $params = $self->option('new_parameter');
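
A slightly fuller sketch of where that call could sit, assuming the
glyph lives in Bio/Graphics/Glyph/my_glyph.pm and inherits from the
generic glyph (both are assumptions about Russell's setup; the
'default' fallback is invented for the example). option() is given the
track setting's name without the leading dash used in add_track().

```perl
package Bio::Graphics::Glyph::my_glyph;
use strict;
use warnings;
use base 'Bio::Graphics::Glyph::generic';

sub draw_component {
    my $self = shift;
    my $gd   = shift;
    my ($x1, $y1, $x2, $y2) = $self->bounds(@_);
    my $fg = $self->fgcolor;

    # Value passed as -new_parameter => "test" in add_track()
    my $params = $self->option('new_parameter');
    $params = 'default' unless defined $params;   # hypothetical fallback

    # Draw an X across the feature's bounding box, as in the original post
    $gd->line($x1, $y1, $x2, $y2, $fg);
    $gd->line($x1, $y2, $x2, $y1, $fg);
}

1;
```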

Lincoln

On Thu, Apr 17, 2008 at 9:39 PM, Smithies, Russell <
Russell.Smithies at agresearch.co.nz> wrote:

> This is probably more of a Perl OO problem I'm having, but can anyone
> tell me how to access a parameter when I create a custom glyph?
>
> I've created a panel in the usual way then I add a feature with
> 'my_glyph' and want to pass the value of -new_parameter to the glyph
> drawing code.
>
>    $panel->add_track( $feature,
>                        -font => gdSmallFont,
>                        -glyph => 'my_glyph' ,
>                        -height => 10,
>                                -label  => 1,
>                                -strand => "forward",
>                                -new_parameter => "test",
>
>
> In my_glyph.pm, I have the usual draw_component sub:
>
> sub draw_component {
>  my $self = shift;
>  my $gd = shift;
>  my ($x1,$y1,$x2,$y2) = $self->bounds(@_);
>  my $fg = $self->fgcolor;
>  my $params = $self->??????????   <<--- how do I access the value of
> "new_parameter" set in the panel drawing code?
>
>  $gd->line($x1,$y1,$x2,$y2,$fg);
>  $gd->line($x1,$y2,$x2,$y1,$fg);
>
> }
>
> Any ideas?
>
> Thanx,
>
> Russell Smithies


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From jason at bioperl.org  Sat Apr 19 02:35:10 2008
From: jason at bioperl.org (Jason Stajich)
Date: Fri, 18 Apr 2008 19:35:10 -0700
Subject: [Bioperl-l] index::abstract on win and unix
In-Reply-To: <1208381947.16620.6.camel@kiss-laptop>
References: <1208366718.19084.15.camel@kiss-laptop>
	<D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
	<1208381947.16620.6.camel@kiss-laptop>
Message-ID: <A30B8E06-131C-445F-B692-92CAB845B13B@bioperl.org>

do you want the LOCUS or the ACCESSION?
Do you mean the result is the completely wrong record or just the  
wrong field?
accession number is available from the seq's accession_number() method.
-jason
On Apr 16, 2008, at 2:39 PM, Frédéric Romagné wrote:

> Well, if with input file you mean the database used, it's created
> with Bio::Index::GenBank from a ncbi FTP's genbank file.
>
> $id is an accession number read from a file but i chomp the line...
>
> I am trying to install the svn version of bioperl under windows to see
> if there is an improvement.
>
> On Thursday, 17 April 2008 at 08:49 +1200, Smithies, Russell wrote:
>> Did you check the format of your input file?
>> i.e. DOS or UNIX line endings?
>>
>>> -----Original Message-----
>>> From: bioperl-l-bounces at lists.open-bio.org
>>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Frédéric Romagné
>>> Sent: Thursday, 17 April 2008 5:25 a.m.
>>> To: bioperl-l at lists.open-bio.org
>>> Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
>>>
>>> Hello,
>>> i made a program which use Bio::Index::GenBank and i tested it under
>>> unix, that worked well.
>>>
>>> But i have to launch it under windows and it seems not to work on.
>>>
>>> Here is the problem :
>>>
>>> my $dbobj = Bio::Index::Abstract->new("Data/$db");
>>> my $seq = $dbobj->get_Seq_by_acc($id);
>>> print $seq->display_id."\n";
>>>
>>> did not print the same number than $id !!! So i don't work on the
>>> sequence expected...
>>>
>>> I use the SVN sources on unix and the Perl package manager for
>>> windows...
>>>
>>> Thanks.
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>


From bioperlanand at yahoo.com  Mon Apr 21 07:44:00 2008
From: bioperlanand at yahoo.com (Anand Venkatraman)
Date: Mon, 21 Apr 2008 00:44:00 -0700 (PDT)
Subject: [Bioperl-l] a question on obtaining HTML formatted Blast output
	along with the Blast hits image
Message-ID: <372845.37134.qm@web36808.mail.mud.yahoo.com>


 Hi everybody,

I would like to obtain an HTML-formatted BLAST report along with a picture of the BLAST hits, as shown on Slide 60 in this pdf: http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf

I have gotten the HTML output working using "Bio::SearchIO::Writer::HTMLResultWriter".

My question: how do I integrate it with Bio::Graphics to render the BLAST hits image at the correct position in my Bioperl-reformatted HTML file?

I ultimately want to be able to display my blast output files on a browser. 

Here is my code so far:
----------------------------------------------------------------
#!/usr/bin/perl
# usage: $0 <blast_report>
use strict;
use warnings;
use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;

# $! is not set when no argument is given, so print a usage line instead
my $infile = shift or die "usage: $0 <blast_report>\n";

# Parse the BLAST report
my $searchio = Bio::SearchIO->new( -format => 'blast', -file => $infile );

# Write the first result back out as HTML
my $writerhtml = Bio::SearchIO::Writer::HTMLResultWriter->new();
my $outhtml    = Bio::SearchIO->new( -writer => $writerhtml,
                                     -file   => ">${infile}.html" );

$outhtml->write_result( $searchio->next_result );
----------------------------------------------------------------

Thanks in advance,

Anand




From cjfields at uiuc.edu  Mon Apr 21 15:07:17 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 21 Apr 2008 10:07:17 -0500
Subject: [Bioperl-l] [Proposed change] HSP::frame()
Message-ID: <ACE26E05-7C02-46E3-B973-E0529C0A0DEA@uiuc.edu>

I have noticed (in relation to bug 2485, http://bugzilla.open-bio.org/show_bug.cgi?id=2485) 
  that the Bio::Search::HSP::GenericHSP frame() method is implemented  
very differently from strand(), start(), end(), and most other HSP  
methods.  The current behavior is to return an array of two values  
(query and hit frame) under list conditions, the query frame if one  
value is passed, and the subject frame if no value is passed under  
scalar context and both under list context.  The latter behavior is  
unfortunately leading to the aforementioned bug.  The method is
also implied to be a getter/setter, but the implementation doesn't  
allow that; it always sets to the instantiated values (in fact,  
repeatedly so).

In order to fix that and make the interface more consistent I am  
changing frame() to behave like strand(), etc., in that the first  
argument is 'query/subject/hit/list' (default = 'query' if no arg  
specified) and the rest optional values for setting, in query/subject  
order.
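
To make the proposed calling convention concrete, here is a toy sketch
of the dispatch. It is not the GenericHSP code and not the patch being
committed: the ToyHSP class and its hash keys are invented for
illustration, and it only mimics what the paragraph above describes
(first argument selects query/subject/hit/list, defaulting to query;
any further arguments set the frames in query/subject order).

```perl
#!/usr/bin/perl
use strict;
use warnings;

package ToyHSP;    # hypothetical stand-in, NOT Bio::Search::HSP::GenericHSP

sub new {
    my ($class, %args) = @_;
    return bless { query => $args{query}, hit => $args{hit} }, $class;
}

# frame($which, [$query_frame, [$hit_frame]])
sub frame {
    my ($self, $which, $qframe, $hframe) = @_;
    $which = 'query' unless defined $which;     # proposed default
    $which = 'hit' if $which eq 'subject';      # treat 'subject' as 'hit'
    die "unknown frame type '$which'\n"
        unless $which eq 'query' || $which eq 'hit' || $which eq 'list';

    # Optional setter values, in query/subject order
    $self->{query} = $qframe if defined $qframe;
    $self->{hit}   = $hframe if defined $hframe;

    return ($self->{query}, $self->{hit}) if $which eq 'list';
    return $self->{$which};
}

package main;

my $hsp = ToyHSP->new(query => 1, hit => -2);
print "query frame: ", $hsp->frame(), "\n";
print "hit frame:   ", $hsp->frame('hit'), "\n";
my ($q, $h) = $hsp->frame('list', 0, 2);        # set both, then read both
print "after set:   $q $h\n";
```

The point of the sketch is that the return value depends only on the
first argument, never on scalar versus list context.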

One issue: I can catch and imitate most of the older behavior with a  
few additional checks, the one exception being the old frame() default  
return value which is now 'query' (not context-dependent).  If needed  
we can change the default to 'hit', but I believe method consistency  
is probably the better route, and I can always add a warning under old  
API circumstances indicating the change.

I am also modifying HSPTableWriter to print frame_hit and frame_query  
(previously it was only printing 'frame', which implied the hit  
frame).  I can see this being an issue with anyone expecting 'frame'  
instead of 'frame_hit';  I could hack in a fix for that if needed.

If there aren't any objections or suggestions, I'll commit this in the  
next day or two.

chris


From cjfields at uiuc.edu  Mon Apr 21 15:32:59 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 21 Apr 2008 10:32:59 -0500
Subject: [Bioperl-l] Assembly.t test fails
Message-ID: <ABC6AB22-0AFD-4977-97DD-E2AE507E0330@uiuc.edu>

I'm getting some significant test failures in bioperl-live for  
Bio::Assembly:

t/Assembly......
1..35
ok 1 - use Bio::Assembly::IO;
ok 2 - The object isa Bio::Assembly::IO
ok 3 - The object isa Bio::Assembly::Scaffold
ok 4
not ok 5
ok 6 - The object isa Bio::AnnotationCollectionI
ok 7 - no annotations in Annotation collection?
ok 8

#   Failed test at t/Assembly.t line 35.
#          got: 'NoName'
#     expected: 'test'
Can't locate object method "get_contig_seq_ids" via package  
"Bio::Assembly::Contig" at /Users/cjfields/bioperl/bioperl-live/blib/ 
lib/Bio/Assembly/Scaffold.pm line 189, <GEN0> line 733.
# Looks like you planned 35 tests but only ran 8.
# Looks like you failed 1 test of 8 run.
# Looks like your test died just after 8.
  Dubious, test returned 255 (wstat 65280, 0xff00)
  Failed 28/35 subtests

Test Summary Report
-------------------
t/Assembly.t (Wstat: 65280 Tests: 8 Failed: 1)
   Failed test:  5
   Non-zero exit status: 255
   Parse errors: Bad plan.  You planned 35 tests but ran 8.
Files=1, Tests=8,  0 wallclock secs ( 0.01 usr  0.00 sys +  0.22 cusr   
0.04 csys =  0.27 CPU)
Result: FAIL
Failed 1/1 test programs. 1/8 subtests failed.


chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Mon Apr 21 15:44:21 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 21 Apr 2008 10:44:21 -0500
Subject: [Bioperl-l] Assembly.t test fails
In-Reply-To: <ABC6AB22-0AFD-4977-97DD-E2AE507E0330@uiuc.edu>
References: <ABC6AB22-0AFD-4977-97DD-E2AE507E0330@uiuc.edu>
Message-ID: <2F199628-717E-4F88-85D7-408BD7BBE16D@uiuc.edu>

Scratch that, figured it out (easy fix).

chris

On Apr 21, 2008, at 10:32 AM, Chris Fields wrote:

> I'm getting some significant test failures in bioperl-live for  
> Bio::Assembly:
>
> t/Assembly......
> 1..35
> ok 1 - use Bio::Assembly::IO;
> ok 2 - The object isa Bio::Assembly::IO
> ok 3 - The object isa Bio::Assembly::Scaffold
> ok 4
> not ok 5
> ok 6 - The object isa Bio::AnnotationCollectionI
> ok 7 - no annotations in Annotation collection?
> ok 8
>
> #   Failed test at t/Assembly.t line 35.
> #          got: 'NoName'
> #     expected: 'test'
> Can't locate object method "get_contig_seq_ids" via package  
> "Bio::Assembly::Contig" at /Users/cjfields/bioperl/bioperl-live/blib/ 
> lib/Bio/Assembly/Scaffold.pm line 189, <GEN0> line 733.
> # Looks like you planned 35 tests but only ran 8.
> # Looks like you failed 1 test of 8 run.
> # Looks like your test died just after 8.
> Dubious, test returned 255 (wstat 65280, 0xff00)
> Failed 28/35 subtests
>
> Test Summary Report
> -------------------
> t/Assembly.t (Wstat: 65280 Tests: 8 Failed: 1)
>  Failed test:  5
>  Non-zero exit status: 255
>  Parse errors: Bad plan.  You planned 35 tests but ran 8.
> Files=1, Tests=8,  0 wallclock secs ( 0.01 usr  0.00 sys +  0.22  
> cusr  0.04 csys =  0.27 CPU)
> Result: FAIL
> Failed 1/1 test programs. 1/8 subtests failed.
>
>
> chris
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From frederic.romagne at gmail.com  Mon Apr 21 15:53:11 2008
From: frederic.romagne at gmail.com (=?ISO-8859-1?Q?Fr=E9d=E9ric_Romagn=E9?=)
Date: Mon, 21 Apr 2008 10:53:11 -0500
Subject: [Bioperl-l] index::abstract on win and unix
In-Reply-To: <A30B8E06-131C-445F-B692-92CAB845B13B@bioperl.org>
References: <1208366718.19084.15.camel@kiss-laptop>
	<D5DBA313349A4B458528BE63B387F36C06BEAFF2@imail.agresearch.co.nz>
	<1208381947.16620.6.camel@kiss-laptop>
	<A30B8E06-131C-445F-B692-92CAB845B13B@bioperl.org>
Message-ID: <1208793191.25906.9.camel@kiss-laptop>

In fact, I want the whole Bio::Seq object, but I verified that the
ACCESSION and the LOCUS are the same in my GenBank files.
I saw that the program sometimes reports that it cannot find the entry:

    if ( !defined $seq ) {
        warn("Sequence $id in Database $db is not present\n");
    }

I suspect that it is the make_index function, rather than the
get_Seq_by_acc function, that does not work properly on Windows...
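
One cheap thing to rule out first, given the earlier question about DOS
versus UNIX line endings: chomp only removes the current input record
separator ("\n" by default), so an ID read on unix from a DOS-format
file keeps a trailing "\r" and will not match any index key. A minimal
sketch follows; the sub name clean_id and the sample accession are made
up for the example, and this says nothing about whether make_index
itself is at fault.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Strip leading/trailing whitespace, including any "\r" or "\n".
sub clean_id {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;
    return $line;
}

# "AB000001\r\n" is what a DOS-format line looks like when read on unix
for my $raw ("AB000001\n", "AB000001\r\n", " AB000001 \r\n") {
    my $chomped = $raw;
    chomp $chomped;                    # removes only the trailing "\n"
    printf "chomp: %d chars, clean_id: %d chars\n",
        length($chomped), length(clean_id($raw));
}
```

If the lengths differ for your real ID file, the lookups are failing on
the ID string rather than in the index.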

On Friday, 18 April 2008 at 19:35 -0700, Jason Stajich wrote:
> do you want the LOCUS or the ACCESSION?
> Do you mean the result is the completely wrong record or just the  
> wrong field?
> accession number is available from the seq's accession_number() method.
> -jason
> On Apr 16, 2008, at 2:39 PM, Frédéric Romagné wrote:
> 
> > Well, if with input file you mean the database used, it's created
> > with Bio::Index::GenBank from a ncbi FTP's genbank file.
> >
> > $id is an accession number read from a file but i chomp the line...
> >
> > I am trying to install the svn version of bioperl under windows to see
> > if there is an improvement.
> >
> > On Thursday, 17 April 2008 at 08:49 +1200, Smithies, Russell wrote:
> >> Did you check the format of your input file?
> >> i.e. DOS or UNIX line endings?
> >>
> >>> -----Original Message-----
> >>> From: bioperl-l-bounces at lists.open-bio.org
> >>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Frédéric Romagné
> >>> Sent: Thursday, 17 April 2008 5:25 a.m.
> >>> To: bioperl-l at lists.open-bio.org
> >>> Subject: [Bioperl-l] [bioperl-l] index::abstract on win and unix
> >>>
> >>> Hello,
> >>> i made a program which use Bio::Index::GenBank and i tested it under
> >>> unix, that worked well.
> >>>
> >>> But i have to launch it under windows and it seems not to work on.
> >>>
> >>> Here is the problem :
> >>>
> >>> my $dbobj = Bio::Index::Abstract->new("Data/$db");
> >>> my $seq = $dbobj->get_Seq_by_acc($id);
> >>> print $seq->display_id."\n";
> >>>
> >>> did not print the same number than $id !!! So i don't work on the
> >>> sequence expected...
> >>>
> >>> I use the SVN sources on unix and the Perl package manager for
> >>> windows...
> >>>
> >>> Thanks.
> >>>


From ewijaya at gmail.com  Tue Apr 22 14:03:07 2008
From: ewijaya at gmail.com (Edward Wijaya)
Date: Tue, 22 Apr 2008 22:03:07 +0800
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
Message-ID: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>

Hi,

Is there any module that can parse the following output
of BLAT? This is taken from the UCSC browser.

The idea is to parse it and then extract the conserved block
of aligned sequences.


__DATA__
Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
B D   D. melanogaster
tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
B D       D. simulans
tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
B D      D. sechellia
tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
B D         D. yakuba
tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
            D. erecta
tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
         D. ananassae
taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
     D. pseudoobscura
tata----ccagtacac-cttatatg------------tttttaaata--------------------
B D     D. persimilis
tata----ccagtacac-attatatg------------tttttaaata--------------------
        D. willistoni
aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
           D. virilis
-------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
        D. mojavensis
-------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
         D. grimshawi
====================================================================
         T. castaneum
====================================================================

Inserts between block 3 and 4 in window
    D. pseudoobscura 2008bp
B D    D. persimilis 1421bp
          D. virilis 5bp
       D. mojavensis 4640bp

Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
B D   D. melanogaster
----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
B D       D. simulans
----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
B D      D. sechellia
----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
B D         D. yakuba
----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
            D. erecta
----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
     D. pseudoobscura
====================================================================
B D     D. persimilis
====================================================================
        D. willistoni
----aggattacgaagttcctttat-------------------aaag--------------------
           D. virilis
gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
        D. mojavensis
====================================================================
         D. grimshawi
====================================================================
         T. castaneum
====================================================================

__ END__


From cjfields at uiuc.edu  Tue Apr 22 14:22:45 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 22 Apr 2008 09:22:45 -0500
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
In-Reply-To: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
References: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
Message-ID: <766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>

A quick grep of bioperl-live gets me Bio::SearchIO::blast,  
Bio::SearchIO::axt, Bio::SearchIO::psl, Bio::Tools::Blat, and  
Bio::Tools::WebBlat.  Haven't looked at the docs but it's a start!

chris

On Apr 22, 2008, at 9:03 AM, Edward Wijaya wrote:

> Hi,
>
> Is there any module that can parse the following output
> of BLAT. This is taken from UCSC browser.
>
> The idea is to parse it and then extract the conserved block
> of aligned sequences.
>
>
> __DATA__
> Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
> B D   D. melanogaster
> tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
> B D       D. simulans
> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
> B D      D. sechellia
> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
> B D         D. yakuba
> tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
>            D. erecta
> tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
>         D. ananassae
> taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
>     D. pseudoobscura
> tata----ccagtacac-cttatatg------------tttttaaata--------------------
> B D     D. persimilis
> tata----ccagtacac-attatatg------------tttttaaata--------------------
>        D. willistoni
> aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
>           D. virilis
> -------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
>        D. mojavensis
> -------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
>         D. grimshawi
> ====================================================================
>         T. castaneum
> ====================================================================
>
> Inserts between block 3 and 4 in window
>    D. pseudoobscura 2008bp
> B D    D. persimilis 1421bp
>          D. virilis 5bp
>       D. mojavensis 4640bp
>
> Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
> B D   D. melanogaster
> ----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
> B D       D. simulans
> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
> B D      D. sechellia
> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
> B D         D. yakuba
> ----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
>            D. erecta
> ----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
>     D. pseudoobscura
> ====================================================================
> B D     D. persimilis
> ====================================================================
>        D. willistoni
> ----aggattacgaagttcctttat-------------------aaag--------------------
>           D. virilis
> gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
>        D. mojavensis
> ====================================================================
>         D. grimshawi
> ====================================================================
>         T. castaneum
> ====================================================================
>
> __ END__

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From jason at bioperl.org  Tue Apr 22 18:49:32 2008
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Apr 2008 11:49:32 -0700
Subject: [Bioperl-l] Fwd: [blast-announce] New BLAST URL available at the
	NCBI
References: <EEEED756EF6626469B10653F745014389BAEAD@NIHCESMLBX15.nih.gov>
Message-ID: <F63EB743-F1FF-4612-B7D6-0EA1F73F487C@bioperl.org>

Does anyone want to take a look at how to use these URLs in the  
RemoteBlast module, if the interface is the same?

-jason

Begin forwarded message:

> From: "Mcginnis, Scott (NIH/NLM/NCBI) [E]" <mcginnis at ncbi.nlm.nih.gov>
> Date: April 22, 2008 11:35:04 AM PDT
> To: <blast-announce at ncbi.nlm.nih.gov>
> Subject: [blast-announce] New BLAST URL available at the NCBI
>
> New BLAST URL available at the NCBI
>
>
>
> The NCBI has activated a new URL for BLAST searches at the NCBI:
> http://blast.ncbi.nlm.nih.gov.
>
>
>
> Searches sent to this URL can take advantage of a larger number of
> machines for searches and the system has a better overall fault
> tolerance.
>
>
>
> We recommend migration of all BLAST links and bookmarks (e.g.,
> http://www.ncbi.nlm.nih.gov/BLAST/ and
> http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) to the new URL.
>
>
>
> Links on the NCBI and BLAST home pages will start to change in the
> coming weeks.
>
>
>
> At this point in time the plans are to also maintain the current BLAST
> URL.
>
>
>
>
>


From jason at bioperl.org  Tue Apr 22 18:51:08 2008
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Apr 2008 11:51:08 -0700
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
In-Reply-To: <766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>
References: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
	<766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>
Message-ID: <6C812413-B375-427B-9AF8-5A0AA6167CC8@bioperl.org>

If you get it as axt it should parse fine in SearchIO, but that is  
pairwise. If you can get the alignment blocks instead, I can't remember  
what format UCSC provides them in.
MSAs are going to be better handled through Bio::AlignIO though, so it  
might be better to build a parser on that.
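
Until such a parser exists, the pasted text itself can be read with plain Perl. The following is a minimal sketch, not a BioPerl module: parse_blocks and conserved_columns are hypothetical names, the layout (a label line followed by a row line, as wrapped in this thread) is taken from the posted sample only, and rows made up entirely of '=' are skipped on the assumption that they mark species with no aligned sequence in the block. The demo rows are shortened copies of the posted ones.

```perl
use strict;
use warnings;

# Parse pasted UCSC-style alignment text into a list of blocks, each a hash:
#   { number, start, end, rows => [ [species, aligned_row], ... ] }
sub parse_blocks {
    my ($fh) = @_;
    my (@blocks, $block, $species);
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;    # trim both ends (this also drops the newline)
        next if $line eq '';
        if ($line =~ /^Alignment block (\d+) of \d+ in window, (\d+) - (\d+), \d+ bps$/) {
            $block = { number => $1, start => $2, end => $3, rows => [] };
            push @blocks, $block;
            undef $species;
        }
        elsif ($line =~ /^Inserts between block/) {
            undef $block;           # ignore insert-size lines until the next block
            undef $species;
        }
        elsif (!$block) {
            next;                   # anything outside a block is skipped
        }
        elsif (defined $species) {
            # The line after a label is that species' row; all-'=' rows are skipped.
            push @{ $block->{rows} }, [ $species, $line ] unless $line =~ /^=+$/;
            undef $species;
        }
        elsif ($line =~ /^(?:B D\s+)?([A-Z]\. [a-z]+)$/) {
            $species = $1;          # label line, with or without the "B D" prefix
        }
    }
    return @blocks;
}

# Return the 0-based columns where every kept row has the same non-gap character.
sub conserved_columns {
    my ($block) = @_;
    my @seqs = map { lc $_->[1] } @{ $block->{rows} };
    return () unless @seqs;
    my ($len) = sort { $a <=> $b } map { length } @seqs;    # shortest row
    my @cols;
    for my $i (0 .. $len - 1) {
        my %seen = map { substr($_, $i, 1) => 1 } @seqs;
        my ($char) = keys %seen;
        push @cols, $i if keys(%seen) == 1 && $char ne '-';
    }
    return @cols;
}

my $text = <<'END';
Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
B D   D. melanogaster
tgtg----tatttatgt-tttaaataaaggt
B D       D. simulans
tgtg----tatttatgt-tttaaataaaggt
        D. grimshawi
===============================
END

open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
for my $block (parse_blocks($fh)) {
    my @cols = conserved_columns($block);
    printf "block %d (%d-%d): %d aligned rows, %d conserved columns\n",
        $block->{number}, $block->{start}, $block->{end},
        scalar @{ $block->{rows} }, scalar @cols;
}
```

The rows come back as plain strings, so they could later be handed to whatever alignment object a Bio::AlignIO-based parser would build.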

On Apr 22, 2008, at 7:22 AM, Chris Fields wrote:

> A quick grep of bioperl-live gets me Bio::SearchIO::blast,  
> Bio::SearchIO::axt, Bio::SearchIO::psl, Bio::Tools::Blat, and  
> Bio::Tools::WebBlat.  Haven't looked at the docs but it's a start!
>
> chris
>
> On Apr 22, 2008, at 9:03 AM, Edward Wijaya wrote:
>
>> Hi,
>>
>> Is there any module that can parse the following output
>> of BLAT. This is taken from UCSC browser.
>>
>> The idea is to parse it and then extract the conserved block
>> of aligned sequences.
>>
>>
>> __DATA__
>> Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
>> B D   D. melanogaster
>> tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
>> B D       D. simulans
>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
>> B D      D. sechellia
>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
>> B D         D. yakuba
>> tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
>>            D. erecta
>> tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
>>         D. ananassae
>> taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
>>     D. pseudoobscura
>> tata----ccagtacac-cttatatg------------tttttaaata--------------------
>> B D     D. persimilis
>> tata----ccagtacac-attatatg------------tttttaaata--------------------
>>        D. willistoni
>> aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
>>           D. virilis
>> -------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
>>        D. mojavensis
>> -------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
>>         D. grimshawi
>> ====================================================================
>>         T. castaneum
>> ====================================================================
>>
>> Inserts between block 3 and 4 in window
>>    D. pseudoobscura 2008bp
>> B D    D. persimilis 1421bp
>>          D. virilis 5bp
>>       D. mojavensis 4640bp
>>
>> Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
>> B D   D. melanogaster
>> ----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
>> B D       D. simulans
>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>> B D      D. sechellia
>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>> B D         D. yakuba
>> ----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
>>            D. erecta
>> ----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
>>     D. pseudoobscura
>> ====================================================================
>> B D     D. persimilis
>> ====================================================================
>>        D. willistoni
>> ----aggattacgaagttcctttat-------------------aaag--------------------
>>           D. virilis
>> gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
>>        D. mojavensis
>> ====================================================================
>>         D. grimshawi
>> ====================================================================
>>         T. castaneum
>> ====================================================================
>>
>> __ END__
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l


From cjfields at uiuc.edu  Tue Apr 22 19:02:14 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 22 Apr 2008 14:02:14 -0500
Subject: [Bioperl-l] Fwd: [blast-announce] New BLAST URL available at
	the NCBI
In-Reply-To: <F63EB743-F1FF-4612-B7D6-0EA1F73F487C@bioperl.org>
References: <EEEED756EF6626469B10653F745014389BAEAD@NIHCESMLBX15.nih.gov>
	<F63EB743-F1FF-4612-B7D6-0EA1F73F487C@bioperl.org>
Message-ID: <13C2AD96-8297-40DD-ADCC-B2BEC923B9E0@uiuc.edu>

They work exactly the same as the old URL, at least on the surface; I  
haven't tried changing many URLAPI parameters.  I went ahead and  
changed the URL in RemoteBlast to http://blast.ncbi.nlm.nih.gov/Blast.cgi 
  as it works with RemoteBlast.t.

chris

On Apr 22, 2008, at 1:49 PM, Jason Stajich wrote:

> Does anyone want to take a look at how to use these URLs in the  
> RemoteBlast module, if the interface is the same?
>
> -jason
>
> Begin forwarded message:
>
>> From: "Mcginnis, Scott (NIH/NLM/NCBI) [E]"  
>> <mcginnis at ncbi.nlm.nih.gov>
>> Date: April 22, 2008 11:35:04 AM PDT
>> To: <blast-announce at ncbi.nlm.nih.gov>
>> Subject: [blast-announce] New BLAST URL available at the NCBI
>>
>> New BLAST URL available at the NCBI
>>
>>
>>
>> The NCBI has activated a new URL for BLAST searches at the NCBI:
>> http://blast.ncbi.nlm.nih.gov.
>>
>>
>>
>> Searches sent to this URL can take advantage of a larger number of
>> machines for searches and the system has a better overall fault
>> tolerance.
>>
>>
>>
>> We recommend migration of all BLAST links and bookmarks (e.g.,
>> http://www.ncbi.nlm.nih.gov/BLAST/ and
>> http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) to the new URL.
>>
>>
>>
>> Links on the NCBI and BLAST home pages will start to change in the
>> coming weeks.
>>
>>
>>
>> At this point in time the plans are to also maintain the current  
>> BLAST
>> URL.
>>
>>
>>
>>
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Apr 22 18:58:40 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 22 Apr 2008 13:58:40 -0500
Subject: [Bioperl-l] BioPerl Module to Parse BLAT alignment output
In-Reply-To: <6C812413-B375-427B-9AF8-5A0AA6167CC8@bioperl.org>
References: <3521d3670804220703u4d8565c8q604036727aedf0a8@mail.gmail.com>
	<766FDF9E-9F7B-4826-B7FA-87DF3B074EBC@uiuc.edu>
	<6C812413-B375-427B-9AF8-5A0AA6167CC8@bioperl.org>
Message-ID: <43344C89-6B4D-4360-AF56-A6FDD065FFF3@uiuc.edu>

Related to that, I have thought about building a parser for some of  
the query-anchored alignments produced by blastall, just haven't had  
time to devote to it.  One of these days...

chris

On Apr 22, 2008, at 1:51 PM, Jason Stajich wrote:

> if you get it as axt it should parse fine in SearchIO but that is  
> pairwise, if you can get an alignment blocks I can't remember what  
> format this is from UCSC.
> MSAs are going to be better handed through Bio::AlignIO though so it  
> might be better to build a parser on that.
>
> On Apr 22, 2008, at 7:22 AM, Chris Fields wrote:
>
>> A quick grep of bioperl-live gets me Bio::SearchIO::blast,  
>> Bio::SearchIO::axt, Bio::SearchIO::psl, Bio::Tools::Blat, and  
>> Bio::Tools::WebBlat.  Haven't looked at the docs but it's a start!
>>
>> chris
>>
>> On Apr 22, 2008, at 9:03 AM, Edward Wijaya wrote:
>>
>>> Hi,
>>>
>>> Is there any module that can parse the following output
>>> of BLAT. This is taken from UCSC browser.
>>>
>>> The idea is to parse it and then extract the conserved block
>>> of aligned sequences.
>>>
>>>
>>> __DATA__
>>> Alignment block 3 of 135 in window, 5860248 - 5860300, 53 bps
>>> B D   D. melanogaster
>>> tgtg----tatttatgt-tttaaataaaggt-------tttctaaata---cgaaatttcaaatttaa
>>> B D       D. simulans
>>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cgcaattttaaatttaa
>>> B D      D. sechellia
>>> tgtg----tatttatgt-tttaaataaaggt-------tttttaaata---cccaattttaaatttaa
>>> B D         D. yakuba
>>> tgtg----tatttatgt-tcttaataaaggt-------ttcctaaataa-ttcaaaatttaaattaaa
>>>           D. erecta
>>> tgtg----tgtttatgt-ttttaataaaggt-------tttctaaataa--tcgaaattcatttcaaa
>>>        D. ananassae
>>> taag----tttttatgtattttaaaatatag-------aaaataaata---aaaaaaattgaact---
>>>    D. pseudoobscura
>>> tata----ccagtacac-cttatatg------------tttttaaata--------------------
>>> B D     D. persimilis
>>> tata----ccagtacac-attatatg------------tttttaaata--------------------
>>>       D. willistoni
>>> aaaaaagttatttgaat-ttggaata------------taccaaaacatgttggaaatt------gaa
>>>          D. virilis
>>> -------------gatt-ttataataaaattgcgctaatttctaa------------tttacgttaaa
>>>       D. mojavensis
>>> -------------tagt-ccttaatataaatataatattaaataaata-------cttttaagttaaa
>>>        D. grimshawi
>>> ====================================================================
>>>        T. castaneum
>>> ====================================================================
>>>
>>> Inserts between block 3 and 4 in window
>>>   D. pseudoobscura 2008bp
>>> B D    D. persimilis 1421bp
>>>         D. virilis 5bp
>>>      D. mojavensis 4640bp
>>>
>>> Alignment block 4 of 135 in window, 5860301 - 5860344, 44 bps
>>> B D   D. melanogaster
>>> ----tgggtagcagcgttgccagat--------------------aaagggacatgtttactggctga
>>> B D       D. simulans
>>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>>> B D      D. sechellia
>>> ----tgggaagcagcgttgccagat-------------------gaaacgggcatgtttgcaggctga
>>> B D         D. yakuba
>>> ----tgagtaccaatgctgccagat-------------ctttgtaaagcggtaatgtttgctggctga
>>>           D. erecta
>>> ----t-----ttaatgttgccagat-------------ctgcgtaaggcgctcatgttggctggctga
>>>    D. pseudoobscura
>>> ====================================================================
>>> B D     D. persimilis
>>> ====================================================================
>>>       D. willistoni
>>> ----aggattacgaagttcctttat-------------------aaag--------------------
>>>          D. virilis
>>> gactagtttaatatctcagcccgttaagctaactgttactttttacagtattcgcgccattttgc---
>>>       D. mojavensis
>>> ====================================================================
>>>        D. grimshawi
>>> ====================================================================
>>>        T. castaneum
>>> ====================================================================
>>>
>>> __ END__
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Christopher Fields
>> Postdoctoral Researcher
>> Lab of Dr. Robert Switzer
>> Dept of Biochemistry
>> University of Illinois Urbana-Champaign
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bioperlanand at yahoo.com  Wed Apr 23 06:02:30 2008
From: bioperlanand at yahoo.com (Anand Venkatraman)
Date: Tue, 22 Apr 2008 23:02:30 -0700 (PDT)
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
Message-ID: <946658.12337.qm@web36802.mail.mud.yahoo.com>

Hi everybody,

I would like to use Bio::Graphics in conjunction with Bio::SearchIO::Writer::HTMLResultWriter to obtain a HTML formatted blast report output along with an image of the blast hits as shown on Slide 60 in this pdf: http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf

I am able to get the HTML output using  "Bio::SearchIO::Writer::HTMLResultWriter" and I am able to get the image using the examples outlined in the Bio::Graphics HOWTO: http://www.bioperl.org/wiki/HOWTO:Graphics

My question: How do I integrate Bio::Graphics with Bio::SearchIO::Writer::HTMLResultWriter to render the blast hits image at the correct position in my BioPerl reformatted html file?

I also found that someone else asked something similar; it is listed under the "Orphans, Leftovers" category in the ListSummary:April 26-May 9,2006 document:
http://www.bioperl.org/wiki/ListSummary:April_26-May_9%2C2006#Orphans.2C_Leftovers

Here is my code so far:
----------------------------------------------------------------
#!/usr/bin/perl -w
# usage: $0 <blast_report>
use strict;
use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;

my $infile = shift or die "usage: $0 <blast_report>\n";

# Read the BLAST report and write it back out as HTML.
my $searchio   = Bio::SearchIO->new(-format => 'blast', -file => $infile);
my $writerhtml = Bio::SearchIO::Writer::HTMLResultWriter->new();
my $outhtml    = Bio::SearchIO->new(-writer => $writerhtml,
                                    -file   => ">${infile}.html");

$outhtml->write_result($searchio->next_result);
----------------------------------------------------------------

Thanks in advance,

Anand

       


From jason at bioperl.org  Wed Apr 23 06:15:28 2008
From: jason at bioperl.org (Jason Stajich)
Date: Tue, 22 Apr 2008 23:15:28 -0700
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
In-Reply-To: <946658.12337.qm@web36802.mail.mud.yahoo.com>
References: <946658.12337.qm@web36802.mail.mud.yahoo.com>
Message-ID: <952B0A4E-8A14-4E8E-B36D-14596B20E330@bioperl.org>


Basically you want to inject your own IMG tags into the file with  
these routines:

     $writerhtml->start_report(\&my_start_report);
     $writerhtml->title(\&my_title);
     $writerhtml->hit_link_align(\&my_hit_link_align);
     $writerhtml->hit_link_desc(\&my_hit_link_desc);

fgblast shows a way to do this in part. It relies on Gbrowse to  
generate the image but you can replace the gbrowse_img reference to  
your own image generating software.

http://people.genome.duke.edu/~jes12/software/scripts/fgblast
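
As an illustration of the callback idea, here is a minimal sketch. It assumes the title callback is handed the result object and returns an HTML string. my_title and image_name_for are hypothetical names, and the "one PNG per query name" convention is an assumption of this sketch, not something the writer requires. The BioPerl wiring at the end only runs when a report file is given and the modules can be loaded.

```perl
use strict;
use warnings;

# Hypothetical helper: map a query name to the image file drawn for it.
sub image_name_for {
    my ($query_name) = @_;
    (my $safe = $query_name) =~ s/[^\w.-]/_/g;    # keep the file name filesystem-safe
    return "$safe.png";
}

# Title callback: takes the result object, returns the HTML for the top of
# the report, here a heading followed by an IMG tag for the hit overview.
sub my_title {
    my ($result) = @_;
    my $name = $result->query_name;
    (my $label = $name) =~ s/&/&amp;/g;           # minimal HTML escaping
    $label =~ s/</&lt;/g;
    $label =~ s/>/&gt;/g;
    return sprintf qq{<h1>BLAST report for %s</h1>\n<p><img src="%s" alt="BLAST hit overview"></p>\n},
        $label, image_name_for($name);
}

# Wiring, following the script posted in this thread.
if (@ARGV && eval { require Bio::SearchIO; require Bio::SearchIO::Writer::HTMLResultWriter; 1 }) {
    my $infile = shift @ARGV;
    my $in     = Bio::SearchIO->new(-format => 'blast', -file => $infile);
    my $writer = Bio::SearchIO::Writer::HTMLResultWriter->new();
    $writer->title(\&my_title);                   # replace the default title block
    my $out = Bio::SearchIO->new(-writer => $writer, -file => ">${infile}.html");
    $out->write_result($in->next_result);
}
```

The image file itself still has to be produced separately (with Bio::Graphics, for instance) under the name that image_name_for returns.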

-jason
On Apr 22, 2008, at 11:02 PM, Anand Venkatraman wrote:

> Hi everybody,
>
> I would like to use Bio::Graphics in conjunction with
> Bio::SearchIO::Writer::HTMLResultWriter to obtain a HTML formatted
> blast report output along with an image of the blast hits as shown
> on Slide 60 in this pdf:
> http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf
>
> I am able to get the HTML output using   
> "Bio::SearchIO::Writer::HTMLResultWriter" and I am able to get the  
> image using the examples outlined in the Bio::Graphics HOWTO:  
> http://www.bioperl.org/wiki/HOWTO:Graphics
>
> My question: How do I integrate Bio::Graphics with  
> Bio::SearchIO::Writer::HTMLResultWriter to render the blast hits  
> image at the correct position in my BioPerl reformatted html file.
>
> I also found that someone else has asked something similar to  
> whatever I am asking & is listed under the "Orphans, Leftovers"  
> category in the ListSummary:April 26-May 9,2006 document:
> http://www.bioperl.org/wiki/ListSummary:April_26-May_9%2C2006#Orphans.2C_Leftovers
>
> Here is my code so far:
> ----------------------------------------------------------------
> #!/usr/bin/perl -w
> # usage: $0 <blast_report>
> use strict;
> use Bio::SearchIO;
> use Bio::SearchIO::Writer::HTMLResultWriter;
>
> my $infile = shift or die $!;
>
> my $searchio = new Bio::SearchIO( -format => 'blast',-file   => $infile);
> my $writerhtml = new Bio::SearchIO::Writer::HTMLResultWriter();
> my $outhtml = new Bio::SearchIO(-writer => $writerhtml,
>                                                   -file   => ">${infile}.html");
>
> $outhtml->write_result($searchio->next_result);
> ----------------------------------------------------------------
>
> Thanks in advance,
>
> Anand
>


From bamboowarrior at gmail.com  Wed Apr 23 19:39:21 2008
From: bamboowarrior at gmail.com (Arkady)
Date: Wed, 23 Apr 2008 14:39:21 -0500
Subject: [Bioperl-l] WebBlat, where'd it go?
Message-ID: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>

Hi folks,

I'm trying to use BioPerl to run a BLAT search on the four primate
genomes on UCSC. I understand that the proper tool for this is
Bio::Tools::WebBlat. Unfortunately, it doesn't appear to be in my
bioperl distribution (nor do I even know how to figure out what
version that is, unfortunately, though it's a very recent install -- a
month ago?). I also can't find it on CPAN. Is this deprecated? Has
something else replaced it? Or are we always supposed to run local
BLAT?

Thanks.

John Woods

Institute for Cellular and Molecular Biology
The University of Texas at Austin


From spiros at lokku.com  Wed Apr 23 19:48:12 2008
From: spiros at lokku.com (Spiros Denaxas)
Date: Wed, 23 Apr 2008 20:48:12 +0100
Subject: [Bioperl-l] WebBlat, where'd it go?
In-Reply-To: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
References: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
Message-ID: <bba689ec0804231248s47034503y3cbf0512e4344843@mail.gmail.com>

Hey,

a quick look at the list of deprecated modules reveals that it has
indeed been removed,

http://www.bioperl.org/wiki/Deprecated_modules

Spiros

On Wed, Apr 23, 2008 at 8:39 PM, Arkady <bamboowarrior at gmail.com> wrote:
> Hi folks,
>
>  I'm trying to use BioPerl to run a BLAT search on the four primate
>  genomes on UCSC. I understand that the proper tool for this is
>  Bio::Tools::WebBlat. Unfortunately, it doesn't appear to be in my
>  bioperl distribution (nor do I even know how to figure out what
>  version that is, unfortunately, though it's a very recent install -- a
>  month ago?). I also can't find it on CPAN. Is this deprecated? Has
>  something else replaced it? Or are we always supposed to run local
>  BLAT?
>
>  Thanks.
>
>  John Woods
>
>  Institute for Cellular and Molecular Biology
>  The University of Texas at Austin
>  _______________________________________________
>  Bioperl-l mailing list
>  Bioperl-l at lists.open-bio.org
>  http://lists.open-bio.org/mailman/listinfo/bioperl-l
>


From cjfields at uiuc.edu  Wed Apr 23 19:56:14 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 23 Apr 2008 14:56:14 -0500
Subject: [Bioperl-l] WebBlat, where'd it go?
In-Reply-To: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
References: <91656c3f0804231239j159fb9d8q7bae51ba5cbcd442@mail.gmail.com>
Message-ID: <AF7BBBC2-6A6E-486A-872C-8BB8B0A7FC0C@uiuc.edu>

It's no longer maintained (deprecated); see the following for an  
explanation:

http://article.gmane.org/gmane.comp.lang.perl.bio.general/13545

Basically, only local BLAT searches are supported through BioPerl.
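
On the side question of finding out which BioPerl version is installed, one possible check is sketched below. It assumes the installation ships Bio::Root::Version, which the 1.5.x releases do. bioperl_version and has_module are hypothetical helper names.

```perl
use strict;
use warnings;

# Return the installed BioPerl version string, or undef when
# Bio::Root::Version cannot be loaded from @INC.
sub bioperl_version {
    return scalar eval { require Bio::Root::Version; Bio::Root::Version->VERSION };
}

# Return 1 when the named module can be loaded, 0 otherwise.
sub has_module {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

my $version = bioperl_version();
print defined $version ? "BioPerl $version\n" : "BioPerl not found in \@INC\n";
printf "Bio::Tools::WebBlat is %s\n",
    has_module('Bio::Tools::WebBlat') ? 'installed' : 'not installed';
```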

chris

On Apr 23, 2008, at 2:39 PM, Arkady wrote:

> Hi folks,
>
> I'm trying to use BioPerl to run a BLAT search on the four primate
> genomes on UCSC. I understand that the proper tool for this is
> Bio::Tools::WebBlat. Unfortunately, it doesn't appear to be in my
> bioperl distribution (nor do I even know how to figure out what
> version that is, unfortunately, though it's a very recent install -- a
> month ago?). I also can't find it on CPAN. Is this deprecated? Has
> something else replaced it? Or are we always supposed to run local
> BLAT?
>
> Thanks.
>
> John Woods
>
> Institute for Cellular and Molecular Biology
> The University of Texas at Austin
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From bioperlanand at yahoo.com  Wed Apr 23 23:05:27 2008
From: bioperlanand at yahoo.com (Anand Venkatraman)
Date: Wed, 23 Apr 2008 16:05:27 -0700 (PDT)
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
In-Reply-To: <952B0A4E-8A14-4E8E-B36D-14596B20E330@bioperl.org>
Message-ID: <795696.39415.qm@web36804.mail.mud.yahoo.com>

Hi Jason,

Thanks for the reply.

I am a little lost with the solution suggested. Is that how slide 60 in the pdf is obtained: http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf

I guess I am missing something quite obvious, I apologize.

What I have & want is this: I have a directory having say 100 different blast reports & hence I am looking to obtain 100 different bioperl formatted blast html outputs with the respective images just as it would appear in the blast report.

Thanks,

Anand

Jason Stajich <jason at bioperl.org> wrote: 

Basically you want to inject your own IMG tags into the file with these routines:


    $writerhtml->start_report(\&my_start_report);
    $writerhtml->title(\&my_title);
    $writerhtml->hit_link_align(\&my_hit_link_align);
    $writerhtml->hit_link_desc(\&my_hit_link_desc);


fgblast shows a way to do this in part. It relies on Gbrowse to generate the image but you can replace the gbrowse_img reference to your own image generating software.
http://people.genome.duke.edu/~jes12/software/scripts/fgblast


-jason
On Apr 22, 2008, at 11:02 PM, Anand Venkatraman wrote:

Hi everybody,


I would like to use Bio::Graphics in conjunction with Bio::SearchIO::Writer::HTMLResultWriter to obtain a HTML formatted blast report output along with an image of the blast hits as shown on Slide 60 in this pdf: http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf


I am able to get the HTML output using  "Bio::SearchIO::Writer::HTMLResultWriter" and I am able to get the image using the examples outlined in the Bio::Graphics HOWTO: http://www.bioperl.org/wiki/HOWTO:Graphics


My question: How do I integrate Bio::Graphics with Bio::SearchIO::Writer::HTMLResultWriter to render the blast hits image at the correct position in my BioPerl reformatted html file.


I also found that someone else has asked something similar to whatever I am asking & is listed under the "Orphans, Leftovers" category in the ListSummary:April 26-May 9,2006 document: 
http://www.bioperl.org/wiki/ListSummary:April_26-May_9%2C2006#Orphans.2C_Leftovers


Here is my code so far:
----------------------------------------------------------------
#!/usr/bin/perl -w
# usage: $0 <blast_report>
use strict;
use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;


my $infile = shift or die $!;


my $searchio = new Bio::SearchIO( -format => 'blast',-file   => $infile);
my $writerhtml = new Bio::SearchIO::Writer::HTMLResultWriter();
my $outhtml = new Bio::SearchIO(-writer => $writerhtml,
                                                  -file   => ">${infile}.html");


$outhtml->write_result($searchio->next_result);
----------------------------------------------------------------


Thanks in advance,


Anand




From jason at bioperl.org  Thu Apr 24 18:06:41 2008
From: jason at bioperl.org (Jason Stajich)
Date: Thu, 24 Apr 2008 11:06:41 -0700
Subject: [Bioperl-l] Question on integrating Bio::Graphics with
	Bio::SearchIO::Writer::HTMLResultWriter
In-Reply-To: <795696.39415.qm@web36804.mail.mud.yahoo.com>
References: <795696.39415.qm@web36804.mail.mud.yahoo.com>
Message-ID: <D47EBDB9-C15C-44A7-9376-89FA946270DD@bioperl.org>

The overview graphic is generated basically from the script in  
scripts/graphics/search_overview.PLS

So you'd have to run that on each report to generate the graphic,  
then use the other methods  to insert <img src="NAME"> images into  
each rendered HTML report.
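
For a directory of reports, the bookkeeping could look like the sketch below. It only works out which image and HTML file belong to each report. The ".blast" suffix, the name plan_outputs, and the choice to write the PNG next to the report are assumptions of this sketch. The two real steps are left as comments because the command line of the overview script is not covered here.

```perl
use strict;
use warnings;

# For every *.blast report in $dir, return a hash with the report path, the
# image and HTML paths to create, and the IMG tag to put in the HTML.
sub plan_outputs {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @reports = sort grep { /\.blast$/ } readdir $dh;
    closedir $dh;

    my @plan;
    for my $file (@reports) {
        (my $stem = $file) =~ s/\.blast$//;
        push @plan, {
            report  => "$dir/$file",
            image   => "$dir/$stem.png",
            html    => "$dir/$stem.html",
            img_tag => qq{<img src="$stem.png" alt="BLAST hit overview">},
        };
    }
    return @plan;
}

my $dir = shift @ARGV;
if (defined $dir) {
    for my $job (plan_outputs($dir)) {
        # Step 1: draw the overview for $job->{report} into $job->{image}.
        # Step 2: render $job->{report} to $job->{html} with HTMLResultWriter,
        #         using a title callback that returns $job->{img_tag}.
        print join("\t", @{$job}{qw(report image html)}), "\n";
    }
}
```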

-jason

On Apr 23, 2008, at 4:05 PM, Anand Venkatraman wrote:

> Hi Jason,
>
> Thanks for the reply.
>
> I am a little lost with the solution suggested. Is that how slide
> 60 in the pdf is obtained:
> http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf
>
> I guess I am missing something quite obvious, I apologize.
>
> What I have & want is this: I have a directory having say 100  
> different blast reports & hence I am looking to obtain 100  
> different bioperl formatted blast html outputs with the respective  
> images just as it would appear in the blast report.
>
> Thanks,
>
> Anand
>
> Jason Stajich <jason at bioperl.org> wrote:
>
> Basically you want to inject your own IMG tags into the file with  
> these routines:
>
>
>     $writerhtml->start_report(\&my_start_report);
>     $writerhtml->title(\&my_title);
>     $writerhtml->hit_link_align(\&my_hit_link_align);
>     $writerhtml->hit_link_desc(\&my_hit_link_desc);
>
>
> fgblast shows a way to do this in part. It relies on Gbrowse to  
> generate the image but you can replace the gbrowse_img reference to  
> your own image generating software.
> http://people.genome.duke.edu/~jes12/software/scripts/fgblast
>
>
>
>
> -jason
> On Apr 22, 2008, at 11:02 PM, Anand Venkatraman wrote:
>
> Hi everybody,
>
>
> I would like to use Bio::Graphics in conjunction with
> Bio::SearchIO::Writer::HTMLResultWriter to obtain a HTML formatted
> blast report output along with an image of the blast hits as shown
> on Slide 60 in this pdf:
> http://jason.open-bio.org/Bioperl_Tutorials/NESCENT_2007/CSHL_Bioperl_I.pdf
>
>
> I am able to get the HTML output using   
> "Bio::SearchIO::Writer::HTMLResultWriter" and I am able to get the  
> image using the examples outlined in the Bio::Graphics HOWTO:  
> http://www.bioperl.org/wiki/HOWTO:Graphics
>
>
> My question: How do I integrate Bio::Graphics with  
> Bio::SearchIO::Writer::HTMLResultWriter to render the blast hits  
> image at the correct position in my BioPerl reformatted html file.
>
>
> I also found that someone else has asked something similar to  
> whatever I am asking & is listed under the "Orphans, Leftovers"  
> category in the ListSummary:April 26-May 9,2006 document:
> http://www.bioperl.org/wiki/ListSummary:April_26-May_9%2C2006#Orphans.2C_Leftovers
>
>
> Here is my code so far:
> ----------------------------------------------------------------
> #!/usr/bin/perl -w
> # usage: $0 <blast_report>
> use strict;
> use Bio::SearchIO;
> use Bio::SearchIO::Writer::HTMLResultWriter;
>
>
> my $infile = shift or die $!;
>
>
> my $searchio = new Bio::SearchIO( -format => 'blast',-file   => $infile);
> my $writerhtml = new Bio::SearchIO::Writer::HTMLResultWriter();
> my $outhtml = new Bio::SearchIO(-writer => $writerhtml,
>                                                   -file   => ">${infile}.html");
>
>
> $outhtml->write_result($searchio->next_result);
> ----------------------------------------------------------------
>
>
> Thanks in advance,
>
>
> Anand
>


From 1zoujing at 163.com  Thu Apr 17 02:53:16 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 16 Apr 2008 19:53:16 -0700 (PDT)
Subject: [Bioperl-l] Error with "parse_entrez_gene_example.pl
 Sus_scrofa.ags"
In-Reply-To: <Pine.WNT.4.64.0804111600310.2384@A161887.one.ads.bms.com>
References: <16602770.post@talk.nabble.com> <16603225.post@talk.nabble.com>
	<Pine.WNT.4.64.0804111600310.2384@A161887.one.ads.bms.com>
Message-ID: <16737795.post@talk.nabble.com>


    Thank you very much!
I split the file on \t directly.
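
Splitting on tabs can be sketched roughly as follows. The sketch assumes that comment and header lines start with '#' and that the first three columns are tax_id, GeneID and Symbol, as described in NCBI's README for gene_info. each_gene_info is a hypothetical name, and the demo rows are made up.

```perl
use strict;
use warnings;

# Read gene_info-style rows from a filehandle and call $callback once per
# record with a hash of the first three columns plus the remaining fields.
# Returns the number of records seen.
sub each_gene_info {
    my ($fh, $callback) = @_;
    my $count = 0;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '' || $line =~ /^#/;    # skip blank and header/comment lines
        my ($tax_id, $gene_id, $symbol, @rest) = split /\t/, $line, -1;
        $callback->({ tax_id => $tax_id, gene_id => $gene_id,
                      symbol => $symbol, rest => \@rest });
        $count++;
    }
    return $count;
}

# Made-up demo rows; a real run would read the downloaded file instead.
my $sample = join("\n",
    '#Format: tax_id GeneID Symbol (demo header)',
    join("\t", 9823, 1001, 'GENE_A', '-'),
    join("\t", 9823, 1002, 'GENE_B', '-'),
) . "\n";

open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my $seen = each_gene_info($fh, sub {
    my ($rec) = @_;
    print "$rec->{gene_id}\t$rec->{symbol}\n";
});
print "records: $seen\n";
```

For the gzipped download, the same routine can read from a pipe, for example a handle opened with open(my $gz, '-|', 'gzip', '-dc', 'gene_info.gz'), so the file never has to be unpacked on disk.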

   Zou Jing


Stefan Kirov-2 wrote:
> 
> It is not. If you use this file, why would you need a parser for it 
> anyway? Just split on \t or read with OpenOffice or equiv.
> Stefan
> 
> On Thu, 10 Apr 2008, zoujing wrote:
> 
>>
>> Searched the web and found the answer now; quoting it as follows:
>>   The error was thrown by my Bio::ASN1::EntrezGene module because it
>> expects a text file, while you fed it with a binary file.  To use
>> gzipped ASN binary file from NCBI, download the NCBI gene2xml
>> (ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml),
>> then use this syntax to run my parser on the binary files:
>>
>> my $parser = Bio::ASN1::EntrezGene->new('file' => "gene2xml -i
>> Homo_sapiens.ags.gz -c -x -b | "); # Homo_sapiens.ags.gz is the gzipped
>> binary file directly downloaded from NCBI
>>
>> Same syntax should be used when you're using SeqIO (thus
>> SeqIO::entrezgene).
>> Mingyi
>>
>>   But there is still one thing: I want to parse "gene_info.gz" from NCBI
>> Gene, and it doesn't work. Does that mean "gene_info.gz" (tab-delimited,
>> one line per GeneID, column header line first in the file) is not the
>> right format for Bio::ASN1::EntrezGene?
>>
>>
>>
>> zoujing wrote:
>>>
>>>    I am a green hand in Bioperl. When I ran
>>> "parse_entrez_gene_example.pl Sus_scrofa.ags", it produced this error:
>>>      Data Error: none conforming data found on line 1 in Sus_scrofa.ags.
>>>
>>>    But Sus_scrofa.ags was downloaded from NCBI in ASN.1 format, so it
>>> should be the same as Homo_sapiens in the example, and there should be
>>> no error, since the code is the example from Mingyi.
>>>    I wonder why this happens. Should I change something about the
>>> file?
>>>
>>>
>>
>> -- 
>> View this message in context:
>> http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16603225.html
>> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 

-- 
View this message in context: http://www.nabble.com/Error-with-%22parse_entrez_gene_example.pl-Sus_scrofa.ags%22-tp16602770p16737795.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From 1zoujing at 163.com  Thu Apr 17 02:55:47 2008
From: 1zoujing at 163.com (zoujing)
Date: Wed, 16 Apr 2008 19:55:47 -0700 (PDT)
Subject: [Bioperl-l] Bio::ASN1::EntrezGene parse so slowly?
In-Reply-To: <264855a00804112050gf785c2ei66d9c7463597eccd@mail.gmail.com>
References: <16602210.post@talk.nabble.com>
	<264855a00804112050gf785c2ei66d9c7463597eccd@mail.gmail.com>
Message-ID: <16737804.post@talk.nabble.com>


Thank you very much!
  The problem is solved now.

   Jing

Sean Davis-3 wrote:
> 
> gene_info is a tab-delimited text file, if I recall correctly.  Have
> you looked at it?  If it is, you should be able to parse it in a few
> seconds with just a couple lines of code.
> 
> Sean
> 
> 
> On Thu, Apr 10, 2008 at 1:08 AM, zoujing <1zoujing at 163.com> wrote:
>>
>>   I want to parse the file "gene_info" from NCBI. The format of Gene at
>>  NCBI is ASN.1, right? So I used Bio::ASN1::EntrezGene, but it either did
>>  not work properly or was too slow. The file is about 500 MB.
>>   The code is as follows:
>>   use Bio::ASN1::EntrezGene;
>>   my $parser = Bio::ASN1::EntrezGene->new('file' => $ARGV[0]);
>>   my $i = 0;
>>   while(my $result = $parser->next_seq)
>>   {
>>     # something to do here; "last" is used for testing
>>     last;
>>   }
>>
>>   When it reaches the "while" loop, it keeps processing and never
>>  exits, even though I used "last" inside the loop.
>>    So I wonder whether it is too slow, whether the module is not suited
>>  to this job, or whether I did something wrong?
>>
>>   Thank you!
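
Sean's suggestion of parsing gene_info with a couple of lines of code could
look roughly like the sketch below. It is only an illustration: the helper
name parse_gene_info, the sample rows, and the assumption that the first
three columns are tax_id, GeneID and Symbol (with header lines starting
with '#') are not taken from the thread and should be checked against the
real file.

```perl
use strict;
use warnings;

# Hypothetical helper: read tab-delimited gene_info text from a filehandle.
# Assumes columns start with tax_id, GeneID, Symbol and that header lines
# begin with '#'.
sub parse_gene_info {
    my ($fh) = @_;
    my @genes;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^#/ or $line !~ /\S/;    # skip header and blank lines
        my ($tax_id, $gene_id, $symbol) = split /\t/, $line;
        push @genes, { tax_id => $tax_id, gene_id => $gene_id, symbol => $symbol };
    }
    return \@genes;
}

# Sample rows held in memory so the sketch is self-contained;
# the second row is made up.
my $sample = join("\n",
    "#Format: tax_id GeneID Symbol LocusTag (sample header)",
    join("\t", 9606, 1,      'A1BG',     '-'),
    join("\t", 9823, 999999, 'EXAMPLE1', '-'),
) . "\n";

open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my $genes = parse_gene_info($fh);
close $fh;

printf "%s\t%s\t%s\n", @{$_}{qw(tax_id gene_id symbol)} for @$genes;
```

For the real gzipped file, the in-memory handle would be replaced by a
piped open such as open(my $fh, '-|', 'gunzip', '-c', 'gene_info.gz'),
assuming gunzip is available on the system.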

-- 
View this message in context: http://www.nabble.com/Bio%3A%3AASN1%3A%3AEntrezGene-parse-so-slowly--tp16602210p16737804.html
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.


From sbassi at clubdelarazon.org  Sat Apr 26 17:49:20 2008
From: sbassi at clubdelarazon.org (Sebastian Bassi)
Date: Sat, 26 Apr 2008 14:49:20 -0300
Subject: [Bioperl-l] bioperl installation problem
Message-ID: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>

I tried to install bioperl because I need to install cviewer.
Here (http://www.pastecode.com.ar/f37c1cd60) are both stdout and stderr outputs.

Here is one of the errors I get:

set_attribute: not a compat02 graph at
/usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN0> line 10.
sleeping for 3 seconds
set_attribute: not a compat02 graph at
/usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN1> line 14.

But I have GD::Graph, so I don't know what is going on:

sbassi at ubuntuMAP:~$ sudo perl -MCPAN -e 'install GD::Graph'
CPAN: Storable loaded ok
Going to read /home/sbassi/.cpan/Metadata
  Database was generated on Fri, 25 Apr 2008 09:29:45 GMT
GD::Graph is up to date.

Any help regarding this: http://www.pastecode.com.ar/f37c1cd60
would be appreciated.

Best,
SB.

-- 
Sebastián Bassi (???????). Diploma in Science and Technology.
Course "Molecular biology for programmers": http://tinyurl.com/2vv8w6
Show your code: http://www.pastecode.com.ar
GPG Fingerprint: 9470 0980 620D ABFC BE63 A4A4 A3DE C97D 8422 D43D


From jason at bioperl.org  Sat Apr 26 19:23:37 2008
From: jason at bioperl.org (Jason Stajich)
Date: Sat, 26 Apr 2008 12:23:37 -0700
Subject: [Bioperl-l] bioperl installation problem
In-Reply-To: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
References: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
Message-ID: <B07E3ABC-FA71-4AEA-8802-29F1C3023BAE@bioperl.org>

the error refers to the 'Graph' module not 'GD::Graph';

-jason
On Apr 26, 2008, at 10:49 AM, Sebastian Bassi wrote:

> I tried to install bioperl because I need to install cviewer.
> Here (http://www.pastecode.com.ar/f37c1cd60) are both stdout and
> stderr outputs.
>
> Here is one of the errors I get:
>
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN0> line 10.
> sleeping for 3 seconds
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN1> line 14.
>
> But I have GD::Graph, so I don't know what is going on:
>
> sbassi at ubuntuMAP:~$ sudo perl -MCPAN -e 'install GD::Graph'
> CPAN: Storable loaded ok
> Going to read /home/sbassi/.cpan/Metadata
>   Database was generated on Fri, 25 Apr 2008 09:29:45 GMT
> GD::Graph is up to date.
>
> Any help regarding this: http://www.pastecode.com.ar/f37c1cd60
> would be appreciated.
>
> Best,
> SB.
>


From sbassi at clubdelarazon.org  Sat Apr 26 21:08:13 2008
From: sbassi at clubdelarazon.org (Sebastian Bassi)
Date: Sat, 26 Apr 2008 18:08:13 -0300
Subject: [Bioperl-l] bioperl installation problem
In-Reply-To: <B07E3ABC-FA71-4AEA-8802-29F1C3023BAE@bioperl.org>
References: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
	<B07E3ABC-FA71-4AEA-8802-29F1C3023BAE@bioperl.org>
Message-ID: <9e2f512b0804261408l45ff9f91j94f44065d21cd65f@mail.gmail.com>

On Sat, Apr 26, 2008 at 4:23 PM, Jason Stajich <jason at bioperl.org> wrote:
> the error refers to the 'Graph' module not 'GD::Graph';

You are right, but I have it also installed:

sbassi at ubuntuMAP:~$ sudo perl -MCPAN -e 'install Graph'
Password:
CPAN: Storable loaded ok
Going to read /home/sbassi/.cpan/Metadata
  Database was generated on Fri, 25 Apr 2008 09:29:45 GMT
Graph is up to date.


-- 
Sebastián Bassi (???????). Diploma in Science and Technology.
Course "Molecular biology for programmers": http://tinyurl.com/2vv8w6
Show your code: http://www.pastecode.com.ar
GPG Fingerprint: 9470 0980 620D ABFC BE63 A4A4 A3DE C97D 8422 D43D


From bix at sendu.me.uk  Sat Apr 26 23:30:56 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Sun, 27 Apr 2008 00:30:56 +0100
Subject: [Bioperl-l] bioperl installation problem
In-Reply-To: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
References: <9e2f512b0804261049s4c1d829cy79b702f6f5680474@mail.gmail.com>
Message-ID: <4813BB30.6060703@sendu.me.uk>

Sebastian Bassi wrote:
> I tried to install bioperl because I need to install cviewer.
> Here (http://www.pastecode.com.ar/f37c1cd60) are both stdout and stderr outputs.
> 
> Here is one of the errors I get:
> 
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN0> line 10.
> sleeping for 3 seconds
> set_attribute: not a compat02 graph at
> /usr/local/share/perl/5.8.7/Graph.pm line 2394, <GEN1> line 14.

You're trying to install a very old version of Bioperl which apparently 
uses behaviour of the Graph module no longer supported:
http://search.cpan.org/~jhi/Graph-0.84/lib/Graph.pod#Backward_compatibility_with_Graph_0.2

Your options are to force install your desired version of Bioperl (if 
you don't need to use the modules that are causing the errors you get), 
downgrade your version of Graph to a 0.2x release, or install the latest 
version of Bioperl (1.5.2 or from svn).
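
Since "Graph is up to date" only says that the newest release is installed,
it can help to print which version perl actually loads and from where. The
following is a minimal sketch; the helper name module_info is made up for
illustration, and it only reports what is found on the local @INC path.

```perl
use strict;
use warnings;

# Hypothetical helper: return (version, path) of a module if it can be
# loaded, or an empty list if it cannot.
sub module_info {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Graph -> Graph.pm, GD::Graph -> GD/Graph.pm
    return unless eval { require $file; 1 };
    return ($module->VERSION, $INC{$file});
}

for my $module (qw(Graph GD::Graph)) {
    my ($version, $path) = module_info($module);
    if (defined $path) {
        printf "%-10s %s (%s)\n", $module,
            defined $version ? $version : 'unknown version', $path;
    }
    else {
        printf "%-10s not installed\n", $module;
    }
}
```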


From dr.hogart at gmail.com  Sun Apr 27 14:05:20 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Sun, 27 Apr 2008 18:05:20 +0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
Message-ID: <op.t99vyoejavnppr@hogart.hackers>

Hi all,

Is it possible to add a GD::graphic object (chart) to a Bio::Graphics panel,  
to obtain a single image file with both the chart and the bioseq object?


From Russell.Smithies at agresearch.co.nz  Sun Apr 27 21:27:23 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Mon, 28 Apr 2008 09:27:23 +1200
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <op.t99vyoejavnppr@hogart.hackers>
References: <op.t99vyoejavnppr@hogart.hackers>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>

You can get the GD object back from the Bio::Graphics::Panel  then draw
on it using GD methods

Eg:

use GD;    # exports gdSmallFont
use Bio::Graphics;
use Bio::SeqFeature::Generic;

# create a BioPerl panel
my $panel = Bio::Graphics::Panel->new(
					-length   => 600,
					-width    => 800,
					-bgcolor  => 'white'
					);
# add your features
my $feature = Bio::SeqFeature::Generic->new( -start => 1, -end => 200 );
$panel->add_track($feature, -glyph   =>   'segments',
					-label   =>   0,
					-height  =>   30,
					-bgcolor  =>  'red',
					-fgcolor  => 'red'
					 );

# grab the GD thingy
my $gd = $panel->gd;

# create a color - not sure if there's a better way?
my $black = $gd->colorAllocate(0,0,0);

# draw on your GD thingy
$gd->line(10,10,$panel->width - 10,10,$black);
$gd->string(gdSmallFont,20,10,'test',$black);

# print it as normal
print $panel->png;


> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of sergei ryazansky
> Sent: Monday, 28 April 2008 2:05 a.m.
> To: bioperl-l at bioperl.org
> Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
> 
> Hi all,
> 
> is it possible to add a GD::graphic object (chart) to Bio::Graphics panel
> to obtain a file with image of both the chart and bioseq object?
> 
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From dr.hogart at gmail.com  Mon Apr 28 00:25:18 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Mon, 28 Apr 2008 04:25:18 +0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
Message-ID: <op.uaaosgoeavnppr@hogart.hackers>

Thanks for the answer!
Your script works fine, but as far as I understand, the 'gd' method  
returns a GD::Image object. What I need is to merge the bioseq object  
with a GD::Graph object (GD::Graph::area). Is that possible? Or maybe I  
misunderstood something in your example?


On Mon, 28 Apr 2008 01:27:23 +0400, Smithies, Russell  
<Russell.Smithies at agresearch.co.nz> wrote:

> You can get the GD object back from the Bio::Graphics::Panel  then draw
> on it using GD methods
>
> Eg:
>
> use GD;    # exports gdSmallFont
> use Bio::Graphics;
> use Bio::SeqFeature::Generic;
>
> # create a BioPerl panel
> my $panel = Bio::Graphics::Panel->new(
> 					-length   => 600,
> 					-width    => 800,
> 					-bgcolor  => 'white'
> 					);
> # add your features
> my $feature = Bio::SeqFeature::Generic->new( -start => 1, -end => 200 );
> $panel->add_track($feature, -glyph   =>   'segments',
> 					-label   =>   0,
> 					-height  =>   30,
> 					-bgcolor  =>  'red',
> 					-fgcolor  => 'red'
> 					 );
>
> # grab the GD thingy
> my $gd = $panel->gd;
>
> # create a color - not sure if there's a better way?
> my $black = $gd->colorAllocate(0,0,0);
>
> # draw on your GD thingy
> $gd->line(10,10,$panel->width - 10,10,$black);
> $gd->string(gdSmallFont,20,10,'test',$black);
>
> # print it as normal
> print $panel->png;
>
>
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-
>> bio.org] On Behalf Of sergei ryazansky
>> Sent: Monday, 28 April 2008 2:05 a.m.
>> To: bioperl-l at bioperl.org
>> Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
>>
>> Hi all,
>>
>> is it possible to add a GD::graphic object (chart) to Bio::Graphics
> panel
>> to obtain a file with image of both the chart and bioseq object?
>>


From Bank.Beszteri at awi.de  Mon Apr 28 12:18:20 2008
From: Bank.Beszteri at awi.de (=?UTF-8?B?QsOhbmsgQmVzenRlcmk=?=)
Date: Mon, 28 Apr 2008 14:18:20 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <47FB204F.90405@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de>
Message-ID: <4815C08C.1060305@awi.de>

Dear BioSQL / bioperl-db-ists,

I would like  to share my experiences with trying to load uniprot_trembl 
into a BioSQL db, and also to ask a couple of questions; perhaps some of 
you know the problems I encountered. I used bioperl-live and 
bioperl-db-live as of 2008-04-03 and uniprot_trembl.dat as of 
2008-04-04. The command was like

load_seqdatabase.pl --safe --logchunk 1000 --host dbserv --dbname abc 
--dbuser efg --dbpass xyz --driver mysql --namespace uniprot_trembl 
--format embl uniprot_trembl.dat

although I split the dat file into 10 chunks and loaded them in parallel 
to make it faster. This did not go quite as smoothly as Swissprot did. 
In the end, it seems to have loaded 5022284 of the 5443284 entries that 
appear to be in the input file (counted with grep -c "ID   ").
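
Splitting a flat file into chunks has to happen on record boundaries. The
sketch below shows one way to do that, assuming every record ends with a
line containing only "//". The helper name split_flatfile and the tiny
in-memory sample are made up for illustration; for a real run the handles
would be opened on the .dat file and on ten chunk files instead.

```perl
use strict;
use warnings;

# Hypothetical helper: deal "//"-terminated records from $in round-robin
# onto the output handles in @outs; returns the number of records seen.
sub split_flatfile {
    my ($in, @outs) = @_;
    local $/ = "//\n";    # read one whole record per iteration
    my $count = 0;
    while (my $record = <$in>) {
        print { $outs[ $count % @outs ] } $record;
        $count++;
    }
    return $count;
}

# Made-up three-record input, kept in memory so the sketch is self-contained.
my $data = join '', map { "ID   TEST$_\n//\n" } 1 .. 3;
open my $in, '<', \$data or die "cannot read sample: $!";

my ($chunk0, $chunk1) = ('', '');
open my $out0, '>', \$chunk0 or die "cannot open chunk 0: $!";
open my $out1, '>', \$chunk1 or die "cannot open chunk 1: $!";

my $total = split_flatfile($in, $out0, $out1);
close $_ for $in, $out0, $out1;

# Count records per chunk the same way as grep -c "ID   " would.
my $n0 = () = $chunk0 =~ /^ID   /mg;
my $n1 = () = $chunk1 =~ /^ID   /mg;
print "$total records split into chunks of $n0 and $n1\n";
```

Comparing the per-chunk counts with the number of entries that end up in
the database makes it easier to see which chunk lost records.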

Besides the harmless taxonomy warnings which also appear with Swissprot 
(and were discussed here a couple of weeks ago and also 
earlier), there were a couple of more serious errors. Perhaps some of 
you know them already:

First of all, the below error seems to lead to a crash, in spite of --safe:

 >>>
------------- EXCEPTION -------------
MSG: A1XDT7 seems to have an invalid species classification.
STACK Bio::SeqIO::embl::_read_EMBL_Species 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:1087
STACK Bio::SeqIO::embl::next_seq 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:320
STACK toplevel 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:634
-------------------------------------

Command exited with non-zero status 255
<<<

This concerns NCBI Tax_ID:435 (Acetobacter aceti; it also has some 30 
synonyms in my DB), which, to me, looks like a completely normal 
taxon: I could follow its taxonomy up to the root in the NCBI taxonomy in 
the BioSQL DB I used. I don't know whether someone else has seen or can 
reproduce the problem, or whether I should suspect a problem with my 
taxonomy db. Besides, is it the expected behaviour of 
load_seqdatabase.pl to die upon this error?

###################

The other problems did not lead to a crash, only to a failure to load 
the sequence, which is what I'd expect with --safe. The first type 
of error looks like

 >>>
Could not store Q49I36:
------------- EXCEPTION -------------
MSG: Unique key query in Bio::DB::BioSQL::SpeciesAdaptor returned 2 rows 
instead of 1. Query was [name_class="scientific 
name",binomial="Onchocerca volvulus"]
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:958
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:854
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:182
STACK Bio::DB::Persistent::PersistentObject::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:244
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
STACK Bio::DB::Persistent::PersistentObject::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:271
STACK (eval) 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:630
STACK toplevel 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:612
-------------------------------------
<<<

In this particular case, "Onchocerca volvulus" does indeed have two 
taxon_ids in my DB (6282 and 563188, of which only the first one is 
returned by a web search at NCBI taxonomy); but the same thing happened 
with a number of other taxa (followed by how many times the above error 
was caused by the particular taxa):

Taxon                          Errors
Wolbachia pipientis                64
Hemerocallis sp.                    1
Hypsiglena torquata                 3
Salmonella enterica              1211
Burkholderia sp.                   31
Streptococcus sp.                   4
Rhizobium sp.                     600
Nostoc sp.                         19
Drosophila sp.                     18
Onchocerca volvulus                62
Atlapetes schistaceus               4
Symbiodinium sp.                    3
Escherichia coli                 7421
Hieraaetus fasciatus                4
Borrelia burgdorferi group          1
Pseudomonas sp.                    29
Rotavirus A                      1076
Gorilla gorilla                   746
Rana plancyi                       14
unclassified sequences              1

(This should be 11312 cases altogether, but the list might be incomplete 
because I accidentally removed one of my logs, which contained STDOUT 
and STDERR for ~10 % of the entries)

Again, is this a known problem for some of you, or could there be a 
problem with my copy of the NCBI taxonomy? I don't remember having updated 
it after the initial upload, so I'm quite surprised by such duplicate 
entries....

###################

Type 2 error w/o crash:

 >>>
Could not store A5HU09:
------------- EXCEPTION -------------
MSG: create: object (Bio::Species) failed to insert or to be found by 
unique key
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:206
STACK Bio::DB::Persistent::PersistentObject::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:244
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:169
STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:251
STACK Bio::DB::Persistent::PersistentObject::store 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/Bio/DB/Persistent/PersistentObject.pm:271
STACK (eval) 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:630
STACK toplevel 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:612
<<<

This particular record has the NCBI_TaxID 44271, which looks completely 
normal in the NCBI taxonomy loaded in my BioSQL DB, but the same problem 
appeared in 53 further cases (I have not yet been able to look into them 
in detail to see whether they were all the same species). On the other 
hand, 7 records which were successfully loaded have this taxonomy ID in 
the DB (44271).

###################

Nr 3 no crash:

 >>>
Could not store Q6T859: Unmatched ( in regex; marked by <-- HERE in 
m/Camelina microcarpa (Littlepod false flax) ( <-- HERE microcarpa 
subsp.\s+/ at 
/home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/Species.pm line 
466, <GEN0> line 357048.
<<<

This happens in the sub binomial in Species.pm using the option "FULL", 
which requests to also return subspecies. I have not looked much deeper 
into this yet, but is it possible that there is a parsing problem with 
multi-line species strings? In the above case the OS field in 
uniprot_trembl.dat looks like

OS   Camelina microcarpa (Littlepod false flax) (Camelina microcarpa subsp.
OS   sylvestris).
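
The "Unmatched ( in regex" message is what perl reports when a string with
an unbalanced parenthesis is interpolated into a pattern without escaping.
The sketch below reproduces that class of error with the OS text above and
shows the usual remedy, \Q...\E (quotemeta). It is not the actual
Species.pm code; the variable names and the pattern are made up for
illustration.

```perl
use strict;
use warnings;

# OS text from the record above, joined onto one line.
my $os = 'Camelina microcarpa (Littlepod false flax) '
       . '(Camelina microcarpa subsp. sylvestris).';

# Hypothetical name fragment that ends inside the second parenthesis.
my $prefix = 'Camelina microcarpa (Littlepod false flax) (Camelina';

# Unescaped: the last "(" in $prefix opens a group that is never closed,
# so the pattern fails to compile at run time.
my $compiled = eval { $os =~ /$prefix microcarpa subsp\.\s+(\w+)/; 1 };
my $error = $compiled ? '' : $@;

# Escaped: \Q...\E makes every character of $prefix match literally.
my ($subspecies) = $os =~ /\Q$prefix\E microcarpa subsp\.\s+(\w+)/;

print "unescaped pattern failed: $error" if $error;
print "escaped pattern captured: $subspecies\n" if defined $subspecies;
```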

###################

I'm still looking for where the remaining records disappeared: of the 
421000 records not showing up in the DB, I could account for these:

crasher (Tax_ID=435):   45 entries
problem 1 ("MSG: Unique key query in Bio::DB::BioSQL::SpeciesAdaptor 
returned 2 rows instead of 1."): 11312 entries
problem 2 ("MSG: create: object (Bio::Species) failed to insert or to be 
found by unique key"): 54 entries
problem 3 ("Unmatched ( in regex"): 28241 entries

381348 still remain... These could in principle come from the 
first 10 %, for which I don't have the output, but they don't seem to: 
after restarting that chunk, I get ~ 30 "Could not store" errors.

So the last question: are there any error messages I can expect which 
don't contain "Could not store" and which I thus missed here?


Bank Beszteri


Bioinformatics
Alfred Wegener Institute for Polar and Marine Research
Am Handelshafen 12
27570 Bremerhaven


From cjfields at uiuc.edu  Mon Apr 28 13:20:39 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 28 Apr 2008 08:20:39 -0500
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <4815C08C.1060305@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
Message-ID: <5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>

On Apr 28, 2008, at 7:18 AM, Bánk Beszteri wrote:

> Dear BioSQL / bioperl-db-ists,
>
> I would like  to share my experiences with trying to load  
> uniprot_trembl into a BioSQL db, and also to ask a couple of  
> questions; perhaps some of you know the problems I encountered. I  
> used bioperl-live and bioperl-db-live as of 2008-04-03 and  
> uniprot_trembl.dat as of 2008-04-04. The command was like
>
> load_seqdatabase.pl --safe --logchunk 1000 --host dbserv --dbname  
> abc --dbuser efg --dbpass xyz --driver mysql --namespace  
> uniprot_trembl --format embl uniprot_trembl.dat
>
> ....
>
> First of all, the below error seems to lead to a crash, in spite of  
> --safe:
>
> >>>
> ------------- EXCEPTION -------------
> MSG: A1XDT7 seems to have an invalid species classification.
> STACK Bio::SeqIO::embl::_read_EMBL_Species
> /home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:1087
> STACK Bio::SeqIO::embl::next_seq
> /home/biocl/bbeszter/lib/bioperl-live/bioperl-live/Bio/SeqIO/embl.pm:320
> STACK toplevel
> /home/biocl/bbeszter/lib/bioperl-live/bioperl-db/scripts/biosql/load_seqdatabase.pl:634
> -------------------------------------
>
> Command exited with non-zero status 255
> <<<
>
> What this is about is NCBI Tax_ID:435 (Acetobacter aceti; it has  
> some 30 synonyms in my DB, too), which, to me, looks like a  
> completely normal taxon: I could follow its taxonomy up to the root  
> in my NCBI taxonomy in the BioSQL DB I used. I don't know if someone  
> else has seen / can reproduce the problem, or should I think about  
> some problem with my taxonomy db? Besides, is it the expected  
> behaviour from load_seqdatabase.pl to die upon this error?

...

You should use 'swiss' format instead of 'embl' when loading Uniprot/ 
SwissProt sequences.  Though on the surface they're similar the  
feature table (among other things) is completely different.  I'm not  
sure if that's causing all of the issues here but it certainly could  
contribute to them.

In the meantime, it's much easier for us to track these problems if  
you file a bug (BioPerl, file for bioperl-db):

http://bugzilla.open-bio.org/

chris


From cjfields at uiuc.edu  Sun Apr 27 21:54:03 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Sun, 27 Apr 2008 16:54:03 -0500
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
Message-ID: <FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>

I think this is how some of the synteny mapping is done using  
SynBrowse (the trapezoids connecting syntenous genes on different  
tracks).

http://www.gmod.org/wiki/index.php/SynView

chris

On Apr 27, 2008, at 4:27 PM, Smithies, Russell wrote:

> You can get the GD object back from the Bio::Graphics::Panel  then  
> draw
> on it using GD methods
>
> Eg:
>
> use GD;    # exports gdSmallFont
> use Bio::Graphics;
> use Bio::SeqFeature::Generic;
>
> # create a BioPerl panel
> my $panel = Bio::Graphics::Panel->new(
> 					-length   => 600,
> 					-width    => 800,
> 					-bgcolor  => 'white'
> 					);
> # add your features
> my $feature = Bio::SeqFeature::Generic->new( -start => 1, -end => 200 );
> $panel->add_track($feature, -glyph   =>   'segments',
> 					-label   =>   0,
> 					-height  =>   30,
> 					-bgcolor  =>  'red',
> 					-fgcolor  => 'red'
> 					 );
>
> # grab the GD thingy
> my $gd = $panel->gd;
>
> # create a color - not sure if there's a better way?
> my $black = $gd->colorAllocate(0,0,0);
>
> # draw on your GD thingy
> $gd->line(10,10,$panel->width - 10,10,$black);
> $gd->string(gdSmallFont,20,10,'test',$black);
>
> # print it as normal
> print $panel->png;
>
>
>
>
>> -----Original Message-----
>> From: bioperl-l-bounces at lists.open-bio.org
> [mailto:bioperl-l-bounces at lists.open-
>> bio.org] On Behalf Of sergei ryazansky
>> Sent: Monday, 28 April 2008 2:05 a.m.
>> To: bioperl-l at bioperl.org
>> Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
>>
>> Hi all,
>>
>> is it possible to add a GD::graphic object (chart) to Bio::Graphics
> panel
>> to obtain a file with image of both the chart and bioseq object?
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From Bank.Beszteri at awi.de  Mon Apr 28 13:51:53 2008
From: Bank.Beszteri at awi.de (=?ISO-8859-1?Q?B=E1nk_Beszteri?=)
Date: Mon, 28 Apr 2008 15:51:53 +0200
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
	<5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
Message-ID: <4815D679.3070307@awi.de>

Chris Fields schrieb:
>
> ...
>
> You should use 'swiss' format instead of 'embl' when loading 
> Uniprot/SwissProt sequences.  Though on the surface they're similar 
> the feature table (among other things) is completely different.  I'm 
> not sure if that's causing all of the issues here but it certainly 
> could contribute to them.
>
> In the meantime, it's much easier for us to track these problems if 
> you file a bug (BioPerl, file for bioperl-db):
>
> http://bugzilla.open-bio.org/
>
Hi Chris,

I will do so; in the meantime: I'm not loading Swissprot, but TrEMBL. 
Is swiss also the appropriate format here? By reading 
http://expasy.org/sprot/userman.html#diffEMBL, I concluded that embl 
should be the one I'd need for TrEMBL.

Bank


From cjfields at uiuc.edu  Mon Apr 28 16:24:39 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Mon, 28 Apr 2008 11:24:39 -0500
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <4815D679.3070307@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
	<5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
	<4815D679.3070307@awi.de>
Message-ID: <B7918B56-536D-497F-A59D-D48A61085339@uiuc.edu>


On Apr 28, 2008, at 8:51 AM, Bánk Beszteri wrote:

> Chris Fields schrieb:
>>
>> ...
>>
>> You should use 'swiss' format instead of 'embl' when loading  
>> Uniprot/SwissProt sequences.  Though on the surface they're similar  
>> the feature table (among other things) is completely different.   
>> I'm not sure if that's causing all of the issues here but it  
>> certainly could contribute to them.
>>
>> In the meantime, it's much easier for us to track these problems if  
>> you file a bug (BioPerl, file for bioperl-db):
>>
>> http://bugzilla.open-bio.org/
>>
> Hi Chris,
>
> I will do so; in the meanwhile: I'm not loading Swissprot, but  
> TrEMBL. Is swiss also the appropriate format here? By reading
> http://expasy.org/sprot/userman.html#diffEMBL, I concluded that embl
> should be the one I'd need for TrEMBL.
>
> Bank

The section you link to describes several important differences  
between EMBL and SwissProt/UniProt format (i.e. how each indicated  
line type differs between SwissProt and EMBL formats, including ID,  
AC, OS/OC, FT, etc).  I'm unsure how you derived that 'embl' would  
work from that, e.g. they are close, but there are enough significant  
differences that using 'embl' for SwissProt (or vice versa) will not  
work as intended, if at all.

chris


From hlapp at gmx.net  Mon Apr 28 19:46:07 2008
From: hlapp at gmx.net (Hilmar Lapp)
Date: Mon, 28 Apr 2008 15:46:07 -0400
Subject: [Bioperl-l] Indexing large databases / BioSQL
In-Reply-To: <4815D679.3070307@awi.de>
References: <19992.156.83.1.157.1207579017.squirrel@webmail.xs4all.nl>
	<47FB204F.90405@awi.de> <4815C08C.1060305@awi.de>
	<5C383B1F-92AD-4194-B9B4-007AE51A092F@uiuc.edu>
	<4815D679.3070307@awi.de>
Message-ID: <3BD6A261-D023-4A5F-9CBC-C3216B0145F0@gmx.net>


On Apr 28, 2008, at 9:51 AM, Bánk Beszteri wrote:
>  I'm not loading Swissprot, but TrEMBL. Is swiss also the  
> appropriate format here?


Yes, though I guess it can be confusing.

Maybe we should create a symlink uniprot.pm to swiss.pm, or in fact  
fork them if UniProt starts accumulating enough differences from the  
traditional Swissprot format.

BTW as you had noticed, the --safe switch only protects the script  
from crashing due to a db loading error. A parsing error will still  
cause a crash.

I guess you can argue that that's not nice, and having a chance to  
skip over the record that offends the (BioPerl) parser would be  
useful. The problem is that if the parser errors out, it's not  
guaranteed where we are in the file and whether the parser module is  
in a state that it can recover itself from. For the database it's a  
bit easier as one just needs to rollback() the transaction (each  
sequence is its own transaction).
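
As a rough illustration of the database side of this, here is a sketch
that uses a toy in-memory store instead of a real database. The names
store_record and load_all, the begin_work/commit/rollback helpers, and
the record layout are all hypothetical and are not taken from
load_seqdatabase.pl or bioperl-db. The point is only that a failed store
can be trapped per record and undone while the loop carries on.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy stand-in for a database: committed rows plus rows pending in the
# current "transaction". Purely illustrative, not the bioperl-db API.
my %db = (committed => [], pending => []);

sub begin_work { $db{pending} = [] }
sub commit     { push @{ $db{committed} }, @{ $db{pending} }; $db{pending} = [] }
sub rollback   { $db{pending} = [] }

# Hypothetical store step: dies if the record has no lineage.
sub store_record {
    my ($rec) = @_;
    push @{ $db{pending} }, $rec->{id};
    die "Could not store $rec->{id}: bad lineage\n" unless $rec->{lineage};
    return 1;
}

# One transaction per record; a failed store is rolled back and skipped.
sub load_all {
    my (@records) = @_;
    my @skipped;
    for my $rec (@records) {
        begin_work();
        if (eval { store_record($rec); 1 }) {
            commit();
        }
        else {
            warn $@;
            rollback();
            push @skipped, $rec->{id};
        }
    }
    return @skipped;
}

my @skipped = load_all(
    { id => 'P12345', lineage => 'Bacteria' },
    { id => 'Q6DAH5', lineage => '' },
    { id => 'P67890', lineage => 'Eukaryota' },
);
print "stored: @{ $db{committed} }\n";
print "skipped: @skipped\n";
```

There is no equivalent of rollback() for a parser that has died partway
through a record, which is why the sketch only covers the store step.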

	-hilmar
-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
===========================================================


From Russell.Smithies at agresearch.co.nz  Mon Apr 28 21:15:16 2008
From: Russell.Smithies at agresearch.co.nz (Smithies, Russell)
Date: Tue, 29 Apr 2008 09:15:16 +1200
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
	<FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
Message-ID: <D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>

I thought it was a bit of a hack but I guess if someone else is doing it
too, it can't be all bad  :-)

It looks like you can combine your drawing methods like this:
(I'm sure Lincoln will tell us this is bad but it seems to work ok)
------------------------------------------------------------------------
-------------

#!perl -w
use GD::Graph::lines;
use GD::Graph::colour;
use GD::Graph::Data;

use Bio::Graphics;
use Bio::SeqFeature::Generic;

# create and draw on a graphics panel
my $panel = Bio::Graphics::Panel->new(
                                      -length => 500,
                                      -width  => 500
                                     );
my $track = $panel->add_track(
                              -glyph => 'generic',
                              -label => 1
                             );

# create and add a few features
for($i = 100; $i < 500; $i+= 100){
  my $feature = Bio::SeqFeature::Generic->new(
                                              -display_name => "feature: $i",
                                              -score        => $i,
                                              -start        => $i,
                                              -end          => $i + 100
                                             );
  $track->add_feature($feature);
}


# create and draw the graph
my @data = (
    ["1st","2nd","3rd","4th","5th","6th","7th", "8th", "9th"],
    [    1,    2,    5,    6,    3,  1.5,    1,     3,     4],
    [ sort { $a <=> $b } (1, 2, 5, 6, 3, 1.5, 1, 3, 4) ]
);
my $graph = GD::Graph::lines->new(500, 300);

$graph->set(
      x_label           => 'X Label',
      y_label           => 'Y label',
      title             => 'Some simple graph',
      y_max_value       => 8,
      y_tick_number     => 8,
      y_label_skip      => 2
) or die $graph->error;

$graph->set( dclrs => [ qw( green blue black red pink) ] );

my $gd = $graph->plot(\@data) or die $graph->error;

# combine the two images
my $combined = $panel->gd($gd);

open(IMG, '>file.png') or die $!;
binmode IMG;
print IMG $combined->png;

------------------------------------------------------------------------
------------------

> -----Original Message-----
> From: Chris Fields [mailto:cjfields at uiuc.edu]
> Sent: Monday, 28 April 2008 9:54 a.m.
> To: Smithies, Russell
> Cc: sergei ryazansky; bioperl-l at bioperl.org
> Subject: Re: [Bioperl-l] addition of GD::graphic object to
Bio::Graphics
> 
> I think this is how some of the synteny mapping is done using
> SynBrowse (the trapezoids connecting syntenous genes on different
> tracks).
> 
> http://www.gmod.org/wiki/index.php/SynView
> 
> chris
> 
> On Apr 27, 2008, at 4:27 PM, Smithies, Russell wrote:
> 
> > You can get the GD object back from the Bio::Graphics::Panel  then
> > draw
> > on it using GD methods
> >
> > Eg:
> >
> > #create a BioPerl panel
> > my $panel = Bio::Graphics::Panel->new(
> >                                     -length   => 600,
> >                                     -width    => 800,
> >                                     -bgcolor  => 'white'
> >                                     );
> > # add your features
> > my $feature = Bio::SeqFeature::Generic->new( -start => 1, -end => 200 );
> > $panel->add_track($feature, -glyph   =>   'segments',
> >                                     -label   =>   0,
> >                                     -height  =>   30,
> >                                     -bgcolor  =>  'red',
> >                                     -fgcolor  => 'red'
> >                                      );
> >
> > # grab the GD thingy
> > my $gd = $panel->gd;
> >
> > #create a color - not sure if there's a better way?
> > my $black = $gd->colorAllocate(0,0,0);
> >
> > #draw on your GD thingy (gdSmallFont needs "use GD;")
> > $gd->line(10, 10, $panel->width - 10, 10, $black);
> > $gd->string(gdSmallFont, 20, 10, 'test', $black);
> >
> > # print it as normal
> > print $panel->png;
> >
> >
> >
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org
> > [mailto:bioperl-l-bounces at lists.open-
> >> bio.org] On Behalf Of sergei ryazansky
> >> Sent: Monday, 28 April 2008 2:05 a.m.
> >> To: bioperl-l at bioperl.org
> >> Subject: [Bioperl-l] addition of GD::graphic object to
Bio::Graphics
> >>
> >> Hi all,
> >>
> >> is it possible to add a GD::graphic object (chart) to Bio::Graphics
> > panel
> >> to obtain a file with image of both the chart and bioseq object?
> >>
> 
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
> 
> 

=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================


From lincoln.stein at gmail.com  Mon Apr 28 21:33:19 2008
From: lincoln.stein at gmail.com (Lincoln Stein)
Date: Mon, 28 Apr 2008 17:33:19 -0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
In-Reply-To: <D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
	<FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
	<D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>
Message-ID: <6dce9a0b0804281433i697cda2fo2c47ce59010d0858@mail.gmail.com>

Hi,

No, I'm perfectly happy with combining images like this. It is part of what
I intended.

Another idea would be to use the Image glyph to embed graphs at particular
genomic locations in the panel. Right now the glyph is designed in the
expectation that the image passed to it is sitting on the file system (or a
web URL), but it would be easy to modify it so that a callback can generate
the GD on the fly, by using, for example GD::Graph.

Lincoln


-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING,
PLEASE CONTACT MY ASSISTANT,
SANDRA MICHELSEN, AT michelse at cshl.edu


From dr.hogart at gmail.com  Tue Apr 29 07:56:24 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Tue, 29 Apr 2008 11:56:24 +0400
Subject: [Bioperl-l] addition of GD::graphic object to Bio::Graphics
References: <op.t99vyoejavnppr@hogart.hackers>
	<D5DBA313349A4B458528BE63B387F36C06C76BC7@imail.agresearch.co.nz>
	<FC79BC1A-B547-4A15-9B9B-D633710526FB@uiuc.edu>
	<D5DBA313349A4B458528BE63B387F36C06D0D0CF@imail.agresearch.co.nz>
Message-ID: <op.uac4caojavnppr@hogart.img.ras.ru>

Thank you very much! It is exactly that I was looking for.



From d.gatherer at mrcvu.gla.ac.uk  Tue Apr 29 12:21:05 2008
From: d.gatherer at mrcvu.gla.ac.uk (Derek Gatherer)
Date: Tue, 29 Apr 2008 13:21:05 +0100
Subject: [Bioperl-l] translate() oddities
Message-ID: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>

Hi

I thought I'd better run this by the community before I embarrass 
myself on Bugzilla.  It seems like a clear bug to me.  I'm running 
Bioperl 1.5.0 on RedHat.

For a test input:

 >test
ATGATGATGATGATGTGA

the following code is fine.

while((my $seqobj = $seq_in->next_seq()))
{
     print "\n".$seqobj->display_id;
     my $len  = $seqobj->length();
     print " length: $len";
     my $frame1_obj = $seqobj->translate();
     my $f1_prot = $frame1_obj->seq();
     print "\n$f1_prot";
}

Output:

test length: 18
MMMMM*

But if I want to change the frame as specified in the BioPerl 
tutorial, by using:

my $frame1_obj = $seqobj->translate(frame => 1); # which should now 
give frame 2, I get:

test length: 18
MMMMM-frame

The frame is unchanged and the text "-frame" is tacked on the end of 
the output.  The same occurs with translate(frame => 2).

Any ideas?  Can something as fundamental as translate() really be 
bugged?  or am I guilty of some particularly heinous syntax error?

Cheers
Derek


From tristan.lefebure at gmail.com  Tue Apr 29 13:58:21 2008
From: tristan.lefebure at gmail.com (Tristan Lefebure)
Date: Tue, 29 Apr 2008 09:58:21 -0400
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
Message-ID: <200804290958.21548.tristan.lefebure@gmail.com>

Aren't you forgetting the dash?

my $frame1_obj = $seqobj->translate(-frame => 1)


On Tuesday 29 April 2008 08:21:05 Derek Gatherer wrote:
> my $frame1_obj = $seqobj->translate(frame => 1)


-Tristan


From d.gatherer at mrcvu.gla.ac.uk  Tue Apr 29 14:05:03 2008
From: d.gatherer at mrcvu.gla.ac.uk (Derek Gatherer)
Date: Tue, 29 Apr 2008 15:05:03 +0100
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <481726BF.1060609@bms.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
	<481726BF.1060609@bms.com>
Message-ID: <E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>

Thanks Stefan

Actually, there was a typo in my message, I did use -frame => 
1.  However, the problem disappears on upgrading from 1.5.0 to 1.5.2.

So not a bug anymore.

Cheers
Derek

At 14:46 29/04/2008, Stefan Kirov wrote:
>my $frame1_obj = $seqobj->translate(-frame => 1);
>not
>my $frame1_obj = $seqobj->translate(frame => 1);
>Stefan
>


From l.douchy at gmail.com  Tue Apr 29 14:16:40 2008
From: l.douchy at gmail.com (Laurent DOUCHY)
Date: Tue, 29 Apr 2008 16:16:40 +0200
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <200804290958.21548.tristan.lefebure@gmail.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
	<200804290958.21548.tristan.lefebure@gmail.com>
Message-ID: <2fb209dd0804290716x36e403dek55978dc4f54e34ff@mail.gmail.com>

Hello,

I worked around this issue in a Bio::SeqIO script with the following line:

my $sequence = $seq->translate('*', 'X', '0', '1', '0', '0', '0', '0')->seq;

The third parameter sets the frame.

I hope this helps.

laurent.

On Tue, Apr 29, 2008 at 3:58 PM, Tristan Lefebure <
tristan.lefebure at gmail.com> wrote:

> Aren't you forgetting the dash?
>
> my $frame1_obj = $seqobj->translate(-frame => 1)
>
>
> On Tuesday 29 April 2008 08:21:05 Derek Gatherer wrote:
> > my $frame1_obj = $seqobj->translate(frame => 1)
>
>
>
> -Tristan


From roy.chaudhuri at gmail.com  Tue Apr 29 14:27:10 2008
From: roy.chaudhuri at gmail.com (Roy Chaudhuri)
Date: Tue, 29 Apr 2008 15:27:10 +0100
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>	<481726BF.1060609@bms.com>
	<E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>
Message-ID: <4817303E.1040903@gmail.com>

Spent two minutes looking at this, so may as well chip in with what I 
discovered even though you solved your problem.

This "bug" comes about because in version 1.5.1 and earlier, the 
arguments to translate were a simple list, with the first argument the 
terminator (defaults to "*"). Your old version therefore assumed that 
you wanted to translate the stop codon to "-frame". Amusingly given your 
typo, if you miss the hyphen off the frame argument in version 1.5.2 it 
reverts to the old interface and you end up with the output 
"MMMMMframe". The moral of the story is of course to read the docs 
relevant to the version you are using.
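
A small sketch of the mechanism described above, using a made-up
toy_translate function rather than the real Bio::PrimarySeqI code. The
five-codon table, the $named_ok switch and the leading-hyphen test are
assumptions chosen only to reproduce the behaviour seen in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Deliberately tiny codon table; anything else becomes the "unknown" symbol.
my %CODON = (ATG => 'M', TGT => 'C', GAT => 'D', GTG => 'V', TGA => 'stop');

# Hypothetical stand-in for translate(), NOT the BioPerl implementation.
# $named_ok = 1 mimics a release that accepts -name => value pairs;
# $named_ok = 0 mimics an older release that only knows positional args.
sub toy_translate {
    my ($dna, $named_ok, @args) = @_;
    my ($terminator, $unknown, $frame);
    if ($named_ok && @args && $args[0] =~ /^-[a-z]+/i) {
        my %opt = @args;
        ($terminator, $unknown, $frame) = @opt{qw(-terminator -unknown -frame)};
    }
    else {
        ($terminator, $unknown, $frame) = @args;    # positional order
    }
    $terminator = '*' unless defined $terminator;
    $unknown    = 'X' unless defined $unknown;
    $frame      = 0   unless defined $frame;

    my $protein = '';
    for (my $i = $frame; $i + 3 <= length $dna; $i += 3) {
        my $aa = $CODON{ substr($dna, $i, 3) };
        $protein .= !defined $aa ? $unknown : $aa eq 'stop' ? $terminator : $aa;
    }
    return $protein;
}

my $dna = 'ATGATGATGATGATGTGA';
print toy_translate($dna, 0), "\n";                 # MMMMM*
print toy_translate($dna, 0, -frame => 1), "\n";    # MMMMM-frame (old style)
print toy_translate($dna, 1,  frame => 1), "\n";    # MMMMMframe  (hyphen missing)
print toy_translate($dna, 1, -frame => 1), "\n";    # ****C       (frame shifted)
```

In the second and third calls the frame is never set. The first argument
is taken as the terminator and the 1 as the unknown-residue symbol.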

Roy.
--
Dr. Roy Chaudhuri
Department of Veterinary Medicine
University of Cambridge, U.K.



From stefan.kirov at bms.com  Tue Apr 29 13:46:39 2008
From: stefan.kirov at bms.com (Stefan Kirov)
Date: Tue, 29 Apr 2008 09:46:39 -0400
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
Message-ID: <481726BF.1060609@bms.com>

my $frame1_obj = $seqobj->translate(-frame => 1);
not
my $frame1_obj = $seqobj->translate(frame => 1);
Stefan

Derek Gatherer wrote:
> Hi
>
> I thought I'd better run this by the community before I embarrass
> myself on Bugzilla.  It seems like a clear bug to me.  I'm running
> Bioperl 1.5.0 on RedHat.
>
> For a test input:
>
> >test
> ATGATGATGATGATGTGA
>
> the following code is fine.
>
> while((my $seqobj = $seq_in->next_seq()))
> {
>     print "\n".$seqobj->display_id;
>     my $len  = $seqobj->length();
>     print " length: $len";
>     my $frame1_obj = $seqobj->translate();
>     my $f1_prot = $frame1_obj->seq();
>     print "\n$f1_prot";
> }
>
> Output:
>
> test length: 18
> MMMMM*
>
> But if I want to change the frame as specified in the BioPerl
> tutorial, by using:
>
> my $frame1_obj = $seqobj->translate(frame => 1); # which should now
> give frame 2, I get:
>
> test length: 18
> MMMMM-frame
>
> The frame is unchanged and the text "-frame" is tacked on the end of
> the output.  The same occurs with translate(frame => 2).
>
> Any ideas?  Can something as fundamental as translate() really be
> bugged?  or am I guilty of some particularly heinous syntax error?
>
> Cheers
> Derek
>


From cjfields at uiuc.edu  Tue Apr 29 15:00:00 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 29 Apr 2008 10:00:00 -0500
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <4817303E.1040903@gmail.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>	<481726BF.1060609@bms.com>
	<E1JqqS7-0001uA-00@hillend.cent.gla.ac.uk>
	<4817303E.1040903@gmail.com>
Message-ID: <36045A08-AEA8-4639-A384-1DC53B5DC129@uiuc.edu>

Yes, the interface changed somewhat after 1.5.1, mainly to accept named
parameters.  I think a few methods do this now, since passing in lists
of more than two args, and undef'ing the ones you don't want set, gets
confusing.
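
A toy sketch of why the old positional interface produced "MMMMM-frame".
The two subs below are stand-ins written for illustration only, not the
BioPerl code; the codon table is deliberately tiny, and the argument
order (terminator, unknown, frame) is an assumption based on the
description in this thread.

```perl
use strict;
use warnings;

# Minimal codon table, just enough for the test sequence (illustrative only).
my %CODON = (ATG => 'M', TGA => '*', TGT => 'C');

# Shared worker: translate $dna starting at offset $frame.
sub _translate {
    my ($dna, $terminator, $frame) = @_;
    my $protein = '';
    for (my $i = $frame; $i + 3 <= length $dna; $i += 3) {
        my $aa = $CODON{ substr($dna, $i, 3) } // 'X';
        $protein .= $aa eq '*' ? $terminator : $aa;
    }
    return $protein;
}

# Old style: a plain positional list whose first argument is the terminator.
sub translate_positional {
    my ($dna, $terminator, $unknown, $frame) = @_;
    return _translate($dna, $terminator // '*', $frame // 0);
}

# New style: named, hyphen-prefixed parameters.
sub translate_named {
    my ($dna, %args) = @_;
    return _translate($dna, $args{-terminator} // '*', $args{-frame} // 0);
}

my $dna = 'ATGATGATGATGATGTGA';
print translate_positional($dna), "\n";               # MMMMM*
print translate_positional($dna, -frame => 1), "\n";  # MMMMM-frame
print translate_named($dna, -frame => 1), "\n";       # ****C
```

In the positional call the string "-frame" lands in the terminator slot
and the 1 in the slot after it, so the frame never changes.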

chris

On Apr 29, 2008, at 9:27 AM, Roy Chaudhuri wrote:

> Spent two minutes looking at this, so may as well chip in with what  
> I discovered even though you solved your problem.
>
> This "bug" comes about because in version 1.5.1 and earlier, the  
> arguments to translate were a simple list, with the first argument  
> the terminator (defaults to "*"). Your old version therefore assumed  
> that you wanted to translate the stop codon to "-frame". Amusingly  
> given your typo, if you miss the hyphen off the frame argument in  
> version 1.5.2 it reverts to the old interface and you end up with  
> the output "MMMMMframe". The moral of the story is of course to read  
> the docs relevant to the version you are using.
>
> Roy.
> --
> Dr. Roy Chaudhuri
> Department of Veterinary Medicine
> University of Cambridge, U.K.
>
> Derek Gatherer wrote:
>> Thanks Stefan
>> Actually, there was a typo in my message, I did use -frame => 1.   
>> However, the problem disappears on upgrading from 1.5.0 to 1.5.2.
>> So not a bug anymore.
>> Cheers
>> Derek
>> At 14:46 29/04/2008, Stefan Kirov wrote:
>>> my $frame1_obj = $seqobj->translate(-frame => 1);
>>> not
>>> my $frame1_obj = $seqobj->translate(frame => 1);
>>> Stefan
>>>
>>> Derek Gatherer wrote:
>>>> Hi
>>>>
>>>> I thought I'd better run this by the community before I embarrass
>>>> myself on Bugzilla.  It seems like a clear bug to me.  I'm running
>>>> Bioperl 1.5.0 on RedHat.
>>>>
>>>> For a test input:
>>>>
>>>> >test
>>>> ATGATGATGATGATGTGA
>>>>
>>>> the following code is fine.
>>>>
>>>> while((my $seqobj = $seq_in->next_seq()))
>>>> {
>>>>    print "\n".$seqobj->display_id;
>>>>    my $len  = $seqobj->length();
>>>>    print " length: $len";
>>>>    my $frame1_obj = $seqobj->translate();
>>>>    my $f1_prot = $frame1_obj->seq();
>>>>    print "\n$f1_prot";
>>>> }
>>>>
>>>> Output:
>>>>
>>>> test length: 18
>>>> MMMMM*
>>>>
>>>> But if I want to change the frame as specified in the BioPerl
>>>> tutorial, by using:
>>>>
>>>> my $frame1_obj = $seqobj->translate(frame => 1); # which should now
>>>> give frame 2, I get:
>>>>
>>>> test length: 18
>>>> MMMMM-frame
>>>>
>>>> The frame is unchanged and the text "-frame" is tacked on the end  
>>>> of
>>>> the output.  The same occurs with translate(frame => 2).
>>>>
>>>> Any ideas?  Can something as fundamental as translate() really be
>>>> bugged?  or am I guilty of some particularly heinous syntax error?
>>>>
>>>> Cheers
>>>> Derek
>>>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From cjfields at uiuc.edu  Tue Apr 29 15:07:30 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Tue, 29 Apr 2008 10:07:30 -0500
Subject: [Bioperl-l] translate() oddities
In-Reply-To: <481726BF.1060609@bms.com>
References: <E1Jqopm-0006xM-00@hillend.cent.gla.ac.uk>
	<481726BF.1060609@bms.com>
Message-ID: <18DB95FB-52B9-4091-ACEE-996891F8A5AE@uiuc.edu>

As an aside, I've been playing around with perl6 (Rakudo) for a bit  
now.  Parameter-like passing (using autoaccessors and other means)  
will be added in soon, so you will be able to do this:

my $seqobj = Seq.new(seq => 'ATGATGATGATGATGTGA', alphabet => 'dna');
my $protobj = $seqobj.translate(frame => 1);

Yes, I'm a geek. ; >

chris

On Apr 29, 2008, at 8:46 AM, Stefan Kirov wrote:

> my $frame1_obj = $seqobj->translate(-frame => 1);
> not
> my $frame1_obj = $seqobj->translate(frame => 1);
> Stefan
>
> Derek Gatherer wrote:
>> Hi
>>
>> I thought I'd better run this by the community before I embarrass
>> myself on Bugzilla.  It seems like a clear bug to me.  I'm running
>> Bioperl 1.5.0 on RedHat.
>>
>> For a test input:
>>
>> >test
>> ATGATGATGATGATGTGA
>>
>> the following code is fine.
>>
>> while((my $seqobj = $seq_in->next_seq()))
>> {
>>    print "\n".$seqobj->display_id;
>>    my $len  = $seqobj->length();
>>    print " length: $len";
>>    my $frame1_obj = $seqobj->translate();
>>    my $f1_prot = $frame1_obj->seq();
>>    print "\n$f1_prot";
>> }
>>
>> Output:
>>
>> test length: 18
>> MMMMM*
>>
>> But if I want to change the frame as specified in the BioPerl
>> tutorial, by using:
>>
>> my $frame1_obj = $seqobj->translate(frame => 1); # which should now
>> give frame 2, I get:
>>
>> test length: 18
>> MMMMM-frame
>>
>> The frame is unchanged and the text "-frame" is tacked on the end of
>> the output.  The same occurs with translate(frame => 2).
>>
>> Any ideas?  Can something as fundamental as translate() really be
>> bugged?  or am I guilty of some particularly heinous syntax error?
>>
>> Cheers
>> Derek
>>

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From dr.hogart at gmail.com  Tue Apr 29 15:57:51 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Tue, 29 Apr 2008 19:57:51 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
Message-ID: <op.uadqmpg8avnppr@hogart.img.ras.ru>

Hi all!

I am trying to run a TCoffee alignment through the
Bio::Tools::Run::Alignment::TCoffee wrapper, called from a subroutine in
my script. The subroutine itself works fine, but it is not the only one:
there are a lot of others in the script. The problem is that when the
script finishes executing the tcoffee subroutine (NB: successfully), the
compilation of the rest of the script is also interrupted. It seems that
the tcoffee program itself interrupts the Perl compilation. Is there a
way around this problem?

-- 


From darin.london at duke.edu  Tue Apr 29 16:49:53 2008
From: darin.london at duke.edu (darin.london at duke.edu)
Date: Tue, 29 Apr 2008 12:49:53 -0400
Subject: [Bioperl-l] BOSC 2008 Announcement and Call For Submissions
Message-ID: <200804291650.m3TGnr0H020814@tenero.duhs.duke.edu>


BOSC 2008 Call for Abstracts Reminder

The 9th annual Bioinformatics Open Source Conference (BOSC 2008) will take place in Toronto, Ontario, Canada, as one of several Special Interest Group (SIG) meetings occurring in conjunction with the 16th annual Intelligent Systems for Molecular Biology Conference (ISMB 2008).

This is a reminder to submit your proposals for talks to the BOSC submission system before May 11.

Submission Process:
All abstracts must be submitted through our Open Conference Systems site (http://events.open-bio.org/BOSC2008/openconf.php).
The form will ask for a short Abstract Text to be pasted into it, and a full paper.  The short Abstract Text should be a summary, while the longer abstract should provide more details, including the open-source license requirement details.
Full-length abstracts are limited to one page with one inch (2.5 cm) margins on the top, sides, and bottom.  The full-length abstract should include the title, authors, and affiliations.  We prefer your abstract to be in PDF format, although plain t

Important Dates:
May 11: Abstract submission deadline.
June 2: Notification of accepted talks.
June 4: Early registration discount cut-off.
July 18-19: BOSC 2008!

We hope to see you at BOSC 2008!

Kam Dahlquist and Darin London
BOSC 2008 Co-organizers

			 
From bix at sendu.me.uk  Tue Apr 29 16:54:41 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Tue, 29 Apr 2008 17:54:41 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uadqmpg8avnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
Message-ID: <481752D1.7010904@sendu.me.uk>

sergei ryazansky wrote:
> I am trying to perform TCoffe aligment by 
> Bio::Tools::Run::Alignment::TCoffee wrapper as subroutine into the 
> script. This subroutine works fine, but it is not single subroutine - 
> there are a lot of other ones in the script. The problem is when 
> compilation of script finish execution (nb! successful execution) of 
> tcoffee subroutine the compiliation of the end of the script also 
> interrupted. It seems that the tcoffee program itself induce 
> interraption of perl compilation. Is it possible to pass this problem?

You'll have to supply us with a minimal version of the script and the 
complete error message.


From dr.hogart at gmail.com  Wed Apr 30 11:24:35 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 15:24:35 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
Message-ID: <op.uae8m9tzavnppr@hogart.img.ras.ru>

On Tue, 29 Apr 2008 19:57:51 +0400, sergei ryazansky <dr.hogart at gmail.com>  
wrote:

> Hi all!
>
> I am trying to perform TCoffe aligment by  
> Bio::Tools::Run::Alignment::TCoffee wrapper as subroutine into the  
> script. This subroutine works fine, but it is not single subroutine -  
> there are a lot of other ones in the script. The problem is when  
> compilation of script finish execution (nb! successful execution) of  
> tcoffee subroutine the compiliation of the end of the script also  
> interrupted. It seems that the tcoffee program itself induce  
> interraption of perl compilation. Is it possible to pass this problem?
>


My subroutine is as follows:

sub align {
	my $file=shift @_;
	my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' => 'fasta',  
'outfile' => 'temp_align.out');
	my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
	my $aln=$factory->align ($file);
	open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
	return @temp_file;
}

This subroutine is called by the following command:

my @align_fa = align($inputfile_align);

After successful execution of this subroutine (accompanied by the
corresponding messages in the terminal window), execution of the
remainder of the script terminates without any error messages.
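
Not necessarily the cause of the termination, but the file read in the
subroutine above uses a bareword filehandle and never checks whether
open() succeeded, so a missing output file would silently yield an empty
list. A minimal sketch of a checked read follows; read_lines is a
hypothetical helper name, and the temporary file only stands in for
temp_align.out.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a whole file into a list of lines, dying with a clear message on failure.
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open '$path' for reading: $!\n";
    my @lines = <$fh>;
    close($fh) or die "Cannot close '$path': $!\n";
    return @lines;
}

# Demonstration on a throwaway file standing in for temp_align.out.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} ">seq1\nATG-ATG\n>seq2\nATGCATG\n";
close $tmp_fh;

my @lines = read_lines($tmp_name);
print scalar(@lines), " lines read\n";    # 4 lines read

# A missing file now produces an error instead of an empty list.
my @none = eval { read_lines("$tmp_name.does_not_exist") };
print "error: $@" if $@;
```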

-- 


From bix at sendu.me.uk  Wed Apr 30 12:47:17 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 30 Apr 2008 13:47:17 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uae8m9tzavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
Message-ID: <48186A55.4030406@sendu.me.uk>

sergei ryazansky wrote:
> My subroutine is following:
> 
> sub align {
>     my $file=shift @_;
>     my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' => 
> 'fasta', 'outfile' => 'temp_align.out');
>     my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
>     my $aln=$factory->align ($file);
>     open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
>     return @temp_file;
> }
> 
> This subroutine is called by the following command:
> 
> my @align_fa = align($inputfile_align);
> 
> After successful execution of this subroutine (accompaning with the 
> corresponding messages on the terminal window) the execution of 
> remainder script is terminated without any error messages.

The problem lies somewhere within the rest of your script, so we have to 
see it if you want help.

Why are you using Bio::Tools::Run::Alignment::TCoffee at all if you 
don't make use of the resulting alignment object? A system call might 
make more sense given what you're doing. The beauty of 
Bio::Tools::Run::Alignment::TCoffee is that you don't have to parse the 
result file (temp_align.out) yourself.


From dr.hogart at gmail.com  Wed Apr 30 13:36:58 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 17:36:58 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
Message-ID: <op.uaferwytavnppr@hogart.img.ras.ru>

On Wed, 30 Apr 2008 16:47:17 +0400, Sendu Bala <bix at sendu.me.uk> wrote:

> sergei ryazansky wrote:
>> My subroutine is following:
>>  sub align {
>>     my $file=shift @_;
>>     my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' =>  
>> 'fasta', 'outfile' => 'temp_align.out');
>>     my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
>>     my $aln=$factory->align ($file);
>>     open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
>>     return @temp_file;
>> }
>>  This subroutine is called by the following command:
>>  my @align_fa = align($inputfile_align);
>>  After successful execution of this subroutine (accompaning with the  
>> corresponding messages on the terminal window) the execution of  
>> remainder script is terminated without any error messages.
>
> The problem lies somewhere within the rest of your script, so we have to  
> see it if you want help.
>
> Why are you using Bio::Tools::Run::Alignment::TCoffee at all if you  
> don't make use of the resulting alignment object? A system call might  
> make more sense given what you're doing. The beauty of  
> Bio::Tools::Run::Alignment::TCoffee is that you don't have to parse the  
> result file (temp_align.out) yourself.

The rest of the script, IMHO, is OK, because without this sub it works
fine. Maybe the problem lies in TCoffee itself?

One of the features of the script is to estimate the number of nt
changes at each position in a set of similar sequences, compared with
the consensus sequence. To do this it is necessary to obtain a multiple
alignment: the result of the TCoffee alignment goes to another
subroutine, which estimates the level of changes. Of course, I don't
think this is the best approach; most probably there are a lot of
better ways to do it. But for my present purposes it is OK.
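
The per-position count described above can be sketched in plain Perl as
follows. This is only an illustration under stated assumptions: the
aligned sequences are hard-coded, changes_per_column is a hypothetical
helper name, the consensus is a simple majority vote per column, and a
gap counts as a change.

```perl
use strict;
use warnings;

# Takes aligned sequences of equal length and returns the majority
# consensus string plus an array ref of per-column mismatch counts.
sub changes_per_column {
    my @seqs = map { uc($_) } @_;
    my $len  = length $seqs[0];
    my $consensus = '';
    my @changes;
    for my $col (0 .. $len - 1) {
        my %count;
        $count{ substr($_, $col, 1) }++ for @seqs;
        # Majority residue; ties are broken alphabetically for a stable result.
        my ($top) = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
        $consensus .= $top;
        push @changes, scalar(@seqs) - $count{$top};
    }
    return ($consensus, \@changes);
}

my ($cons, $changes) = changes_per_column(qw(ATGCA ATGTA ACGCA ATG-A));
print "$cons\n";        # ATGCA
print "@$changes\n";    # 0 1 0 2 0
```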

-- 


From avilella at gmail.com  Wed Apr 30 14:16:56 2008
From: avilella at gmail.com (Albert Vilella)
Date: Wed, 30 Apr 2008 15:16:56 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uaferwytavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru> <48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
Message-ID: <358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>

Hi Sergei,

Can you try to isolate this call with a simpler example to see if it still
fails? When you say that the problems are in the compilation, do you mean
that the interpreter won't even compile or that it fails during execution?
Have you checked that you have all the dependencies right?

Cheers,

    Albert.

On Wed, Apr 30, 2008 at 2:36 PM, sergei ryazansky <dr.hogart at gmail.com>
wrote:

> On Wed, 30 Apr 2008 16:47:17 +0400, Sendu Bala <bix at sendu.me.uk> wrote:
>
>  sergei ryazansky wrote:
> >
> > > My subroutine is following:
> > >  sub align {
> > >    my $file=shift @_;
> > >    my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' =>
> > > 'fasta', 'outfile' => 'temp_align.out');
> > >    my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
> > >    my $aln=$factory->align ($file);
> > >    open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
> > >    return @temp_file;
> > > }
> > >  This subroutine is called by the following command:
> > >  my @align_fa = align($inputfile_align);
> > >  After successful execution of this subroutine (accompaning with the
> > > corresponding messages on the terminal window) the execution of remainder
> > > script is terminated without any error messages.
> > >
> >
> > The problem lies somewhere within the rest of your script, so we have to
> > see it if you want help.
> >
> > Why are you using Bio::Tools::Run::Alignment::TCoffee at all if you
> > don't make use of the resulting alignment object? A system call might make
> > more sense given what you're doing. The beauty of
> > Bio::Tools::Run::Alignment::TCoffee is that you don't have to parse the
> > result file (temp_align.out) yourself.
> >
>
> The rest of script,imho, is ok, because without this sub it is work fine.
> May be problem lies into the TCoffee itself?
>
> One of the feature of script is to estimate the quantity of nt changes in
> each position in the different similar sequences in comparing with consensus
> sequences. To perform this it is nesseccary to obtain the multiply
> alignment: the result of TCoffee alignment goes to another subroutine, that
> estemated the level of changes. Of course, I dont think that this way is the
> best approach, most probably there are a lot of the better ways to do it.
> But for my today purposes it is ok.
>


From bix at sendu.me.uk  Wed Apr 30 14:22:01 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 30 Apr 2008 15:22:01 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uaferwytavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>	<op.uae8m9tzavnppr@hogart.img.ras.ru>	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
Message-ID: <48188089.8000300@sendu.me.uk>

sergei ryazansky wrote:
> On Wed, 30 Apr 2008 16:47:17 +0400, Sendu Bala <bix at sendu.me.uk> wrote:
> 
>> sergei ryazansky wrote:
>>> My subroutine is following:
>>>  sub align {
>>>     my $file=shift @_;
>>>     my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' => 
>>> 'fasta', 'outfile' => 'temp_align.out');
>>>     my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
>>>     my $aln=$factory->align ($file);
>>>     open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
>>>     return @temp_file;
>>> }
>>>  This subroutine is called by the following command:
>>>  my @align_fa = align($inputfile_align);
>>>  After successful execution of this subroutine (accompaning with the 
>>> corresponding messages on the terminal window) the execution of 
>>> remainder script is terminated without any error messages.
>>
>> The problem lies somewhere within the rest of your script, so we have 
>> to see it if you want help.
> 
> The rest of script,imho, is ok, because without this sub it is work 
> fine. May be problem lies into the TCoffee itself?

I've run your subroutine in a simple script of my own and it doesn't 
cause script termination. Again, the problem lies elsewhere in your 
script. Supply it or it is impossible for anyone to help you.


From Sebastien.Moretti at unil.ch  Wed Apr 30 14:06:28 2008
From: Sebastien.Moretti at unil.ch (Sebastien MORETTI)
Date: Wed, 30 Apr 2008 16:06:28 +0200
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uaferwytavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>	<op.uae8m9tzavnppr@hogart.img.ras.ru>	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
Message-ID: <48187CE4.8030606@unil.ch>

>>> My subroutine is following:
>>>  sub align {
>>>     my $file=shift @_;
>>>     my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' => 
>>> 'fasta', 'outfile' => 'temp_align.out');
>>>     my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
>>>     my $aln=$factory->align ($file);
>>>     open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
>>>     return @temp_file;
>>> }
>>>  This subroutine is called by the following command:
>>>  my @align_fa = align($inputfile_align);
>>>  After successful execution of this subroutine (accompaning with the 
>>> corresponding messages on the terminal window) the execution of 
>>> remainder script is terminated without any error messages.
>>
>> The problem lies somewhere within the rest of your script, so we have 
>> to see it if you want help.
>>
>> Why are you using Bio::Tools::Run::Alignment::TCoffee at all if you 
>> don't make use of the resulting alignment object? A system call might 
>> make more sense given what you're doing. The beauty of 
>> Bio::Tools::Run::Alignment::TCoffee is that you don't have to parse 
>> the result file (temp_align.out) yourself.
> 
> The rest of script,imho, is ok, because without this sub it is work 
> fine. May be problem lies into the TCoffee itself?
> 
> One of the feature of script is to estimate the quantity of nt changes 
> in each position in the different similar sequences in comparing with 
> consensus sequences. To perform this it is nesseccary to obtain the 
> multiply alignment: the result of TCoffee alignment goes to another 
> subroutine, that estemated the level of changes. Of course, I dont think 
> that this way is the best approach, most probably there are a lot of the 
> better ways to do it. But for my today purposes it is ok.

Have you tried running the tcoffee command that bioperl calls directly
on the command line?
That would show whether the problem is with tcoffee itself or with the
tcoffee release that bioperl has to use.

-- 
Sébastien Moretti


From dr.hogart at gmail.com  Wed Apr 30 14:54:59 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 18:54:59 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
Message-ID: <op.uafidxitavnppr@hogart.img.ras.ru>

Hi Albert,

The isolated call executes without any problem, so the code is
absolutely correct. The problem arises when this sub is executed within
the whole script: after successful execution of the TCoffee alignment,
execution of the rest of the script is terminated. The whole code is
very big (~500 lines), so for simplicity let's imagine the scheme of the
script as follows:
sub1;
sub2;
sub3;
sub align;  # TCoffee alignment;
sub4;
sub5;

Each sub (subroutine) is independent of the other subs. The order of
script execution is 1,2,3,align,4,5. But after align has executed, the
rest of the subs (4 and 5) are never executed. The script without
sub align {} successfully executes sub 4 and sub 5. So, I mean that the
interpreter won't compile sub 4 and 5 if sub align is placed before them.

On Wed, 30 Apr 2008 18:16:56 +0400, Albert Vilella <avilella at gmail.com>  
wrote:

> Hi Sergei,
>
> Can you try to isolate this call with a simpler example to see if it  
> still
> fails? When you say that the problems are in the compilation, do you mean
> that the interpreter won't even compile or that it fails during  
> execution?
> Have you checked that you have all the dependencies right?
>
> Cheers,
>
>     Albert.
>
> On Wed, Apr 30, 2008 at 2:36 PM, sergei ryazansky <dr.hogart at gmail.com>
> wrote:
>
>> On Wed, 30 Apr 2008 16:47:17 +0400, Sendu Bala <bix at sendu.me.uk> wrote:
>>
>>  sergei ryazansky wrote:
>> >
>> > > My subroutine is following:
>> > >  sub align {
>> > >    my $file=shift @_;
>> > >    my @params = ('ktuple' => 2,'matrix' => 'BLOSUM', 'output' =>
>> > > 'fasta', 'outfile' => 'temp_align.out');
>> > >    my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);
>> > >    my $aln=$factory->align ($file);
>> > >    open (fy,'temp_align.out'); my @temp_file=<fy>; close fy;
>> > >    return @temp_file;
>> > > }
>> > >  This subroutine is called by the following command:
>> > >  my @align_fa = align($inputfile_align);
>> > >  After successful execution of this subroutine (accompaning with the
>> > > corresponding messages on the terminal window) the execution of  
>> remainder
>> > > script is terminated without any error messages.
>> > >
>> >
>> > The problem lies somewhere within the rest of your script, so we have  
>> to
>> > see it if you want help.
>> >
>> > Why are you using Bio::Tools::Run::Alignment::TCoffee at all if you
>> > don't make use of the resulting alignment object? A system call might  
>> make
>> > more sense given what you're doing. The beauty of
>> > Bio::Tools::Run::Alignment::TCoffee is that you don't have to parse  
>> the
>> > result file (temp_align.out) yourself.
>> >
>>
>> The rest of script,imho, is ok, because without this sub it is work  
>> fine.
>> May be problem lies into the TCoffee itself?
>>
>> One of the feature of script is to estimate the quantity of nt changes  
>> in
>> each position in the different similar sequences in comparing with  
>> consensus
>> sequences. To perform this it is nesseccary to obtain the multiply
>> alignment: the result of TCoffee alignment goes to another subroutine,  
>> that
>> estemated the level of changes. Of course, I dont think that this way  
>> is the
>> best approach, most probably there are a lot of the better ways to do  
>> it.
>> But for my today purposes it is ok.
>>




From dr.hogart at gmail.com  Wed Apr 30 15:14:09 2008
From: dr.hogart at gmail.com (sergei ryazansky)
Date: Wed, 30 Apr 2008 19:14:09 +0400
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru> <48187CE4.8030606@unil.ch>
Message-ID: <op.uafi7ytravnppr@hogart.img.ras.ru>

No, I haven't tried.
To tell the truth, I ran into a problem like this earlier. I simply
wanted to align several sets of sequences with the TCoffee Bioperl
package. The script was supposed to pass the sets one after another to
the TCoffee wrapper. But after the alignment of the first set of
sequences, the alignment of the remaining sets was terminated. So it
was necessary to use another "super_script" that called the first
script with different arguments, one call for each set.
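
The "super_script" workaround can be sketched roughly as below: each
job runs in its own perl process, so whatever ends the child cannot end
the parent. run_isolated is a hypothetical helper, and the one-line
child program is only a placeholder for the real alignment script and
its arguments.

```perl
use strict;
use warnings;

# Run one job in a separate perl process and return its exit code.
# $^X is the path of the perl that is currently running.
sub run_isolated {
    my (@cmd_args) = @_;
    system($^X, @cmd_args);
    die "could not start child: $!\n" if $? == -1;
    die sprintf("child killed by signal %d\n", $? & 127) if $? & 127;
    return $? >> 8;
}

# Stand-in jobs: the real call would name the alignment script and one
# set of sequences instead of this inline program.
for my $set (qw(set1 set2 set3)) {
    my $status = run_isolated('-e', 'print "aligning $ARGV[0]\n"; exit 0', $set);
    print "$set finished with exit code $status\n";
}
print "parent still running\n";
```

Checking the returned exit code for each set also shows which run, if
any, ended abnormally.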


> Do you have tried to use the tcoffee command, called via bioperl, as a  
> command line ?


-- 


From bix at sendu.me.uk  Wed Apr 30 15:28:50 2008
From: bix at sendu.me.uk (Sendu Bala)
Date: Wed, 30 Apr 2008 16:28:50 +0100
Subject: [Bioperl-l] alignment by TCoffee as a subroutine
In-Reply-To: <op.uafidxitavnppr@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>	<op.uae8m9tzavnppr@hogart.img.ras.ru>	<48186A55.4030406@sendu.me.uk>	<op.uaferwytavnppr@hogart.img.ras.ru>	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru>
Message-ID: <48189032.20102@sendu.me.uk>

sergei ryazansky wrote:
> Hi Albert,
> 
> The isolated call is executed without any problem, so the code is 
> absolutely correct. The problem arise when this sub executed within the 
> whole script - after successful execution of TCoffee alignment the 
> execution of the rest of script is terminated. The whole code is very 
> big (~500 lines), so for simplicity lets imagine the sheme of script in 
> the following view:
> sub1;
> sub2;
> sub3;
> sub align;  # TCoffe alignment;
> sub4;
> sub5;
> 
> Each sub (subroutine) is independent from the others subs; The order of 
> script execution is 1,2,3,align,4,5. But after the execution of align 
> the execution of the rest of subs (4 and 5) is terminated. The script 
> without sub align {} successfully execute the sub 4 and sub 5. So, I 
> mean that interpreter won't compile sub 4 and 5 if sub align is placed 
> before them.

This has nothing to do with interpreter compilation, which is successful 
if the script runs at all.

What do you do with the output of &align? The thing you are doing with 
that output is most likely the cause of your script terminating, which 
is why &sub4 and &sub5 run when you don't run &align (have no output 
that causes the problem).

If you're not willing to show us your script, here are some simple 
debugging steps you can do yourself:

# don't do anything with the output of align() - does &sub4 still run?

# add some print statements after you call align(), and then after every 
further block of code in your script to see exactly where the script 
terminates

# reduce your script down to a minimal script that shows the problem 
(with the help of the previous step) and show us that
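
The print-statement step above might look something like this sketch.
The helper names (trace, run_step) and the stand-in subroutine bodies
are made up for illustration; the END block is there because it still
runs when something calls exit() unexpectedly, so it reports the final
exit status even if no error message is printed.

```perl
use strict;
use warnings;

$| = 1;       # unbuffered STDOUT so output is not lost on a sudden exit

my @trace;    # trace messages are also kept in memory for inspection

sub trace {
    my ($msg) = @_;
    push @trace, $msg;
    print STDERR "[trace] $msg\n";
}

# Run a named step, catching die() so the failure point is reported.
sub run_step {
    my ($name, $code) = @_;
    trace("entering $name");
    my @result = eval { $code->() };
    if ($@) {
        chomp(my $err = $@);
        trace("$name died: $err");
        return;
    }
    trace("left $name");
    return @result;
}

# Runs at program end, including after an explicit exit().
END { print STDERR "[trace] script ending with exit status $?\n" }

# Stand-ins for the real subroutines.
my @aln = run_step(align => sub { return (">seq1\n", "ATG\n") });
run_step(sub4 => sub { die "something went wrong in sub4\n" });
run_step(sub5 => sub { return 1 });
```

If the last trace line printed is "entering align" followed directly by
the END message, the termination happens inside the alignment call; if
"left align" appears, it happens later.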


From dr.hogart at gmail.com  Wed Apr 30 15:42:41 2008
From: dr.hogart at gmail.com (Sergei Ryazansky)
Date: Wed, 30 Apr 2008 19:42:41 +0400
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafkhojw9ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
Message-ID: <op.uafklfmd9ju7si@hogart.img.ras.ru>


------- Forwarded message -------
From: "Sergei Ryazansky" <dr.hogart at gmail.com>
To: "Sendu Bala" <bix at sendu.me.uk>
Cc:
Subject: Re: [Bioperl-l] alignment by TCoffee as a subroutine
Date: Wed, 30 Apr 2008 19:40:26 +0400

> What do you do with the output of &align? The thing you are doing with  
> that output is most likely the cause of your script terminating, which  
> is why &sub4 and &sub5 run when you don't run &align (have no output  
> that causes the problem).

Please see my answer to Sebastien Moretti: it describes another,
similar problem. The only thing I did with the output there was print
it to a file. Nevertheless, the problem was the same.

> # don't do anything with the output of align() - does &sub4 still run?

Please see above.

> # add some print statements after you call align(), and then after every  
> further block of code in your script to see exactly where the script  
> terminates
> # reduce your script down to a minimal script that shows the problem  
> (with the help of the previous step) and show us that

All the tests with individual blocks were performed earlier. The results are OK.


From cjfields at uiuc.edu  Wed Apr 30 16:25:06 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 30 Apr 2008 11:25:06 -0500
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafklfmd9ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
Message-ID: <5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>

Sergei,

I agree with Sendu; we can't diagnose this unless we have either the  
entire script or a minimal version of it demonstrating the bug.

The best way to handle this is to file a bug report, attaching  
relevant data using the 'Create a new attachment' link (including  
either the full script or a shortened one which demonstrates the bug).  
Otherwise we're just shooting in the dark trying to diagnose the  
problem.

http://bugzilla.open-bio.org/

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign


From dr.hogart at gmail.com  Wed Apr 30 16:40:19 2008
From: dr.hogart at gmail.com (Sergei Ryazansky)
Date: Wed, 30 Apr 2008 20:40:19 +0400
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
Message-ID: <op.uafm9hl79ju7si@hogart.img.ras.ru>

On Wed, 30 Apr 2008 20:25:06 +0400, Chris Fields <cjfields at uiuc.edu> wrote:

Chris, I have already sent the file to Sendu, and I am also attaching it here.  
I have removed the parts that are really unnecessary.

> Sergei,
>
> I agree with Sendu; we can't diagnose this unless we have either the  
> entire script or a minimal version of it demonstrating the bug.
>
> The best way to handle this is to file a bug report, attaching relevant  
> data using the 'Create a new attachment' link (including either the full  
> script or a shortened one which demonstrates the bug). Otherwise we're  
> just shooting in the dark trying to diagnose the problem.
>
> http://bugzilla.open-bio.org/
>
> chris
-------------- next part --------------
A non-text attachment was scrubbed...
Name: script.pl
Type: application/octet-stream
Size: 6870 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20080430/6aef0fde/attachment-0004.obj>

From cjfields at uiuc.edu  Wed Apr 30 17:02:19 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 30 Apr 2008 12:02:19 -0500
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafm9hl79ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
	<op.uafm9hl79ju7si@hogart.img.ras.ru>
Message-ID: <EBC881E4-8F1A-4396-8EC9-1FB17681F5D2@uiuc.edu>

Hmm, maybe you were confused?  From my last email:

"The best way to handle this is to file a bug report, attaching  
relevant data using the 'Create a new attachment' link (including  
either the full script or a shortened one which demonstrates the bug).  
Otherwise we're just shooting in the dark trying to diagnose the  
problem."

http://bugzilla.open-bio.org/

Anyone can work on fixing the issue there (so it'll probably get fixed  
faster).  The devs can also track progress on the problem via the dev  
mail list (bioperl-guts).  Diagnosing the bug may also reveal issues  
not just with Bio::Tools::Run::Alignment::TCoffee but also with other  
related modules.

If needed I can post it to bugzilla, but it helps to submit the bug  
yourself (so you can receive posts on its progress).

chris



From dr.hogart at gmail.com  Wed Apr 30 17:39:35 2008
From: dr.hogart at gmail.com (Sergei Ryazansky)
Date: Wed, 30 Apr 2008 21:39:35 +0400
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafop6079ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
	<op.uafm9hl79ju7si@hogart.img.ras.ru>
	<EBC881E4-8F1A-4396-8EC9-1FB17681F5D2@uiuc.edu>
	<op.uafop6079ju7si@hogart.img.ras.ru>
Message-ID: <op.uafpz9n79ju7si@hogart.img.ras.ru>

On Wed, 30 Apr 2008 21:11:56 +0400, Sergei Ryazansky <dr.hogart at gmail.com>  
wrote:

> Oh, sorry, you are right - I read your message too fast. I will do it a little later.


From cjfields at uiuc.edu  Wed Apr 30 18:29:28 2008
From: cjfields at uiuc.edu (Chris Fields)
Date: Wed, 30 Apr 2008 13:29:28 -0500
Subject: [Bioperl-l] Fwd: Re:  alignment by TCoffee as a subroutine
In-Reply-To: <op.uafpz9n79ju7si@hogart.img.ras.ru>
References: <op.uadqmpg8avnppr@hogart.img.ras.ru>
	<op.uae8m9tzavnppr@hogart.img.ras.ru>
	<48186A55.4030406@sendu.me.uk>
	<op.uaferwytavnppr@hogart.img.ras.ru>
	<358f4d650804300716j2a40360fsca340370e552d238@mail.gmail.com>
	<op.uafidxitavnppr@hogart.img.ras.ru> <48189032.20102@sendu.me.uk>
	<op.uafkhojw9ju7si@hogart.img.ras.ru>
	<op.uafklfmd9ju7si@hogart.img.ras.ru>
	<5F24BE07-4085-4458-8A7D-178769BE6110@uiuc.edu>
	<op.uafm9hl79ju7si@hogart.img.ras.ru>
	<EBC881E4-8F1A-4396-8EC9-1FB17681F5D2@uiuc.edu>
	<op.uafop6079ju7si@hogart.img.ras.ru>
	<op.uafpz9n79ju7si@hogart.img.ras.ru>
Message-ID: <39A139E4-6783-41E6-8EE9-1FE60CB57577@uiuc.edu>

Sorry, didn't catch that...

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign