[Bioperl-l] Clones and Unigene annotation

Law, Annie Annie.Law at nrc-cnrc.gc.ca
Wed Jan 7 16:30:03 EST 2004


Hi,

I have started to look at the resources available on the bioperl website.
It looks like there are some tools available that will help me accomplish
my task.
I have a list of clone IDs and I would like to do some annotation.  For
starters, I have gone to the I.M.A.G.E. site and found the GenBank accession
numbers corresponding to the clone IDs.

I would like to find the corresponding UniGene IDs for these accession
numbers.

My goal is to have a database with clone ID, accession number, and UniGene
ID, and to allow a user to add personal annotation as well.  Later on, I
would like to annotate with other information too.
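
As a rough illustration of the layout I have in mind, here is a sketch in
Python with SQLite (not bioperl or BioSQL).  The table names, column names,
and helper functions are placeholders I made up, and I am assuming a clone
can have more than one accession (for example 5' and 3' ESTs).

```python
import sqlite3

# Hypothetical schema: one row per (clone, accession) pair, plus free-text
# user notes attached to a clone. All names here are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS clone_accession (
    clone_id   TEXT NOT NULL,
    accession  TEXT NOT NULL,
    unigene_id TEXT,
    PRIMARY KEY (clone_id, accession)
);
CREATE TABLE IF NOT EXISTS user_annotation (
    id       INTEGER PRIMARY KEY,
    clone_id TEXT NOT NULL,
    author   TEXT NOT NULL,
    note     TEXT NOT NULL
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def set_mapping(conn, clone_id, accession, unigene_id=None):
    """Record a clone/accession pair and set (or refresh) its UniGene ID."""
    conn.execute(
        "INSERT OR IGNORE INTO clone_accession (clone_id, accession) "
        "VALUES (?, ?)",
        (clone_id, accession),
    )
    conn.execute(
        "UPDATE clone_accession SET unigene_id = ? "
        "WHERE clone_id = ? AND accession = ?",
        (unigene_id, clone_id, accession),
    )
    conn.commit()


def add_note(conn, clone_id, author, note):
    """Attach a user's personal annotation to a clone."""
    conn.execute(
        "INSERT INTO user_annotation (clone_id, author, note) "
        "VALUES (?, ?, ?)",
        (clone_id, author, note),
    )
    conn.commit()


def report(conn, clone_id):
    """Return ([(accession, unigene_id), ...], [note, ...]) for one clone."""
    rows = conn.execute(
        "SELECT accession, unigene_id FROM clone_accession "
        "WHERE clone_id = ? ORDER BY accession",
        (clone_id,),
    ).fetchall()
    notes = [
        r[0]
        for r in conn.execute(
            "SELECT note FROM user_annotation WHERE clone_id = ? ORDER BY id",
            (clone_id,),
        )
    ]
    return rows, notes
```

Keeping the user notes in their own table means a periodic refresh of the
UniGene IDs (calling set_mapping again) would not touch anyone's personal
annotation.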

I would like to know if bioperl is the way to go or if there is already
something available.
One of the advantages of bioperl is that I will be able to control how
often the database gets updated.
Infrequent updates are a problem with some of the web-based tools I have
seen.

Basically, I think that there are two tools that are available from bioperl
that are of use to me.

There is a script called load_seqdatabase.pl that will allow me to load a
UniGene file into a biosql db.
Aside from the resource link listed below, are there any more documentation
or examples available for the use of this script?
http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-db/scripts/Attic/load_seqdatabase.pl?logsort=date&annotate=1.31&sortby=log&hideattic=1&search=None&hidecvsroot=1&diff_format=h

There are also UniGene objects and a parsing tool, Bio::Cluster::UniGene
and Bio::ClusterIO respectively.
I saw in a March 2003 post that someone was trying to do something along the
same lines, without the database idea, using the UniGene objects.
http://bioperl.org/pipermail/bioperl-l/2003-March/011733.html
However, the major complaint was that it was too slow (I will be dealing
with 20 000 clone IDs at a time).
I was wondering if anything has changed since then.
Even if the process is now fast, my next question is this: if I used the
parser to find a matching accession number and the corresponding UniGene ID,
would the natural next step be to insert this data into a database that I
create myself?
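
To show the kind of lookup I mean, here is a rough sketch in Python (not
the Bio::ClusterIO parser).  It assumes the UniGene flat file has one "ID"
line per cluster, "SEQUENCE" lines containing "ACC=...;" fields, and "//"
between records.  The function name map_accessions is made up.  It reads the
file once and only keeps the accessions on my list, rather than building an
object per cluster.

```python
import re

# Assumed field layout on SEQUENCE lines: "ACC=<accession>[.version]; ..."
ACC_RE = re.compile(r"ACC=([^;\s]+)")


def map_accessions(lines, wanted):
    """Scan UniGene flat-file lines once; return {accession: cluster ID}.

    `lines` is any iterable of text lines (for example an open file handle);
    `wanted` is the list of accession numbers to look up.
    """
    wanted = set(wanted)          # set membership keeps each lookup cheap
    found = {}
    cluster_id = None
    for line in lines:
        fields = line.split(None, 1)
        if not fields:
            continue
        key = fields[0]
        if key == "ID" and len(fields) > 1:
            cluster_id = fields[1].strip()
        elif key == "SEQUENCE" and cluster_id is not None:
            match = ACC_RE.search(line)
            if match:
                # Drop a trailing ".N" version suffix if one is present.
                accession = match.group(1).split(".")[0]
                if accession in wanted:
                    found[accession] = cluster_id
        elif key == "//":
            cluster_id = None     # end of the current cluster record
    return found
```

With a real file this could be called as map_accessions(open(path), my_list),
where path and my_list are placeholders, and the resulting pairs could then be
written to whatever database holds the clone IDs.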

I am not sure whether the load_seqdatabase.pl tool or the UniGene objects
are best suited to my needs.
Since I am new to bioperl, I am not too sure what the intention was behind
the creation of the UniGene objects and what most people use them for.

I would appreciate any tips on the most appropriate way to use bioperl to
accomplish my task.

thanks very much,
Annie.


