[Bioperl-l] Clones and Unigene annotation

Brian Osborne brian_osborne at cognia.com
Wed Jan 7 16:55:30 EST 2004


Annie,

It sounds like part of your project may involve retrieving IDs from GenBank
entries. Take a look at the Feature-Annotation HOWTO; it may answer some of
your questions on this particular topic. However, this doesn't sound like
the tricky part of your project...

Brian O.

-----Original Message-----
From: bioperl-l-bounces at portal.open-bio.org
[mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Law, Annie
Sent: Wednesday, January 07, 2004 4:30 PM
To: 'bioperl-l at portal.open-bio.org'
Subject: [Bioperl-l] Clones and Unigene annotation

Hi,

I have started to look at the resources available on the bioperl website.
It looks like there are some tools available that will help me accomplish
my task. I have a list of clone IDs and I would like to do some annotation.
For starters, I have gone to the I.M.A.G.E. site and found the GenBank
accession numbers corresponding to the clone IDs.

I would like to find the corresponding unigene IDs for these accession
numbers.

My goal is to have a database with clone IDs, accession number, and unigene
ID, and allow a user to
add his own personal annotation as well.  Later on, I would like to annotate
with other information
as well.
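
To make the goal concrete, here is a minimal sketch of such a database. It
assumes SQLite and Python rather than BioSQL, and every table, column, and
function name in it is hypothetical. A clone may have more than one
accession, so the clone table is keyed on the pair, and user notes live in
a separate table so that reloading the mappings does not touch them.

```python
import sqlite3


def create_schema(conn):
    """Create the two illustrative tables if they do not exist yet."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS clone (
            clone_id   TEXT NOT NULL,
            accession  TEXT NOT NULL,
            unigene_id TEXT,              -- NULL until a cluster is found
            PRIMARY KEY (clone_id, accession)
        );
        CREATE TABLE IF NOT EXISTS user_annotation (
            id       INTEGER PRIMARY KEY,
            clone_id TEXT NOT NULL,
            author   TEXT NOT NULL,
            note     TEXT NOT NULL
        );
        """
    )


def add_clone(conn, clone_id, accession, unigene_id=None):
    """Insert or refresh one clone/accession row."""
    conn.execute(
        "INSERT OR REPLACE INTO clone (clone_id, accession, unigene_id) "
        "VALUES (?, ?, ?)",
        (clone_id, accession, unigene_id),
    )


def add_note(conn, clone_id, author, note):
    """Attach a free-text user annotation to a clone."""
    conn.execute(
        "INSERT INTO user_annotation (clone_id, author, note) VALUES (?, ?, ?)",
        (clone_id, author, note),
    )
```

Because the annotations are stored apart from the mappings, the UniGene IDs
can be refreshed on whatever schedule is wanted.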

I would like to know if bioperl is the way to go or if there is already
something available.
One of the advantages with bioperl is that I will be able to control how
often the database gets updated.
This is a problem with some of the tools I have seen that are available on
the web.

Basically, I think that there are two tools that are available from bioperl
that are of use to me.

There is a file called load_seqdatabase.pl that will allow me to load a
unigene file into a biosql db.
Aside from the resource link listed below, is there any more documentation
or are there examples available for the use of this script?
http://cvs.bioperl.org/cgi-bin/viewcvs/viewcvs.cgi/bioperl-db/scripts/Attic/load_seqdatabase.pl?logsort=date&annotate=1.31&sortby=log&hideattic=1&search=None&hidecvsroot=1&diff_format=h

There are also UniGene objects (Bio::Cluster::UniGene) and a parsing tool
(Bio::ClusterIO).
I saw in a March 2003 post that someone was trying to do something along the
same lines without the database idea
using the unigene objects.
http://bioperl.org/pipermail/bioperl-l/2003-March/011733.html
However, the major complaint was that it was too slow (I will be dealing
with 20,000 clone IDs at a time).
I was wondering if anything has changed since then. Even if the process
were fast, my next question would be: if I used the parser to find a
matching accession number and the corresponding UniGene ID, would the next
natural step be to insert this data into a database I created?
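
For illustration, the lookup step can be sketched as a single pass over a
UniGene flat file, checking each accession against a set of the wanted
ones. This is plain Python, not the Bio::ClusterIO parser, and the function
name is hypothetical. It assumes records that begin with an "ID" line,
carry one "SEQUENCE" line per member with semicolon-separated KEY=VALUE
fields including ACC=, and end with "//"; check that against the actual
file before relying on it.

```python
def map_accessions_to_unigene(lines, wanted):
    """Return {accession: cluster_id} for every wanted accession found.

    `lines` is any iterable of text lines (for example an open file), so
    the whole file never has to be held in memory.
    """
    wanted = set(wanted)          # set membership keeps each lookup cheap
    found = {}
    cluster_id = None
    for line in lines:
        if line.startswith("ID "):
            # Assumed form: "ID          Hs.2"
            cluster_id = line.split()[1]
        elif line.startswith("SEQUENCE") and cluster_id:
            # Assumed form: "SEQUENCE    ACC=X.1; NID=...; SEQTYPE=EST"
            for field in line[len("SEQUENCE"):].split(";"):
                key, _, value = field.strip().partition("=")
                if key == "ACC":
                    accession = value.split(".")[0]  # drop version suffix
                    if accession in wanted:
                        found[accession] = cluster_id
        elif line.startswith("//"):
            cluster_id = None     # end of record
    return found
```

Called as map_accessions_to_unigene(open(path), accessions), the returned
dictionary holds the accession and UniGene ID pairs that would then be
written to the database.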

I am not sure whether the load_seqdatabase.pl tool or the UniGene objects
are best suited to my needs.
Since I am new to bioperl, I am not too sure what the intention was behind
the creation of the UniGene objects and what most people use them for.

I would appreciate any tips on the most appropriate way to use bioperl to
accomplish my task.

thanks very much,
Annie.

_______________________________________________
Bioperl-l mailing list
Bioperl-l at portal.open-bio.org
http://portal.open-bio.org/mailman/listinfo/bioperl-l



