[Bioperl-l] arabidopsis + load_seqdatabase.pl

Sean Davis sdavis2 at mail.nih.gov
Mon Dec 19 13:54:06 EST 2005




On 12/19/05 12:31 PM, "Angshu Kar" <angshu96 at gmail.com> wrote:

> Hi,
> 
> I'm not fully sure whether to post this question in this community. But I
> feel those who are working in plant genomics using bioperl can possibly
> answer this. I'm trying to use load_seqdatabase.pl to load data into the
> biosql schema. Can anyone please suggest an arabidopsis data file source that
> has all the additional information (probably GENBANK format) but only holds
> the CDSs?
> I'll be obliged if anyone of you who has used such a file helps me with the
> answer.

Angshu,

What information do you need from these files, specifically?  And what is
your definition of a gene?  If you want to stick to RefSeq genes, you can
download from here:

ftp://ftp.ncbi.nih.gov/refseq/release/plant

But the real question is: what EXACT information do you need, and what
question are you trying to answer?  Only once you have decided what you need
will you know which files suit (or do not suit) your purpose, and that may be
something you have to decide for yourself.
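
One way to check what a candidate file actually contains before loading it
is to tally its feature keys (gene, CDS, mRNA, ...).  Below is a minimal
sketch in Python using only the standard library.  It assumes a GenBank
flat file; the function name count_feature_keys and the sample record are
made up for illustration and are not part of bioperl or any NCBI tool.

```python
from collections import Counter


def count_feature_keys(lines):
    """Tally feature keys (gene, CDS, mRNA, ...) in GenBank flat-file text."""
    counts = Counter()
    in_features = False
    for line in lines:
        if line.startswith("FEATURES"):
            # Start of a record's feature table.
            in_features = True
        elif in_features and line[:1] not in (" ", ""):
            # Any other top-level keyword (e.g. ORIGIN, //) ends the table.
            in_features = False
        elif in_features and line.startswith("     ") and line[5:6].strip():
            # Feature keys begin in column 6; qualifier lines are indented further.
            counts[line.split()[0]] += 1
    return counts


# Hypothetical, hand-written record used only to exercise the function.
sample = """\
LOCUS       EXAMPLE1     120 bp    mRNA    linear   PLN 01-JAN-2000
FEATURES             Location/Qualifiers
     source          1..120
                     /organism="Arabidopsis thaliana"
     gene            1..120
                     /gene="HYPO1"
     CDS             10..99
                     /gene="HYPO1"
ORIGIN
//
""".splitlines()

print(count_feature_keys(sample))
```

To run it on a real file, pass an open file handle instead of the sample
lines.  If the tally shows feature types you do not need, that tells you
whether the file has to be filtered before it is loaded.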

If you are going to be using NCBI resources (like the link above), I highly
suggest looking at the NCBI handbook here before proceeding too far:

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?call=bv.View..ShowTOC&rid=handbook.TOC&depth=2

Sean



