[Bioperl-l] [Quick help needed] Getting Organism info using NCBI Accession numbers : sample code included

Mon Apr 18 13:13:17 EDT 2011

Just wanted to push this once again in case this message was missed over
the weekend.

-Abhi

On Fri, Apr 15, 2011 at 3:39 PM, Abhishek Pratap <abhishek.vit at gmail.com> wrote:

> Hi Guys
>
> Sorry, I am posting the same question again from an old thread. I hope this
> time the subject line is more relevant to the question.
>
> I have a list of NCBI accessions/locus names, not GI numbers. What I need
> to do is obtain the lineage for each NCBI accession.
>
> Is this functionality built in directly? I am using efetch to get the
> GenBank record but am not sure how to specifically pull out the organism
> lineage. I would also want this to be fast, as I will have thousands of such
> accessions to query.
>
> Eg:
>
> I want to search NCBI for the locus name "CP000490" and get the organism
> lineage:
>
>
>  Bacteria; Proteobacteria; Alphaproteobacteria; Rhodobacterales;
>             Rhodobacteraceae; Paracoccus.
>
>
> This info is present in the GenBank record, but I am not sure what the
> best way to fetch it specifically is.
> http://www.ncbi.nlm.nih.gov/nuccore/CP000490
>
> Sample code :
>
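> use strict;
> use warnings;
> use Bio::DB::EUtilities;
> use Bio::SeqIO;
>
> my @ids = qw( NW_001884661 EZ361133 CP000490 );
>
> # Fetch the records for all accessions in a single efetch request.
> # -rettype => 'gb' asks for GenBank flat-file output so that the saved
> # file matches the 'genbank' format given to Bio::SeqIO below.
> my $factory = Bio::DB::EUtilities->new(-eutil   => 'efetch',
>                                        -email   => 'apratap at lbl.gov',
>                                        -db      => 'nucleotide',
>                                        -id      => \@ids,
>                                        -rettype => 'gb');
>
> # Save the response to a file
> my $file = 'temp.gb';
> $factory->get_Response(-file => $file);
>
> # Open the saved records for parsing
> my $seqin = Bio::SeqIO->new(-file   => $file,
>                             -format => 'genbank');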
>
>
>
> Thanks for your help!
> -Abhi
>
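
For reference, the lineage asked about above sits in the indented lines
that follow the ORGANISM line of each GenBank record. Below is a minimal
sketch, in plain Perl with no BioPerl dependency, of one way to pull it out
of GenBank flat-file text such as the contents of temp.gb. The helper name
parse_lineages is hypothetical and not part of BioPerl. The sketch assumes
the organism name fits on the ORGANISM line itself, so every indented
continuation line is treated as lineage, and that each record has an
ACCESSION line before its ORGANISM block. The sample record is a
hand-written, abbreviated illustration; only its lineage lines are taken
from the message above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): maps each record's accession
# to an array ref of lineage terms read from the ORGANISM block.
sub parse_lineages {
    my ($text) = @_;
    my (%lineage_for, $accession);
    my $in_organism = 0;
    my $buffer      = '';

    for my $line (split /\n/, $text) {
        if ($line =~ /^ACCESSION\s+(\S+)/) {
            $accession = $1;                 # primary accession of the record
        }
        elsif ($line =~ /^\s+ORGANISM\s/) {
            $in_organism = 1;                # lineage starts on the next line
            $buffer      = '';
        }
        elsif ($in_organism && $line =~ /^\s+(\S.*)$/) {
            $buffer .= "$1 ";                # indented continuation line
        }
        elsif ($in_organism) {
            # The first non-indented line closes the ORGANISM block.
            $in_organism = 0;
            (my $clean = $buffer) =~ s/\.\s*$//;   # drop the trailing period
            $lineage_for{$accession} = [ split /;\s*/, $clean ]
                if defined $accession;
        }
    }
    return %lineage_for;
}

# Abbreviated, hand-written sample record (assumption, for illustration).
my $sample = <<'END';
LOCUS       CP000490
ACCESSION   CP000490
  ORGANISM  Paracoccus denitrificans PD1222
            Bacteria; Proteobacteria; Alphaproteobacteria; Rhodobacterales;
            Rhodobacteraceae; Paracoccus.
REFERENCE   1
//
END

my %lineage_for = parse_lineages($sample);
for my $acc (sort keys %lineage_for) {
    print "$acc\t", join('; ', @{ $lineage_for{$acc} }), "\n";
}
```

To run this on the real download, the contents of temp.gb could be read
into a string and passed to parse_lineages in place of $sample. Since the
sample code already opens the file with Bio::SeqIO, the parsed sequence
objects may also expose the same information through their species object
(for example a classification method), which would avoid hand-parsing; that
route is not shown here.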