[Bioperl-l] Taxonomy DB problem

Chris Fields cjfields at illinois.edu
Tue Aug 31 11:01:59 EDT 2010


Yes, I see that one.  It may be that the ID hash being returned is empty.  I'll look into it.

-c 
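
A minimal sketch of one way to guard against this kind of warning, assuming it comes from undefined entries reaching the -id list (for example, when a submitted ID has no taxonomy link and the hash slice returns undef for it). The taxonomy values below are placeholders rather than real NCBI IDs, and defined_values is a hypothetical helper, not part of BioPerl.

```perl
use strict;
use warnings;

# Placeholder stand-in for the %taxa hash that the elink loop would fill.
# The values are NOT real taxonomy IDs; 89318838 is deliberately left
# without an entry to mimic a record with no taxonomy link.
my @ids  = qw(1621261 89318838 68536103 20807972 730439);
my %taxa = (
    1621261  => 'tax_A',
    68536103 => 'tax_B',
    20807972 => 'tax_C',
    730439   => 'tax_D',
);

# Hypothetical helper: hash slice over the requested keys, keeping only
# the defined values.
sub defined_values {
    my ($map, @keys) = @_;
    return grep { defined $_ } @{$map}{@keys};
}

# Only defined values would be handed on to the next EUtilities call.
my @taxa    = defined_values(\%taxa, @ids);
my @missing = grep { !defined $taxa{$_} } @ids;

print "usable taxonomy ids: @taxa\n";
print "ids without a taxonomy link: @missing\n";
```

With the filtered list, the later lookup $names{$taxa{$_}} would also need a defined check for the IDs reported as missing.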

On Aug 31, 2010, at 6:57 AM, J. Christopher Ellis wrote:

> Hi Chris,
> 
> The error is...
> 
> "Use of uninitialized value $id in join or string at C:/Perl64/site/lib/Bio/Tools/EUtilities/EUtilParameters.pm line 363."
> 
> The script from http://bioperl.org/wiki/Species_names_from_accession_numbers is as follows....
> 
> use Bio::DB::EUtilities;
> 
> my (%taxa, @taxa);
> my (%names, %idmap);
> 
> # these are protein ids; nuc ids will work by changing -dbfrom => 'nucleotide',
> # (probably)
> 
> my @ids = qw(1621261 89318838 68536103 20807972 730439);
> 
> my $factory = Bio::DB::EUtilities->new(-eutil          => 'elink',
>                                        -db             => 'taxonomy',
>                                        -dbfrom         => 'protein',
>                                        -correspondence => 1,
>                                        -id             => \@ids);
> 
> # iterate through the LinkSet objects
> while (my $ds = $factory->next_LinkSet) {
>     $taxa{($ds->get_submitted_ids)[0]} = ($ds->get_ids)[0]
> }
> 
> @taxa = @taxa{@ids};
> 
> $factory = Bio::DB::EUtilities->new(-eutil => 'esummary',
>                                     -db    => 'taxonomy',
>                                     -id    => \@taxa );
> 
> while (local $_ = $factory->next_DocSum) {
>     $names{($_->get_contents_by_name('TaxId'))[0]} =
>         ($_->get_contents_by_name('ScientificName'))[0];
> }
> 
> foreach (@ids) {
>     $idmap{$_} = $names{$taxa{$_}};
> }
> 
> # %idmap is
> # 1621261 => 'Mycobacterium tuberculosis H37Rv'
> # 20807972 => 'Thermoanaerobacter tengcongensis MB4'
> # 68536103 => 'Corynebacterium jeikeium K411'
> # 730439 => 'Bacillus caldolyticus'
> # 89318838 => undef (this record has been removed from the db)
> 
> 1;
> 
> 
> Thanks,
> 
> Chris
> 
> 
> On Mon 08/30/10 09:36 , "Chris Fields" cjfields at illinois.edu sent:
> Chris,
> 
> Regarding a fix for that script, we would have to see your modified script and the error. However, there are modules within BioPerl to essentially do what you want, in particular, Bio::DB::Taxonomy.
> 
> chris
> 
> On Aug 30, 2010, at 7:55 AM, J. Christopher Ellis wrote:
> 
> > Hi All,
> > 
> > I am trying to extract the entire taxonomy of an organism, including the
> > classifications. Something like...
> > 
> > Phylum:Proteobacteria, Class:Gammaproteobacteria, Order:Enterobacteriales, Family:Enterobacteriaceae, Genus:Escherichia
> > 
> > I am not worried about format, just that I get the information and the associated level of hierarchy. The script found at http://bioperl.org/wiki/Species_names_from_accession_numbers seemed like a good starting point, so I copied it and tried to run it but got an error.
> > 
> > My first question is "Is there a known fix for this?" and my second question is how do I get the full hierarchical information (as seen above) with the taxonomy db?
> > 
> > Thanks for all your help in advance!
> > 
> > Chris 
> > 
> > 
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
> 
> 
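
For the second question in the thread (the full hierarchy with ranks), a rough sketch along the lines of the Bio::DB::Taxonomy suggestion might look like the following. It assumes BioPerl is installed and NCBI is reachable for the live lookup; fetch_lineage and format_lineage are hypothetical helper names, and the live lookup only runs when a species name is passed on the command line. The hard-coded demo data is simply the example lineage from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: turn [rank, name] pairs into "Rank:Name, ..." text,
# skipping entries with no usable rank.
sub format_lineage {
    my @pairs = @_;
    return join ', ',
        map  { ucfirst($_->[0]) . ':' . $_->[1] }
        grep { defined $_->[0] && $_->[0] ne 'no rank' && defined $_->[1] }
        @pairs;
}

# Hypothetical helper: look a name up via Entrez and return [rank, name]
# pairs from the root down to the taxon itself. Modules are loaded lazily
# so the formatting demo still runs without BioPerl installed.
sub fetch_lineage {
    my ($name) = @_;
    require Bio::DB::Taxonomy;
    require Bio::Tree::Tree;
    my $db    = Bio::DB::Taxonomy->new(-source => 'entrez');
    my $taxon = $db->get_taxon(-name => $name) or return;
    my $tree  = Bio::Tree::Tree->new();
    my @nodes = ($tree->get_lineage_nodes($taxon), $taxon);
    return map { [ $_->rank, $_->scientific_name ] } @nodes;
}

# Offline demo using the lineage quoted in the original question.
my @demo = (
    [ 'phylum', 'Proteobacteria' ],
    [ 'class',  'Gammaproteobacteria' ],
    [ 'order',  'Enterobacteriales' ],
    [ 'family', 'Enterobacteriaceae' ],
    [ 'genus',  'Escherichia' ],
);
print format_lineage(@demo), "\n";

# Live lookup only when a name is given, e.g.: perl lineage.pl "Escherichia coli"
print format_lineage(fetch_lineage($ARGV[0])), "\n" if @ARGV;
```

Keeping the formatting separate from the lookup means the output layout can be changed without touching the network code.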




