[Bioperl-l] Taxonomy DB problem

J. Christopher Ellis J.Christopher.Ellis at duke.edu
Tue Aug 31 07:57:27 EDT 2010


 Hi Chris,

 The error is...

 "Use of uninitialized value $id in join or string at
C:/Perl64/site/lib/Bio/Tools/EUtilities/EUtilParameters.pm line 363."

 The script from
http://bioperl.org/wiki/Species_names_from_accession_numbers is as
follows....

use Bio::DB::EUtilities;

my (%taxa, @taxa);
my (%names, %idmap);

# these are protein ids; nuc ids will work by changing -dbfrom => 'nucleotide',
# (probably)
my @ids = qw(1621261 89318838 68536103 20807972 730439);

my $factory = Bio::DB::EUtilities->new(-eutil          => 'elink',
                                       -db             => 'taxonomy',
                                       -dbfrom         => 'protein',
                                       -correspondence => 1,
                                       -id             => @ids);

# iterate through the LinkSet objects
while (my $ds = $factory->next_LinkSet) {
    $taxa{($ds->get_submitted_ids)[0]} = ($ds->get_ids)[0]
}

@taxa = @taxa{@ids};

$factory = Bio::DB::EUtilities->new(-eutil => 'esummary',
                                    -db    => 'taxonomy',
                                    -id    => @taxa);

while (local $_ = $factory->next_DocSum) {
    $names{($_->get_contents_by_name('TaxId'))[0]} =
        ($_->get_contents_by_name('ScientificName'))[0];
}

foreach (@ids) {
    $idmap{$_} = $names{$taxa{$_}};
}

# %idmap is
# 1621261 => 'Mycobacterium tuberculosis H37Rv'
# 20807972 => 'Thermoanaerobacter tengcongensis MB4'
# 68536103 => 'Corynebacterium jeikeium K411'
# 730439 => 'Bacillus caldolyticus'
# 89318838 => undef (this record has been removed from the db)

1;
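
One thing that may be worth checking in the script as pasted: the arrays are passed to -id directly (-id => @ids) rather than as references (-id => \@ids), and @taxa can contain undef for a removed record. Whether either of these is what triggers the warning is only a guess. The sketch below uses plain Perl, with a hypothetical named_args() sub standing in for a constructor that takes named parameters (it is not part of BioPerl) and placeholder taxon values, to show how the two calls differ and how undefined entries could be filtered out.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a constructor taking named parameters.
# It only collects its arguments into a hash; it is not BioPerl code.
sub named_args {
    my %args = @_;
    return \%args;
}

my @ids = qw(1621261 89318838 68536103 20807972 730439);

# Array passed directly: Perl flattens it into the argument list, so only
# the first element is stored under '-id' and the rest become stray pairs.
my $flat = named_args(-id => @ids);

# Array reference: all five IDs stay together under '-id'.
my $ref = named_args(-id => \@ids);

print "flattened -id: $flat->{-id}\n";      # prints: flattened -id: 1621261
print "reference -id: @{ $ref->{-id} }\n";  # prints all five IDs

# Placeholder mapping (values are made up) with one missing entry,
# mimicking a record that has been removed from the database.
my %taxa = (1621261 => 'taxon_a', 89318838 => undef, 730439 => 'taxon_b');

# Keep only defined values before handing the list to a second query.
my @taxa = grep { defined } @taxa{qw(1621261 89318838 730439)};

print "defined taxa: @taxa\n";              # prints: defined taxa: taxon_a taxon_b
```

If the module does expect an array reference for -id, the equivalent change in the script would be -id => \@ids and -id => \@taxa.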

Thanks,

Chris

 On Mon 08/30/10 09:36 , "Chris Fields" cjfields at illinois.edu sent:
 Chris,

 Regarding a fix for that script, we would have to see your modified
script and the error. However, there are modules within BioPerl to
essentially do what you want, in particular, Bio::DB::Taxonomy.

 chris
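
 For reference, a minimal sketch of the Bio::DB::Taxonomy route suggested above. It assumes BioPerl is installed and that the NCBI Entrez service is reachable; the organism name is only an example, and the output format is arbitrary.

```perl
use strict;
use warnings;
use Bio::DB::Taxonomy;
use Bio::Tree::Tree;

# Query NCBI over the network (assumes Entrez access is available).
my $db = Bio::DB::Taxonomy->new(-source => 'entrez');

# Example organism; replace with the name (or use -taxonid) you need.
my $taxon = $db->get_taxon(-name => 'Escherichia coli')
    or die "Taxon not found\n";

# Walk the lineage from the root down to the parent of $taxon.
my $tree_functions = Bio::Tree::Tree->new();
my @lineage = $tree_functions->get_lineage_nodes($taxon);

# Print each ancestor, then the taxon itself, as rank:name.
for my $node (@lineage, $taxon) {
    my $rank = $node->rank            || 'no rank';
    my $name = $node->scientific_name || 'unknown';
    print "$rank:$name\n";
}
```

 Levels without a formal rank are reported by NCBI as "no rank", so they may need to be filtered if only Phylum, Class, Order, Family and Genus are wanted.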

 On Aug 30, 2010, at 7:55 AM, J. Christopher Ellis wrote:

 > Hi All,
 > 
 > I am trying to extract the entire taxonomy of an organism, including the
 > classifications. Something like...
 > 
 > Phylum:Proteobacteria, Class:Gammaproteobacteria,
 > Order:Enterobacteriales, Family:Enterobacteriaceae, Genus:Escherichia
 > 
 > I am not worried about format, just that I get the information and the
 > associated level of hierarchy. The script found at
 > http://bioperl.org/wiki/Species_names_from_accession_numbers seemed like a
 > good starting point, so I copied it and tried to run it but got an error.
 > 
 > My first question is "Is there a known fix for this?" and my second
 > question is how do I get the full hierarchical information (as seen above)
 > with the taxonomy db?
 > 
 > Thanks for all your help in advance!
 > 
 > Chris 