[Bioperl-l] Taxonomy DB problem

J. Christopher Ellis J.Christopher.Ellis at duke.edu
Thu Sep 2 14:53:34 UTC 2010


 Chris, have you had any luck with this?

 Thanks,
 Chris

 On Tue 08/31/10 11:01 , "Chris Fields" cjfields at illinois.edu sent:
 Yes, I see that one. It may be that the ID hash being returned is
 empty. I'll look into it.

 -c 

 On Aug 31, 2010, at 6:57 AM, J. Christopher Ellis wrote:

 > Hi Chris,
 > 
 > The error is...
 > 
 > "Use of uninitialized value $id in join or string at
 > C:/Perl64/site/lib/Bio/Tools/EUtilities/EUtilParameters.pm line 363."
 > 
 > The script from
 > http://bioperl.org/wiki/Species_names_from_accession_numbers is as
 > follows:
 > 
 > use Bio::DB::EUtilities;
 > 
 > my (%taxa, @taxa);
 > my (%names, %idmap);
 > 
 > # these are protein ids; nuc ids will work by changing
 > # -dbfrom => 'nucleotide', (probably)
 > my @ids = qw(1621261 89318838 68536103 20807972 730439);
 > 
 > my $factory = Bio::DB::EUtilities->new(-eutil => 'elink',
 >                                        -db => 'taxonomy',
 >                                        -dbfrom => 'protein',
 >                                        -correspondence => 1,
 >                                        -id => \@ids);
 > 
 > # iterate through the LinkSet objects
 > while (my $ds = $factory->next_LinkSet) {
 >     $taxa{($ds->get_submitted_ids)[0]} = ($ds->get_ids)[0]
 > }
 > 
 > @taxa = @taxa{@ids};
 > 
 > $factory = Bio::DB::EUtilities->new(-eutil => 'esummary',
 >                                     -db => 'taxonomy',
 >                                     -id => \@taxa );
 > 
 > while (local $_ = $factory->next_DocSum) {
 >     $names{($_->get_contents_by_name('TaxId'))[0]} =
 >         ($_->get_contents_by_name('ScientificName'))[0];
 > }
 > 
 > foreach (@ids) {
 >     $idmap{$_} = $names{$taxa{$_}};
 > }
 > 
 > # %idmap is
 > # 1621261 => 'Mycobacterium tuberculosis H37Rv'
 > # 20807972 => 'Thermoanaerobacter tengcongensis MB4'
 > # 68536103 => 'Corynebacterium jeikeium K411'
 > # 730439 => 'Bacillus caldolyticus'
 > # 89318838 => undef (this record has been removed from the db)
 > 
 > 1;
 > 
 > Thanks,
 > Chris
 > 
 > On Mon 08/30/10 09:36 , "Chris Fields" cjfields at illinois.edu sent:
 > Chris,
 > 
 > Regarding a fix for that script, we would have to see your modified
 > script and the error. However, there are modules within BioPerl to
 > essentially do what you want, in particular, Bio::DB::Taxonomy.
 > 
 > chris
 > 
 > On Aug 30, 2010, at 7:55 AM, J. Christopher Ellis wrote:
 > 
 > > Hi All,
 > > 
 > > I am trying to extract the entire taxonomy of an organism, including
 > > the classifications. Something like...
 > > 
 > > Phylum:Proteobacteria, Class:Gammaproteobacteria,
 > > Order:Enterobacteriales, Family:Enterobacteriaceae, Genus:Escherichia
 > > 
 > > I am not worried about format, just that I get the information and the
 > > associated level of hierarchy. The script found at
 > > http://bioperl.org/wiki/Species_names_from_accession_numbers
 > > seemed like a good starting point, so I copied it and tried to run it
 > > but got an error.
 > > 
 > > My first question is "Is there a known fix for this?" and my second
 > > question is how do I get the full hierarchical information (as seen
 > > above) with the taxonomy db?
 > > 
 > > Thanks for all your help in advance!
 > > 
 > > Chris 
 > > 
 > > 
 > > _______________________________________________
 > > Bioperl-l mailing list
 > > Bioperl-l at lists.open-bio.org
 > > http://lists.open-bio.org/mailman/listinfo/bioperl-l
 > 
 > 
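
The warning quoted in this thread may arise when an undefined taxon ID reaches the esummary request. The script's own closing comment notes that record 89318838 has been removed, so the hash slice that builds @taxa would hold an undef in that slot. Below is a minimal core-Perl sketch of filtering such entries first. It assumes that diagnosis is correct, which the thread does not confirm, and the %taxa values are illustrative placeholders rather than real elink output.

```perl
use strict;
use warnings;

my @ids = qw(1621261 89318838 68536103 20807972 730439);

# Hypothetical elink result: placeholder taxon IDs, with no entry for
# 89318838 (the script's comments say that record was removed).
my %taxa = (
    1621261  => 'taxid_A',
    68536103 => 'taxid_B',
    20807972 => 'taxid_C',
    730439   => 'taxid_D',
);

# A hash slice yields undef for every key that is missing from %taxa ...
my @taxa = @taxa{@ids};

# ... so keep only the defined values before building the next request,
# and remember which submitted IDs had no taxonomy link.
my @valid   = grep { defined } @taxa;
my @missing = grep { !defined $taxa{$_} } @ids;

print "usable taxon ids: @valid\n";
print "ids without a taxonomy link: @missing\n";
```

In the script above, @valid rather than @taxa would then be the list handed to the esummary factory, and the IDs in @missing would simply map to undef in %idmap.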

 


