[Bioperl-l] parsing xml output

Jason Stajich jason.stajich at duke.edu
Fri May 19 22:40:54 UTC 2006


There is a gi2taxid table in the /pub/taxonomy part of the NCBI FTP site
(ftp.ncbi.nih.gov) -- I have used this to take GI numbers from a report
and get taxonomy for overall classification. I think something like
this exists in the scripts or examples directory in the bioperl
distro. I know I posted about it when I wrote it a while ago.
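
A minimal sketch of that lookup, written in Python rather than Perl just to show the logic. It assumes the table is a plain two-column, tab-delimited dump (GI, then taxid) and that hit IDs carry the usual "gi|NNN|..." prefix. The function names and the sample data are made up for illustration; this is not the script from the bioperl distro.

```python
import io
import re

# Matches the GI number in an NCBI-style hit ID such as "gi|101|ref|XP_000001.1|".
GI_PATTERN = re.compile(r"gi\|(\d+)")


def extract_gi(hit_id):
    """Return the GI number from a hit ID as an int, or None if there is none."""
    match = GI_PATTERN.search(hit_id)
    return int(match.group(1)) if match else None


def lookup_taxids(table, wanted_gis):
    """Stream a two-column (GI, taxid) table and keep only the wanted GIs.

    `table` is any iterable of lines, e.g. an open file handle, so the
    whole table never has to be loaded into memory.
    """
    wanted = set(wanted_gis)
    found = {}
    for line in table:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        gi, taxid = int(fields[0]), int(fields[1])
        if gi in wanted:
            found[gi] = taxid
            if len(found) == len(wanted):
                break  # every requested GI has been resolved
    return found


# Made-up sample data standing in for the downloaded table and a BLAST report.
sample_table = io.StringIO("101\t9606\n102\t10090\n103\t7227\n")
hit_ids = [
    "gi|101|ref|XP_000001.1|",
    "gi|103|gb|AAA00001.1|",
    "sp|P12345|FAKE_HUMAN",  # no GI, so it is skipped
]

gis = [gi for gi in map(extract_gi, hit_ids) if gi is not None]
taxids = lookup_taxids(sample_table, gis)
print(taxids)
```

Streaming the table and keeping only the GIs seen in the report keeps memory use small, which matters if the table is large. The resulting taxids would still need to be mapped to names or lineages in a separate step.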

-jason
On May 19, 2006, at 5:30 PM, Hubert Prielinger wrote:

> ok, thanks,
> it appears that I only need the species where the Protein is derived
> from, so I guess Bio::Species would satisfy me, or?
> and it would work that I just pull off the accession from the blast
> output file and then assign the accession code and get as return value
> the  species name.
> is it possible to just assign the accession code, because I looked up
> but they were always talking of the entire file.
>
> regards
>>
>>
>> Christopher Fields wrote:
>>> You'll have to pull the GI or accession from each hit and do a
>>> lookup, either by grabbing the sequence and using Bio::Species or
>>> by using Bio::DB::Taxonomy; there isn't any tax information
>>> directly incorporated into BLAST reports AFAIK.
>>>
>>> Chris
>>>
>>> ---- Original message ----
>>>
>>>> Date: Fri, 19 May 2006 10:52:28 -0600
>>>> From: Hubert Prielinger <hubert.prielinger at gmx.at>
>>>> Subject: Re: [Bioperl-l] parsing xml output
>>>> To: Warren Gish <gish at watson.wustl.edu>, bioperl-l at bioperl.org
>>>>
>>>> hi,
>>>> I wondered whether it is also possible in the xml output (either WU
>>>> or NCBI - Blast) to get the species (taxonomy) for every hit, if I
>>>> do a general search.
>>>> regards
>>>>
>>>> Warren Gish wrote:
>>>>
>>>>> Right, the WU-BLAST tabbed output contains more fields.  (See
>>>>> http://blast.wustl.edu/blast/tabular.html).
>>>>> --Warren
>>>>>
>>>>>
>>>>>> Whoops - sorry Warren - for some reason I had it in my mind that
>>>>>> it was different.  So the blastxml parser should work fine.  The
>>>>>> WUBLAST tab-delimited output is different from NCBI's -m8/9
>>>>>> though, right?
>>>>>>
>>>>>> -jason
>>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12/




