[Bioperl-l] GenBank ASN.1 SeqIO parser

Ryan Golhar golharam at umdnj.edu
Fri Feb 8 00:46:21 UTC 2008


Thanks, Barry.  I did try using go-perl, but it is slow when processing 
queries.

I didn't know about ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2go.gz.  I 
think that is exactly what I'm looking for.
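
For anyone following along, here is a rough sketch of the lookup as I
understand it, in Python rather than Perl just to keep it short.  It is
untested against the real files and rests on two assumptions: that
gene2go is tab-separated with the columns tax_id, GeneID, GO_ID,
Evidence, Qualifier, GO_term, PubMed, Category (check the header line of
your copy), and that the OBO file uses the usual [Term] stanzas with
"id:" and "name:" lines.  The function names are my own, not from any
library.

```python
import gzip
from collections import defaultdict


def open_text(path):
    """Open a plain or gzip-compressed file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def read_gene2go(lines, wanted_gene_ids):
    """Collect (go_id, evidence, category) tuples for each wanted Gene ID.

    Assumed column order (an assumption, verify against the file header):
    tax_id, GeneID, GO_ID, Evidence, Qualifier, GO_term, PubMed, Category.
    """
    wanted = {str(g) for g in wanted_gene_ids}
    hits = defaultdict(list)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip the header line and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # skip rows that do not match the assumed layout
        gene_id, go_id, evidence, category = cols[1], cols[2], cols[3], cols[7]
        if gene_id in wanted:
            hits[gene_id].append((go_id, evidence, category))
    return dict(hits)


def read_obo_names(lines):
    """Map GO ID -> term name using only the [Term] stanzas of an OBO file."""
    names = {}
    in_term = False
    current_id = None
    for line in lines:
        line = line.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas describe GO terms.
            in_term = (line == "[Term]")
            current_id = None
        elif in_term and line.startswith("id: "):
            current_id = line[len("id: "):]
        elif in_term and line.startswith("name: ") and current_id:
            names[current_id] = line[len("name: "):]
    return names
```

To use it, pass open_text("gene2go.gz") to read_gene2go() along with the
Gene IDs of interest, pass the opened OBO file to read_obo_names(), and
then look up each returned GO ID in the names dictionary.  No graph
traversal is involved, so this only covers the "just the description"
case.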

Ryan


Barry Moore wrote:
> Ryan,
> 
> If you have a list of NCBI Gene IDs, then you can grab the gene2go flat 
> file from NCBI's FTP site: 
> ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2go.gz.  That will give you 
> tax_id, gene_id, go_id, evidence code, qualifier, category, etc.  From 
> there you can get the description from the GO OBO file 
> http://www.geneontology.org/ontology/gene_ontology_edit.obo.  If all you 
> need is the description then the file is pretty easy to parse on the 
> fly, but if you need to traverse the graphs or if you want an already 
> written parser then add go-perl 
> http://search.cpan.org/~cmungall/go-perl/go-perl.pod
> 
> Barry
> 
> 
> 
> On Feb 7, 2008, at 4:04 PM, Ryan Golhar wrote:
> 
>> Let me re-phrase then - I want to parse an entry such as this:
>>
>> http://www.ncbi.nlm.nih.gov/sites/entrez?db=gene&cmd=Retrieve&dopt=full_report&list_uids=11258 
>>
>>
>> to retrieve the text of the Gene Ontology entries and the associated GO
>> IDs for those entries.  Is this possible with BioPerl?  If so, how can I
>> do this with BioPerl?
>>
>> Ryan
>>
>>
>>
>> Jason Stajich wrote:
>>> Ugh - why parse ASN.1? NCBI provides converter applications in the NCBI
>>> toolkit for many formats: GenBank, XML, etc.
>>> On Feb 7, 2008, at 1:48 PM, Chris Fields wrote:
>>>
>>>> No.  The only ASN.1 parser is entrezgene.  You could probably try
>>>> building one using the same ASN.1 parser that SeqIO::entrezgene uses
>>>> (Bio::ASN1::EntrezGene); it includes a parser for sequences:
>>>>
>>>> http://search.cpan.org/~mingyiliu/Bio-ASN1-EntrezGene-1.091/lib/Bio/ASN1/Sequence.pm 
>>>>
>>>>
>>>>
>>>> chris
>>>>
>>>> On Feb 7, 2008, at 3:24 PM, Ryan Golhar wrote:
>>>>
>>>>> Is there a SeqIO parser module for GenBank ASN.1 format?  I thought
>>>>> it would have been genbank or entrezgene, but neither of them works.
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at lists.open-bio.org
>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>> Christopher Fields
>>>> Postdoctoral Researcher
>>>> Lab of Dr. Robert Switzer
>>>> Dept of Biochemistry
>>>> University of Illinois Urbana-Champaign



