[Bioperl-l] GO terms not present in Swiss annotation object
Chris Fields
cjfields at uiuc.edu
Tue Nov 21 18:54:32 EST 2006
Please always reply to the list as well. I would suggest updating to a
newer version; many changes have been made to GenBank/SwissProt/EMBL
parsing since release 1.4, including how dblinks are handled. If you're
using Windows you'll need to follow the instructions on the website for
the latest release candidate:
http://www.bioperl.org/wiki/Installing_Bioperl_on_Windows
Note that the release candidates are located in a different
repository, so you'll need to set that repository up to find them.
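If you want to confirm that the GO cross-references really are in your
local flat file, independent of what the BioPerl parser does with them,
a small plain-Perl check along these lines might help. This is only a
sketch: it doesn't use BioPerl at all, extract_go_terms is a made-up
helper name, and the regex assumes the DR line layout from your P19351
example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): collect GO cross-references
# from raw Swiss-Prot flat-file text by matching "DR   GO; ..." lines.
sub extract_go_terms {
    my ($text) = @_;
    my @terms;
    for my $line (split /\n/, $text) {
        # Assumed layout: DR   GO; GO:0007498; P:mesoderm development; IEP:FlyBase.
        next unless $line =~ /^DR\s+GO;\s*(GO:\d+);\s*([^;]+);\s*(.+?)\.?\s*$/;
        push @terms, { id => $1, term => $2, evidence => $3 };
    }
    return @terms;
}

# Two DR lines taken from the P19351 example; only the GO line should match.
my $entry = <<'END';
DR   FlyBase; FBgn0004169; up.
DR   GO; GO:0007498; P:mesoderm development; IEP:FlyBase.
END

for my $go (extract_go_terms($entry)) {
    print join("\t", $go->{id}, $go->{term}, $go->{evidence}), "\n";
}
```

Feed it the raw text of an entry from your own flat file. If GO lines
show up here but not among the DBLink annotations, the parser in your
installed version is the likely culprit.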
chris
On Nov 21, 2006, at 2:32 PM, Juan Cristobal Vera wrote:
> ok, thanks for responding!
> I'm using ActivePerl 5.8.8 build 819 on a Windows machine (sorry)
> and the BioPerl 1.4 PPM3 package. Perhaps this is too old?
> Here's part of my code (mostly derived from bioperl docs):
> .........................
> #cut
>
> #get sequence and create seq object
> $seqInObj = $indexObj->get_Seq_by_id($line);
>
> #cut
>
> if (defined $seqInObj->annotation){
>     $annotObj = $seqInObj->annotation; #create annotation object
>     foreach $key ($annotObj->get_all_annotation_keys){
>         @values = $annotObj->get_Annotations($key);
>         foreach $value (@values){
>             if (lc($key) eq "dblink"){
>                 print $outfh "Annotation: $key\n";
>                 print $outfh $value->as_text,"\n";
>                 $dbhash_ref = $value->hash_tree;
>                 for $dbKey (keys %{$dbhash_ref}) {
>                     #none of these prints produce GO terms
>                     print $outfh $dbKey,": ",$dbhash_ref->{$dbKey},"\n";
>                 }
>             }
>         }
>     }
> }
> .........................
> My program searches an indexed database on my machine, creates the
> objects, and prints out relevant annotations.
> Here are some of the accessions I used for testing:
> P19351 TNNT_DROME
> P36188 TNNI_DROME
> P11147 HSP7D_DROME
> ..........................................
> the relevant output looks something like this (for debugging) for
> P19351:
> ......................................................................
> Direct database link to X58188 in database EMBL
> database: EMBL
> comment: -; Genomic_DNA.
> primary_id: X58188
> optional_id: CAA41171.1
> Annotation: dblink
> Direct database link to X59376 in database EMBL
> database: EMBL
> comment: -; mRNA.
> primary_id: X59376
> optional_id: CAA42020.1
> Annotation: dblink
> Direct database link to AE003507 in database EMBL
> database: EMBL
> comment: -; Genomic_DNA.
> primary_id: AE003507
> optional_id: AAF48802.2
> Annotation: dblink
> Direct database link to AE003507 in database EMBL
> database: EMBL
> comment: -; Genomic_DNA.
> primary_id: AE003507
> optional_id: AAF48803.2
> Annotation: dblink
> Direct database link to AE003507 in database EMBL
> database: EMBL
> comment: -; Genomic_DNA.
> primary_id: AE003507
> optional_id: AAF48804.2
> Annotation: dblink
> Direct database link to AE003507 in database EMBL
> database: EMBL
> comment: -; Genomic_DNA.
> primary_id: AE003507
> optional_id: AAF48805.2
> Annotation: dblink
> Direct database link to AE003507 in database EMBL
> database: EMBL
> comment: -; Genomic_DNA.
> primary_id: AE003507
> optional_id: AAN09458.1
> Annotation: dblink
> Direct database link to AY122145 in database EMBL
> database: EMBL
> comment: -; mRNA.
> primary_id: AY122145
> optional_id: AAM52657.1
> Annotation: dblink
> Direct database link to A40547 in database PIR
> database: PIR
> primary_id: A40547
> optional_id: A40547
> Annotation: dblink
> Direct database link to B38594 in database PIR
> database: PIR
> primary_id: B38594
> optional_id: B38594
> Annotation: dblink
> Direct database link to Dm.1717 in database UniGene
> database: UniGene
> primary_id: Dm.1717
> optional_id: -
> Annotation: dblink
> Direct database link to P45379 in database HSSP
> database: HSSP
> primary_id: P45379
> optional_id: 1J1E
> Annotation: dblink
> Direct database link to P36188 in database IntAct
> database: IntAct
> primary_id: P36188
> optional_id: -
> Annotation: dblink
> Direct database link to dme:CG7178-PA in database KEGG
> database: KEGG
> primary_id: dme:CG7178-PA
> optional_id: -
> Annotation: dblink
> Direct database link to dme:CG7178-PB in database KEGG
> database: KEGG
> primary_id: dme:CG7178-PB
> optional_id: -
> Annotation: dblink
> Direct database link to dme:CG7178-PC in database KEGG
> database: KEGG
> primary_id: dme:CG7178-PC
> optional_id: -
> Annotation: dblink
> Direct database link to dme:CG7178-PD in database KEGG
> database: KEGG
> primary_id: dme:CG7178-PD
> optional_id: -
> Annotation: dblink
> Direct database link to dme:CG7178-PG in database KEGG
> database: KEGG
> primary_id: dme:CG7178-PG
> optional_id: -
> Annotation: dblink
> Direct database link to FBgn0004028 in database FlyBase
> database: FlyBase
> primary_id: FBgn0004028
> optional_id: wupA
> Annotation: dblink
> Direct database link to IPR001978 in database InterPro
> database: InterPro
> primary_id: IPR001978
> optional_id: Troponin
> Annotation: dblink
> Direct database link to PF00992 in database Pfam
> database: Pfam
> comment: 1
> primary_id: PF00992
> optional_id: Troponin
> ..............................................
> as you can see, no GO terms above
> ......................................................
> Compare that with the actual dblink content of the flat file for
> P19351:
> DR EMBL; X54504; CAA38366.1; -; mRNA.
> DR EMBL; AY439172; AAR24583.1; -; Genomic_DNA.
> DR EMBL; AY439172; AAR24584.1; -; Genomic_DNA.
> DR EMBL; AY439172; AAR24585.1; -; Genomic_DNA.
> DR EMBL; AY439172; AAR24586.1; -; Genomic_DNA.
> DR EMBL; AY439172; AAR24587.1; -; Genomic_DNA.
> DR EMBL; AY665838; AAU09446.1; -; mRNA.
> DR EMBL; AE014298; AAF48288.2; -; Genomic_DNA.
> DR EMBL; AE014298; AAF48289.2; -; Genomic_DNA.
> DR EMBL; AE014298; AAF48290.1; -; Genomic_DNA.
> DR EMBL; AE014298; AAX52491.1; -; Genomic_DNA.
> DR EMBL; AE014298; AAX52492.1; -; Genomic_DNA.
> DR EMBL; AE014298; AAX52493.1; -; Genomic_DNA.
> DR EMBL; AY051989; AAK93413.1; -; mRNA.
> DR EMBL; AY070875; AAL48497.1; ALT_SEQ; mRNA.
> DR PIR; S13251; S13251.
> DR UniGene; Dm.20472; -.
> DR HSSP; P45379; 1J1E.
> DR Ensembl; CG7107; Drosophila melanogaster.
> DR KEGG; dme:CG7107-PE; -.
> DR KEGG; dme:CG7107-PF; -.
> DR KEGG; dme:CG7107-PG; -.
> DR FlyBase; FBgn0004169; up.
> DR GO; GO:0007498; P:mesoderm development; IEP:FlyBase. ......
> where the GO term is the last entry in the dblink section above.
> Any help you could provide would be most welcome. Let me know if
> this is insufficient information or if you need a working script.
>
>
> On Tue, 21 Nov 2006 00:19:59 -0600 Chris Fields wrote:
> Juan,
>
> The DBLink objects should be generated. You'll need to give us a bit
> more information to go on, though. We need an example sequence, your
> local version of Bioperl, maybe a test script, etc. This is the right
> forum for this, yes, if you are using BioPerl.
>
> Chris
>
> On Nov 20, 2006, at 6:52 PM, Juan Cristobal Vera wrote:
> > Hi,
> > I'm writing a simple application to extract various fields from
> > swissprot objects and I can't access the GO terms found in the
> > "dblink" part of the swiss format flat files. I'm not a
> > professional programmer and I can't figure out why this is
> > occurring. All the other "dblink" keys are being generated as far
> > as I can tell (e.g. embl, pfam, etc). The GO terms are just skipped
> > over and it's driving me crazy. Not sure if this is a bug or a
> > deliberate strategy I'm unfamiliar with. I apologize if this is not
> > the correct forum to ask for this sort of help and would ask to be
> > directed to the proper one.
> >
> > Juan Cristobal Vera
> > Graduate Student
> > Department of Biology
> > Penn State University
> > 208 Mueller Laboratory
> > University Park, PA 16802
> > (814)863-2957
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Christopher Fields
> Postdoctoral Researcher
> Lab of Dr. Robert Switzer
> Dept of Biochemistry
> University of Illinois Urbana-Champaign
>
> Juan Cristobal Vera
> Graduate Student
> Department of Biology
> Penn State University
> 208 Mueller Laboratory
> University Park, PA 16802
> (814)863-2957
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign