[Bioperl-l] Locuslink parser

Hilmar Lapp hlapp at gnf.org
Fri Feb 13 14:28:18 EST 2004


On Friday, February 13, 2004, at 08:53  AM, Law, Annie wrote:

> I am learning more about the objects I am using.  Do you know if there
> is some documentation with figures showing how the various objects
> relate to the Bio::Seq class, e.g. the relationship between Bio::Seq
> and Bio::Annotation::Collection, among others?
>

Brian answered that, right?


> However, I am still unable to get all of the fields, for example
> SUMFUNC (a brief summary of the function of the products of this
> locus), ORGANISM, OMIM, etc.  I am not sure how to access these.

SUMFUNC becomes an annotation of type Bio::Annotation::SimpleValue, 
with a tag name of SUMFUNC. ORGANISM is a Bio::Species object available 
through $seq->species. OMIM references should be available as dbxrefs 
(Bio::Annotation::DBLink), possibly with the database renamed to 'MIM'.

I don't think there is a good reference yet for which tag ends up 
where, but the bottom line is that almost every tag ends up as an 
annotation of some kind, with ORGANISM being a notable exception.
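
As a rough sketch of the lookups described above (assuming BioPerl is 
installed; the record is built in memory here instead of being parsed, 
and the accession, summary text, and MIM ID are made-up placeholders), 
it could go something like this:

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::Species;
use Bio::Annotation::SimpleValue;
use Bio::Annotation::DBLink;

# Stand-in for a sequence object as the locuslink parser might return it.
# All values below are placeholders, not real LocusLink data.
my $seq = Bio::Seq->new(-accession_number => '0000');
$seq->species(Bio::Species->new(-classification => [qw(sapiens Homo)]));

my $ac = $seq->annotation;
$ac->add_Annotation('SUMFUNC',
    Bio::Annotation::SimpleValue->new(-value => 'placeholder summary'));
$ac->add_Annotation('dblink',
    Bio::Annotation::DBLink->new(-database   => 'MIM',
                                 -primary_id => '000000'));

# SUMFUNC: a SimpleValue annotation, looked up by its tag name
my ($sumfunc) = $ac->get_Annotations('SUMFUNC');
print "SUMFUNC:  ", $sumfunc->value, "\n";

# ORGANISM: not an annotation, but the species object of the sequence
print "ORGANISM: ", $seq->species->binomial, "\n";

# OMIM: dbxref annotations whose database is 'MIM'
my @mim = grep {
    $_->isa('Bio::Annotation::DBLink') && $_->database eq 'MIM'
} $ac->get_Annotations();
print "OMIM:     ", $_->primary_id, "\n" for @mim;
```

With a real record you would get $seq from the parser instead of 
building it by hand; the three lookups at the end stay the same.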

>
> It also seems if I use
> 	foreach my $ann (@annotations) {
> 		if ($ann->isa("Bio::Ontology::TermI")) {
> 			# this is an ontology term as annotation
> 		}
> 		if ($ann->isa("Bio::Annotation::DBLink")) {
> 			# this is a dbxref annotation
> 		}
> 	}
> I am filtering out some of the annotation types such as
> OFFICIAL_GENE_NAME, CHR, OFFICIAL_SYMBOL, etc.

I'm not sure I understand what you mean. I just gave some examples of 
how to test what type an annotation is. There are more types than the 
two given in the example. The array you get from 
$seq->annotation->get_Annotations() contains every annotation 
that has been associated with the sequence.

> I only get GO information and DBLINK information.
> If I use the following I will get the maximum number of annotation and
> dbxref fields I have been able to extract so far. Is there another
> category I am missing?  Better yet, how do I find out what the other
> missing categories are, i.e. other than Bio::Ontology::TermI or
> Bio::Annotation::DBLink?
>

Check out Bio/Annotation/*.pm to see all theoretically possible types. 
The most important are DBLink, SimpleValue, OntologyTerm (which 
basically adapts a Bio::Ontology::TermI), Comment, and Reference. Note 
that Reference is not used by the locuslink parser at this point.
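
To illustrate testing for more than two types, here is a minimal 
sketch (assuming BioPerl is installed; describe_annotation is a helper 
name I made up, and the annotation values are placeholders) that 
dispatches on the classes listed above:

```perl
use strict;
use warnings;
use Bio::Ontology::Term;
use Bio::Annotation::OntologyTerm;
use Bio::Annotation::DBLink;
use Bio::Annotation::SimpleValue;
use Bio::Annotation::Comment;

# Hypothetical helper: return a one-line description of an annotation,
# depending on which annotation class it belongs to.
sub describe_annotation {
    my $ann = shift;
    return 'term: '   . $ann->identifier
        if $ann->isa('Bio::Ontology::TermI');
    return 'dbxref: ' . $ann->database . ':' . $ann->primary_id
        if $ann->isa('Bio::Annotation::DBLink');
    return 'value: '  . $ann->tagname . '=' . $ann->value
        if $ann->isa('Bio::Annotation::SimpleValue');
    return 'comment: ' . $ann->text
        if $ann->isa('Bio::Annotation::Comment');
    return 'other: '  . ref($ann);
}

# Placeholder annotations, one of each kind, built by hand.
my $term = Bio::Ontology::Term->new(-identifier => 'GO:0005524',
                                    -name       => 'ATP binding');
my @anns = (
    Bio::Annotation::OntologyTerm->new(-term => $term),
    Bio::Annotation::DBLink->new(-database   => 'MIM',
                                 -primary_id => '000000'),
    Bio::Annotation::SimpleValue->new(-tagname => 'CHR',
                                      -value   => '1'),
    Bio::Annotation::Comment->new(-text => 'placeholder comment'),
);

my @lines = map { describe_annotation($_) } @anns;
print "$_\n" for @lines;
```

The final 'other' branch is there so that a type you did not think of 
shows up with its class name instead of being silently dropped.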

>
> **In the example you provided below I can see all of the
> Bio::Ontology::TermI annotations being grouped and stuck in
> @term_annotations, but what is the $_-> for? And why do you need the
> line $seq->get_Annotations(); below it?

It's Perl syntax, in part obfuscated by my or your email reader 
introducing a line break after the closing curly brace. Check out

	$ perldoc -f map

for documentation on how to use the map function. Now, using the map 
function in my example was in fact wrong, and calling get_Annotations() 
on a Bio::SeqI object also won't work. Sorry about these mistakes. 
Here's the corrected version:

	@term_anns = grep { $_->isa("Bio::Ontology::TermI"); } 
$seq->annotation->get_Annotations();

(There was no linebreak above, but adding one won't bother perl.) 
Again, you can read about grep in perl by

	$ perldoc -f grep
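
To make the difference between map and grep concrete, here is a tiny 
sketch in plain Perl. The My::Term and My::DBLink packages are invented 
stand-ins so that it runs without BioPerl; they are not real classes.

```perl
use strict;
use warnings;

# Invented stand-in classes, only so the example runs without BioPerl.
{ package My::Term;   sub new { return bless {}, shift } }
{ package My::DBLink; sub new { return bless {}, shift } }

my @anns = (My::Term->new, My::DBLink->new, My::Term->new);

# map returns whatever the block evaluates to for each element, so
# here you get one true/false value per annotation, not the objects.
my @mapped = map { $_->isa('My::Term') } @anns;

# grep returns the elements themselves for which the block is true.
my @terms = grep { $_->isa('My::Term') } @anns;

print scalar(@mapped), " values from map, ",
      scalar(@terms), " objects from grep\n";
```

So map hands back a list of booleans as long as the input, whereas grep 
hands back only the matching annotation objects, which is what the 
filter needs.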

-hilmar

> @term_annotations = map { $_->isa("Bio::Ontology::TermI"); }
> $seq->get_Annotations();
>
> Thanks very much,
> Annie.
>
>
>
> -----Original Message-----
> From: Hilmar Lapp [mailto:hlapp at gnf.org]
> Sent: Thursday, February 12, 2004 2:10 PM
> To: Law, Annie
> Cc: 'bioperl-l at bioperl.org'
> Subject: Re: [Bioperl-l] Locuslink parser
>
>
>
> On Thursday, February 12, 2004, at 09:46  AM, Law, Annie wrote:
>
>> I am most interested in obtaining the fields  locuslink id, GO id,
>> accession number, unigene id.
>
> The locuslink ID is the $seq->accession_number. GO should be there as
> term annotations, unigene ID and other accessions should be present as
> dbxref annotations.
>
> You can test for an annotation being a term annotation or a dbxref:
>
> 	foreach my $ann (@annotations) {
> 		if ($ann->isa("Bio::Ontology::TermI")) {
> 			# this is an ontology term as annotation
> 		}
> 		if ($ann->isa("Bio::Annotation::DBLink")) {
> 			# this is a dbxref annotation
> 		}
> 	}
>
> Using the map function you can easily filter for annotation types, for
> example:
>
> 	@term_annotations = map { $_->isa("Bio::Ontology::TermI"); }
> $seq->get_Annotations();
>
> BTW if you want to get all annotations from a seq object, you can just
> say $seq->get_Annotations() and omit the key.
>
> Hth,
>
> 	-hilmar
> -- 
> -------------------------------------------------------------
> Hilmar Lapp                            email: lapp at gnf.org
> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> -------------------------------------------------------------
>
>
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------



