[Bioperl-l] Re: grouping sequences by DNA-binding domains -- elaboration

Brian Osborne brian_osborne at cognia.com
Tue Oct 18 17:08:44 EDT 2005


Stefan,

Yes, the hyperlinks are in the text just like they were in our old friend
LocusLink. But it seems that Olena wanted information about the domains,
like whether or not the domain was DNA-binding - is this in the ASN?

In my too-brief response I was attempting to say that starting with a list
of domains, or domain ids, and finding out whether they were DNA-binding
domains or not seems to imply working with an ontology.
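
To make the ontology idea concrete, here is a minimal sketch, not anything that ships with Bioperl. It assumes each domain has already been mapped to ontology terms and then walks a hand-written is_a graph looking for "DNA binding" (GO:0003677). The domain ids, the %domain_terms table, and the has_ancestor and is_dna_binding helpers are all hypothetical; a real script would load the graph and the annotations from files.

```perl
use strict;
use warnings;

# Toy is_a graph (child term => parent terms); a hand-written subset
# for illustration only, not a complete ontology.
my %is_a = (
    'GO:0043565' => ['GO:0003677'],    # sequence-specific DNA binding -> DNA binding
    'GO:0003677' => ['GO:0003676'],    # DNA binding -> nucleic acid binding
    'GO:0003676' => ['GO:0005488'],    # nucleic acid binding -> binding
    'GO:0004672' => ['GO:0016301'],    # protein kinase activity -> kinase activity
);

# Hypothetical domain-to-term annotations; the domain ids are placeholders.
my %domain_terms = (
    'domain_A' => ['GO:0043565'],
    'domain_B' => ['GO:0004672'],
    'domain_C' => [],
);

# Return 1 if $term is $target or has $target among its ancestors.
sub has_ancestor {
    my ($term, $target, $seen) = @_;
    $seen ||= {};
    return 1 if $term eq $target;
    return 0 if $seen->{$term}++;      # already visited; guards against cycles
    foreach my $parent (@{ $is_a{$term} || [] }) {
        return 1 if has_ancestor($parent, $target, $seen);
    }
    return 0;
}

# Return 1 if any term attached to $domain falls under "DNA binding".
sub is_dna_binding {
    my ($domain) = @_;
    foreach my $term (@{ $domain_terms{$domain} || [] }) {
        return 1 if has_ancestor($term, 'GO:0003677');
    }
    return 0;
}

foreach my $domain (sort keys %domain_terms) {
    printf "%s\t%s\n", $domain,
        is_dna_binding($domain) ? 'DNA-binding' : 'other';
}
```

Walking up to a general term rather than comparing term ids directly means a domain annotated only with a more specific child term is still grouped as DNA-binding.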

Brian O.


On 10/18/05 3:33 PM, "Stefan Kirov" <skirov at utk.edu> wrote:

> Actually Brian, Bio::SeqIO::entrezgene will extract this data from the
> ASN1 file:
> 
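> use strict;
> use warnings;
> use Bio::SeqIO;
> 
> # $file, $fs and $gid were not defined in the original snippet; the
> # values below are assumptions.
> my $file = shift or die "Usage: $0 <entrezgene ASN.1 file>\n";
> my $fs   = "\t";    # output field separator (assumed)
> 
> my $eio = Bio::SeqIO->new(-file => $file, -format => 'entrezgene',
>                           -debug => 'off', -service_record => 'no');
> my ($seq, $struct, $uncapt) = $eio->next_seq;
> my $gid = $seq->accession_number;    # assumed to hold the Entrez Gene ID
> 
> my @contigs = $struct->get_members();    # (-authority => 'genomic');
> foreach my $contig (@contigs) {
>     next unless $contig->authority eq 'Product';
>     foreach my $sf ($contig->get_SeqFeatures) {
>         foreach my $dblink ($sf->annotation->get_Annotations('dblink')) {
>             # Prefer the link's anchor text; fall back to its optional id.
>             my $key = $dblink->{_anchor} ? $dblink->{_anchor}
>                                          : $dblink->optional_id;
>             my $db = $dblink->database;
>             # Keep only CDD links or conserved-domain features.
>             next unless $db =~ /cdd/i || $sf->primary_tag =~ /conserved/i;
>             my $desc;
>             # Anchors of the form "id:description" are split in two.
>             if ($key =~ /:/) {
>                 ($key, $desc) = split(/:/, $key);
>             }
>             # Print undefined fields as empty strings.
>             print join($fs, map { defined $_ ? $_ : '' }
>                 $gid, $contig->id, $desc, $key, $sf->score, '', '',
>                 $db, $sf->start, $sf->end), "\n";
>         }
>     }
> }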
> 
> I guess it is really a good time to write those docs :-)
> Stefan
> 
> Brian Osborne wrote:
> 
>> Olena,
>> 
>> I'm pretty sure that there's no code in Bioperl that accesses or parses CDD;
>> hopefully someone will correct me if I'm wrong.
>> 
>> Brian O.
>> 
>> 
>> On 10/18/05 2:26 PM, "Olena Morozova" <olenka.m at gmail.com> wrote:
>> 
>>  
>> 
>>> Hi Brian,
>>> 
>>> Thank you for your reply. It is the CDD (Conserved Domain Database) on
>>> the NCBI web site.
>>> Olena
>>> 
>>> On 10/18/05, Brian Osborne <brian_osborne at cognia.com> wrote:
>>>    
>>> 
>>>> Olena,
>>>> 
>>>> What database contains the information you're looking for?
>>>> 
>>>> Brian O.
>>>> 
>>>> 
>>>> On 10/16/05 8:17 PM, "Olena Morozova" <olenka.m at gmail.com> wrote:
>>>> 
>>>>      
>>>> 
>>>>> Hi again,
>>>>> 
>>>>> I just figured out how to obtain a list of conserved domains for a
>>>>> given sequence using the SeqHound.pm module available at
>>>>> http://www.blueprint.org/seqhound/apifunctslist.html
>>>>> 
>>>>> Now I have a list of conserved domains for a given sequence and I need
>>>>> to extract information as to what these domains are and which ones are
>>>>> DNA-binding. Any help on this will be greatly appreciated.
>>>>> 
>>>>> Thanks again,
>>>>> Olena
>>>>> 
>>>>> 
>>>>> On 10/16/05, Olena Morozova <olenka.m at gmail.com> wrote:
>>>>>        
>>>>> 
>>>>>> I have a list of transcription factor sequences, and I need to group
>>>>>> them according to the DNA-binding domains based on the classification
>>>>>> by TRANSFAC or any other database. Basically, I just need to extract
>>>>>> the DNA-binding domain information for a particular TF from a database
>>>>>> like TRANSFAC (I don't know what other databases would have this
>>>>>> information, but any will do). Does anyone have any idea how to do this?
>>>>>> Thank you very much for your help and time.
>>>>>> 
>>>>>> Olena
>>>>>> 
>>>>>>          
>>>>>> 
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at portal.open-bio.org
>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>        
>>>>> 
>>>> 
>>>>      
>>>> 
>> 
>> 