[Bioperl-l] Primary seq primary_id?

Wiepert, Mathieu Wiepert.Mathieu@mayo.edu
Thu, 7 Nov 2002 14:32:44 -0600


Hi,

Just so I can get this straight: fasta.pm parses my sequence, and the primary_id is eventually set in SeqFastaSpeedFactory on the PrimarySeq object it creates.  Since I can look at the $seq object afterwards and see that primary_id is set, shouldn't I be able to expect Bio::Seq::primary_id to send it back to me?

I had the same question as you about the POD: why is this method *not* delegated to the internal PrimarySeq object?


If I ignore the POD, and change the code for the sub primary_id to be 

sub primary_id {
 return shift->primary_seq->primary_id(@_);
}

Then my program works, and the Seq and SeqIO tests still pass.  Is this not a good fix, though?
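
To make the difference concrete, here is a minimal sketch of the two behaviours. The My::* packages are made-up stand-ins, not the real Bio::Seq / Bio::PrimarySeq code; I am only assuming, going by Hilmar's explanation quoted below, that the non-delegating version falls back to the stringified object when nothing was set on it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::PrimarySeq: it stores primary_id itself.
package My::PrimarySeq;
sub new {
    my ($class, %args) = @_;
    return bless { primary_id => $args{-primary_id} }, $class;
}
sub primary_id {
    my $self = shift;
    $self->{primary_id} = shift if @_;
    return $self->{primary_id};
}

# Hypothetical stand-in for a Bio::Seq that keeps its own primary_id slot.
package My::SeqOwn;
sub new {
    my ($class, %args) = @_;
    return bless { primary_seq => $args{-primary_seq} }, $class;
}
sub primary_seq { return $_[0]->{primary_seq} }
sub primary_id {
    my $self = shift;
    $self->{primary_id} = shift if @_;
    # Assumed fallback: the stringified reference when nothing was set.
    return defined $self->{primary_id} ? $self->{primary_id} : "$self";
}

# Same wrapper, but primary_id is handed to the internal PrimarySeq.
package My::SeqDelegating;
our @ISA = ('My::SeqOwn');
sub primary_id {
    return shift->primary_seq->primary_id(@_);
}

package main;

my $pseq = My::PrimarySeq->new(-primary_id => 'CYS1_DICDI');
my $own  = My::SeqOwn->new(-primary_seq => $pseq);
my $del  = My::SeqDelegating->new(-primary_seq => $pseq);

print $own->primary_id, "\n";   # stringified object, e.g. My::SeqOwn=HASH(0x...)
print $del->primary_id, "\n";   # the id stored on the inner object
```

With the delegating version, a setter call such as $del->primary_id('foo') also lands on the inner object, so the wrapper and the PrimarySeq cannot disagree.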



FYI
This is the very simple test program I have:

#!/usr/bin/perl -w
use Bio::SeqIO;
use strict;

my $seq = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' );
my $input = $seq->next_seq();
my $primary_id = $input->primary_id;
print $primary_id;

And the object $input always looks like this (except the hash reference of course ;-)

0  Bio::Seq=HASH(0x8626904)
   'primary_seq' => Bio::PrimarySeq=HASH(0x86268e0)
      'alphabet' => 'protein'
      'desc' => 'fragment'
      'display_id' => 'CYS1_DICDI'
      'primary_id' => 'CYS1_DICDI'
      'seq' => 'SCWSFSTTGNVEGQHFISQNKLVSLSEQNLVDCDHECMEYEGE'

I'll go submit a bug I guess.

Thanks for the help,

-Mat

> -----Original Message-----
> From: Hilmar Lapp [mailto:hlapp@gnf.org]
> Sent: Thursday, November 07, 2002 2:11 PM
> To: Hilmar Lapp; Wiepert, Mathieu; bioperl-l@bioperl.org
> Subject: RE: [Bioperl-l] Primary seq primary_id?
> 
> 
> Sorry I was too fast. Please file it as a bug report.
> 
> First, the POD of Bio::Seq::primary_id explicitly states that 
> it is not delegated to the primary_seq. Can anyone remember 
> why this is or why this should stay?
> 
> Second, Bio::Seq::new does recognize and honor -primary_id, I 
> overlooked it. Can't be the problem.
> 
> Needs to be investigated. Feel welcome to do so ...
> 
> 	-hilmar
> 
> > -----Original Message-----
> > From: Hilmar Lapp 
> > Sent: Thursday, November 07, 2002 12:04 PM
> > To: 'Wiepert, Mathieu'; bioperl-l@bioperl.org
> > Subject: RE: [Bioperl-l] Primary seq primary_id?
> > 
> > 
> > By calling $input->primary_id() :) Interestingly I just 
> > realized the fasta parser is among the few that set this 
> > property. It also appears to be recognized by PrimarySeq::new 
> > ... weird. File it as a bug report, I or others need to see 
> > whether we can reproduce this.
> > 
> > You rarely want primary_id() BTW. A primary_id would be the 
> > GenBank GI number as an example. Usually what you're after 
> > for fasta-returned seqs is display_id.
> > 
> > Ahem. I just see this _IS_ a bug. The problem is Bio::Seq 
> > implements primary_id itself, which it shouldn't do (it 
> > should delegate to the primary_seq object). Bio::Seq::new 
> > doesn't honor -primary_id (which is OK if it delegated).
> > 
> > I'll fix this in a second.
> > 
> > 	-hilmar
> > 
> > > -----Original Message-----
> > > From: Wiepert, Mathieu [mailto:Wiepert.Mathieu@mayo.edu]
> > > Sent: Thursday, November 07, 2002 11:49 AM
> > > To: Hilmar Lapp; bioperl-l@bioperl.org
> > > Subject: RE: [Bioperl-l] Primary seq primary_id?
> > > 
> > > 
> > > Hi,
> > > 
> > > So I am confused then.  The primary_id is set, that is what I 
> > > wanted, the object looks like this.  Should the primary_id 
> > > slot not be filled in this case?  The primary id was set in 
> > > the fasta.pm module, in the next_seq sub.  I don't have an 
> > > accession number.
> > > 
> > > This is what the object is looking like to me...
> > > 0  Bio::Seq=HASH(0x853cfe0)
> > >    'primary_seq' => Bio::PrimarySeq=HASH(0x853cfbc)
> > >       'alphabet' => 'protein'
> > >       'desc' => 'fragment'
> > >       'display_id' => 'CYS1_DICDI'
> > >       'primary_id' => 'CYS1_DICDI'
> > >       'seq' => 'SCWSFSTTGNVEGQHFISQNKLVSLSEQNLVDCDHECMEYEGE'
> > > 
> > > How am I supposed to get CYS1_DICDI from the primary_id field?
> > > 
> > > -Mat
> > > 
> > > > -----Original Message-----
> > > > From: Hilmar Lapp [mailto:hlapp@gnf.org]
> > > > Sent: Thursday, November 07, 2002 1:42 PM
> > > > To: Wiepert, Mathieu; bioperl-l@bioperl.org
> > > > Subject: RE: [Bioperl-l] Primary seq primary_id?
> > > > 
> > > > 
> > > > You do get a string. It's just the memory location of the 
> > > > object to fulfill the requirement to return something which 
> > > > is unique in the application. If you don't like that string, 
> > > > set e.g. $input->primary_id($input->accession_number).
> > > > 
> > > > 	-hilmar
> > > > 
> > > > > -----Original Message-----
> > > > > From: Wiepert, Mathieu [mailto:Wiepert.Mathieu@mayo.edu]
> > > > > Sent: Thursday, November 07, 2002 10:42 AM
> > > > > To: 'bioperl-l@bioperl.org'
> > > > > Subject: [Bioperl-l] Primary seq primary_id?
> > > > > 
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > I am pretty sure that something is messed up for me.  When I 
> > > > > call Bio::Seq to get the primary_id of a sequence, I no 
> > > > > longer get a string...
> > > > > 
> > > > > my $seq = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' );
> > > > > my $input = $seq->next_seq();
> > > > > my $primary_id = $input->primary_id;
> > > > > print $primary_id;
> > > > > 
> > > > > gives me
> > > > > 
> > > > > Bio::Seq=HASH(0x82d88d4)
> > > > > 
> > > > > Is there something really silly that I missed somewhere?  I 
> > > > > used to get strings...
> > > > > 
> > > > > -Mat