[Bioperl-l] Directions for qual modules.

Jason Stajich jason@chg.mc.duke.edu
Tue, 7 Aug 2001 21:54:12 -0400 (EDT)


This is an excellent proposal, Chad. I say go with the things you
have outlined below.  Once you get some code committed, we can help out
wherever you would like people to jump in.

On Tue, 7 Aug 2001, Chad Matsalla wrote:
> 
> Ewan wrote:
> > I think the deliberate copying of PrimarySeq to PrimaryQual is fine. I
> > would put it in
> > Bio::Seq::PrimaryQual
> to which Malcom Cook replied:
> > I would rather create a new interface class, Bio::Seq::QualI
> 
> Perhaps I will make a Bio::Seq::QualI, and later move that into
> Bio::IdentifiableI?
> 
sounds good, let's see what Ewan cooks up for IdentifiableI

> Ewan wrote:
> > We really need a Bio::IdentifiableI interface that it could inherit
> > from for the identifier set.
> to which Heikki replied:
> > I agree. Let's put in Bio::IdentifiableI. Then Chad has to neither
> > inherit from Bio::PrimarySeqI nor duplicate code.
> 
> Is this something I can do? Now or later? I guess I am a bit confused as
> to how Bio::IdentifiableI would look and behave.
> 
> Malcom Cook wrote:
> > I would rather have Bio::Seq::Phred implement both Bio::PrimarySeqI and
> > Bio::Seq::QualI.
> 
> This is along the lines of what I was considering. The value of
> -format in:
> my $in_qual  = Bio::SeqIO->new(-file => "<t/qualfile.qual" , '-format' =>
> 'qual');
> will decide whether you get back a Bio::Seq::PrimaryQual object
> alone or a Bio::Seq::SeqWithQuality object with both quality and sequence
> objects inside of it.
> 
yeah - I think SeqWithQuality is better than tying this just to "phred"
Seq objects.
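
As a rough illustration of the container idea, here is a minimal sketch in plain Perl of an object that holds a sequence together with its per-base quality values. The package name My::SeqWithQuality, the -seq and -qual arguments, and the lengths_match method are all hypothetical names chosen for this example; this is not the BioPerl implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the proposed Bio::Seq::SeqWithQuality.
# It only pairs a sequence string with an array ref of scores.
package My::SeqWithQuality;

sub new {
    my ($class, %args) = @_;
    my $self = bless {
        seq  => $args{-seq},           # plain string of bases
        qual => $args{-qual} || [],    # array ref of integer scores
    }, $class;
    return $self;
}

sub seq  { $_[0]->{seq} }
sub qual { $_[0]->{qual} }

# True when there is exactly one quality score per base.
sub lengths_match {
    my $self = shift;
    return length($self->{seq} // '') == scalar @{ $self->{qual} };
}

package main;

my $swq = My::SeqWithQuality->new(-seq => 'act', -qual => [6, 6, 6]);
print $swq->seq, " ", join(' ', @{ $swq->qual }), "\n";
```

Keeping the two parts as separate members means a quality-only file can still produce a usable object with an empty sequence.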

> I know about the different formats that phred can write but at the moment
> I really only care about phd files:
> BEGIN_DNA
> a 6 1
> c 6 20
> t 6 17
> ...
> and fasta-style files containing quality values only. These are what can
> be found in phrap'ed consed project directories.
> 
> 
> <thinking>
> Should there be a Bio::SeqIO::phd and Bio::SeqIO::xbap and so on, one for
> each type of file? This sounds good to me and doesn't break the fact that
> at this time my Bio::SeqIO::qual only parses fasta-style quality files. It
> also seems more intuitive to do things this way rather than to pass in
> some flag when the SeqIO object is constructed.
> 
> my $in_qual  = Bio::SeqIO->new(-file => "<t/qualfile.qual" , '-format' =>
> 'qual');
> my $in_qual  = Bio::SeqIO->new(-file => "<t/phredfile.phd" , '-format' =>
> 'phd');
> my $in_qual  = Bio::SeqIO->new(-file => "<t/quality.xbap" , '-format' =>
> 'xbap');
> 
> Heikki, how does this tie into your idea of a Bio::Seq::QualIO? How would
> this account for .phd files with both quality and sequence?
> Bio::Seq::QualIO might only be intuitively useful for files (like my
> fasta-quality files) with quality values only. I am really interested in
> phd files (like above) that have both quality and sequence.
> 
> 
> 
> In any case, here is what I am going to do in the next little bit:
> 
> 1. Rename Bio::PrimaryQual -> Bio::Seq::PrimaryQual
> 
> 2. Verify that Bio::Seq::PrimaryQual uses $obj->qual() rather than
> $obj->seq() .
> Note - It already did this. :)
> 
> 3. same for $obj->subqual(10,20)
> Note - This returns a reference to an array. Is that OK?
> 
hmmm, I guess so - more efficient than returning an array, especially if
you are returning a large number of data points.
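
To make the array-reference point concrete, here is a small sketch of a subqual-style function in plain Perl. The function below is a hypothetical helper, not the Bio::Seq::PrimaryQual method, and the 1-based inclusive coordinates are an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not the Bio::Seq::PrimaryQual method itself.
# Takes an array ref of scores plus 1-based inclusive start and end
# positions (an assumed convention) and returns a new array ref.
sub subqual {
    my ($qual, $start, $end) = @_;
    die "bad range\n" if $start < 1 || $end > @$qual || $start > $end;
    return [ @{$qual}[ $start - 1 .. $end - 1 ] ];
}

my @scores = (6, 6, 6, 20, 17, 30);
my $slice  = subqual(\@scores, 2, 4);
print "@$slice\n";    # the caller dereferences the returned reference
```

The caller receives a single scalar and has to dereference it with @$slice, which is the main usability cost of returning a reference instead of a list.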

> 4. Create a Bio::Seq::QualI and cut-and-paste ID-related things from
> Bio::PrimarySeqI where I find them useful.
> 
> 5. Create a Bio::Seq::SeqWithQuality that has a Bio::PrimarySeq and a
> Bio::Seq::PrimaryQual .
> 
> 6. "Fix" Bio::SeqIO::qual. Should this return a Bio::Seq::PrimaryQual
> object or a Bio::Seq::SeqWithQuality with no sequence?
> 
> 7. Create Bio::SeqIO::phd. This will return a Bio::Seq::SeqWithQuality
> object.
> 
excellent!
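
For item 7, the core of a phd reader could be sketched as below, using the sample lines quoted earlier in this message. This is a plain-Perl illustration only: parse_phd_dna is a hypothetical function, not Bio::SeqIO::phd, and it assumes the block is closed by an END_DNA line and that the third column can be ignored.

```perl
use strict;
use warnings;

# Hypothetical parser for the base/quality lines of a phd file.
# Returns the sequence string and an array ref of quality values.
sub parse_phd_dna {
    my ($text) = @_;
    my ($seq, @qual) = ('');
    my $in_dna = 0;
    for my $line (split /\n/, $text) {
        if ($line =~ /^BEGIN_DNA/) { $in_dna = 1; next }
        if ($line =~ /^END_DNA/)   { last }    # assumed block terminator
        next unless $in_dna;
        # Columns: base, quality, and a third value that is ignored here.
        if ($line =~ /^\s*([A-Za-z*])\s+(\d+)/) {
            $seq .= $1;
            push @qual, $2;
        }
    }
    return ($seq, \@qual);
}

my $phd = join "\n", 'BEGIN_DNA', 'a 6 1', 'c 6 20', 't 6 17', 'END_DNA';
my ($seq, $qual) = parse_phd_dna($phd);
print "$seq: @$qual\n";
```

The two return values map directly onto the sequence and quality members that a SeqWithQuality-style object would hold.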
> 
> I will go ahead and do this but I am ready to alter everything if it seems
> to be incorrect. This has just been a big experiment for me anyway. :)
> 
> When should I do the first commit of this stuff? When it works, or now? I
> have cvs access now.
> 
feel free, I will tag main trunk for 0.9.0 dev release after I have tested
on Thursday.
> 
> 
> Chad Matsalla
> Agriculture & Agri-Food Canada
> Saskatoon, Saskatchewan, Canada