Fri, 15 Nov 2002 17:41:48 +0000 (GMT)
On Fri, 15 Nov 2002, Jamie Hatfield wrote:
> Hello all, I need some advice.
> I work at the Arizona Genomics Institute under Dr. Cari Soderlund (if
> you don't know her, she used to work at the Sanger Centre, where she
> developed FPC - FingerPrinted Contigs - probably the most used software
> for physical map construction. She's here in Tucson, AZ after a short
> hiatus in Clemson, SC) Anyway, I've re-introduced our group to Bioperl
> and we are starting to take advantage of it wherever possible. Cari
> had seen Bioperl before, but that was pre 1.0 days, when things weren't
> stable enough (in her opinion) for a production environment, after which
> point, she never got around to looking into it again.
Cool. Say hi to Cari from me - I'm glad she's letting you look into this
after an initial oops experience...
> I noticed in some document from a presentation given by one of the
> Bioperl bigwigs (might have been LStein), that a FPC parser was a common
> request. If that's true, we know fpc probably as well as anybody else
> so it would make sense for us to develop/maintain it.
> So now we would like to make a contribution. Don't get too excited
> yet... It's not programmed yet. But we have found that in many, many
> different areas we need to read a .fpc file (and corresponding .cor
> file) and Do Something(c) with it. At the same time, I want to get more
> familiar with Bioperl. I've done fairly simple things, like reading in
> fasta/genbank/swissprot format files and working with alignments (as you
> all saw in my EST Alignment questions).
> The advice I want is as follows:
> 1) Where are the standards/guidelines for writing Bioperl modules?
Try reading biodesign.pod; it has some of the standards.
> 2) Any ideas on what features/functionality Bio::FPC should have?
I would have thought the following hierarchy is good:
Bio::FPC::FPCSet (has a set of)
  ::Contig (has a set of)
    ::BAC - may well inherit from Bio::SeqFeatureI, using the FPC band
      coordinates for start/end.
Bio::FPC::IO.pm - IO format - steal from Bio::AlignIO etc.
  ::FPC::IO::fpc - problem here - the "program" is also called fpc. Any
  other name for the format?
IO.pm would define methods ->next_fpcset and ->write_fpcset
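To make the containment concrete, here is a rough sketch of the set -> contig -> BAC nesting. It is written in Python only for brevity and is not Bioperl code; the class and method names (FPCSet, Contig, BAC, span, bac_count) are placeholders I made up for illustration, and the only assumption about the data is that each BAC carries integer band coordinates for start/end.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class BAC:
    """A fingerprinted clone; start/end are assumed to be FPC band coordinates."""
    name: str
    start: int
    end: int

    def length(self) -> int:
        # Inclusive coordinates, so a clone from 10 to 25 covers 16 positions.
        return self.end - self.start + 1


@dataclass
class Contig:
    """A contig holds a set of BACs."""
    number: int
    bacs: List[BAC] = field(default_factory=list)

    def span(self) -> Optional[Tuple[int, int]]:
        # Smallest start and largest end over all member clones.
        if not self.bacs:
            return None
        return (min(b.start for b in self.bacs),
                max(b.end for b in self.bacs))


@dataclass
class FPCSet:
    """Top-level object: holds a set of contigs."""
    name: str
    contigs: List[Contig] = field(default_factory=list)

    def bac_count(self) -> int:
        return sum(len(c.bacs) for c in self.contigs)
```

An IO class would then only need to build and return one of these top-level objects per call, which is what ->next_fpcset would amount to.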
> 3) Any ideas on what (if any besides Bio::Root) I should inherit from?
> 4) Should this be an interface and separate implementation or just an
> implementation? (i.e., are there other file formats/programs for physical maps?)
If you think there will be more than one storage system, e.g. files and a
database, probably good to split everything into interfaces and
implementations now. I should think you would want to do this, so I would
go for it.
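As a rough illustration of that split (again in Python for brevity, not Bioperl code), the sketch below puts the interface in an abstract class and gives one in-memory implementation; a file-backed or database-backed class could implement the same two methods. All names here (ProjectStore, InMemoryStore, largest_contig) are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class ProjectStore(ABC):
    """Interface: what client code may ask of any storage backend."""

    @abstractmethod
    def contig_numbers(self) -> List[int]:
        ...

    @abstractmethod
    def clone_names(self, contig_number: int) -> List[str]:
        ...


class InMemoryStore(ProjectStore):
    """One implementation; a file or database backend would be another."""

    def __init__(self, contigs: Dict[int, List[str]]):
        # Copy the input so later changes by the caller do not leak in.
        self._contigs = {n: list(names) for n, names in contigs.items()}

    def contig_numbers(self) -> List[int]:
        return sorted(self._contigs)

    def clone_names(self, contig_number: int) -> List[str]:
        return list(self._contigs.get(contig_number, []))


def largest_contig(store: ProjectStore) -> Optional[int]:
    """Client code written against the interface only."""
    return max(store.contig_numbers(),
               key=lambda n: len(store.clone_names(n)),
               default=None)
```

The point is that largest_contig never needs to know where the data lives, so adding a second backend later does not touch the calling code.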
> 5) What Bioperl objects should I use in construction?
> These are the ideas I have so far (after all of a day of thinking about
> it, so feel free to laugh/scorn/suggest better implementations)
> (all these classes should be prefixed with Bio::FPC)
Aha. I like your names better than mine.
Project, Clone, Contig, Marker. Great stuff.
> 1) ::Project
> This would be the main class. It would contain the information parsed
> from the top 8 or so lines of the .fpc file. It would also contain the
> rest of these objects.
> 2) ::Clone
> Obviously, this is the clone (or more properly - fingerprinted clone)
> from the fpc file. The attributes would include type (Clone, BAC, PAC)
> name, bands, sizes (if available), a few dates (creation,
> modification), remarks (normal and fpc remarks), contig (and range),
> matching clones (parents and children; approximate, exact, and pseudo),
> markers, etc. Basically anything you might find as the /^(\w+)/ of the
> line in a .fpc file.
> In typing that out, it seems that maybe the contig and range that a
> clone hits would best be implemented as a type of RangeI class, which is
> more apparent now that I typed that sentence. Moving on then...
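A range object along those lines could look roughly like the sketch below. It is Python rather than Perl and does not use Bio::RangeI itself; the method names overlaps and contains are chosen to echo the ones RangeI offers, while the class name ContigRange and the rule that ranges on different contigs never overlap are my own assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContigRange:
    """Where a clone sits: a contig number plus inclusive start/end."""
    contig: int
    start: int
    end: int

    def __post_init__(self):
        if self.start > self.end:
            raise ValueError("start must not exceed end")

    def length(self) -> int:
        return self.end - self.start + 1

    def overlaps(self, other: "ContigRange") -> bool:
        # Assumption: ranges on different contigs are never comparable.
        return (self.contig == other.contig
                and self.start <= other.end
                and other.start <= self.end)

    def contains(self, other: "ContigRange") -> bool:
        return (self.contig == other.contig
                and self.start <= other.start
                and other.end <= self.end)
```

With this in place, a question such as "which clones does this clone overlap in its contig" becomes a filter on overlaps rather than hand-written coordinate checks.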
> 3) ::Contig
> Contig number, datetime, status (Ok, NoCB, Avoid, NoAce, Dead), #Q's,
> 4) ::Marker
> Type (STS, eMRK, whatever), date (create,mod), Global position (if
> anchored to framework)
> That's basically it for the objects. Although the contigrange might
> need to be an object inherited from RangeI.
> So now I need some input, and we'll see if I can't get started coding
> Jamie Hatfield                      Room 541H, Marley Building
> Systems Programmer                  University of Arizona
> Arizona Genomics Computational      Tucson, AZ 85721
> Laboratory (AGCoL)                  (520) 626-9598