[Bioperl-l] Bio::FPC

Martin Krzywinski martink@bcgsc.ca
Fri, 15 Nov 2002 12:33:02 -0800


An FPC parser would be very useful, especially for .cor files, where the
actual clone fragment information is stored. Where do I sign up ;)

We have unpublished code that models intersections (scored with the Sulston
score, using simple and Needleman-Wunsch alignment), differences, and unions
of fingerprints, and that provides graphical output of fingerprints. I also
have some unit conversion code (standard mobility <-> fragment size) and a
clone name parser. It would be great to structure things together to include
this where appropriate.
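
To make the intersection idea concrete, here is a minimal sketch of the
"simple" matching case only (no Needleman-Wunsch alignment). It is not the
unpublished code mentioned above. The function names, the greedy two-pointer
matching, and the tolerance and gel_length parameters are assumptions made
for illustration, and the score follows the binomial-tail form usually given
for the Sulston score, which should be checked against the FPC documentation.

```python
from math import comb


def count_matches(bands_a, bands_b, tolerance):
    """Count one-to-one band matches within +/- tolerance (greedy, sorted scan)."""
    a, b = sorted(bands_a), sorted(bands_b)
    i = j = matches = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            matches += 1      # pair the two bands and consume both
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1            # a[i] is too small to match anything left in b
        else:
            j += 1            # b[j] is too small to match anything left in a
    return matches


def sulston_score(bands_a, bands_b, tolerance, gel_length):
    """Probability that at least the observed number of matches occurs by chance.

    Assumed form: sum_{k=M}^{nL} C(nL, k) * (1 - p)^k * p^(nL - k),
    with p = (1 - 2 * tolerance / gel_length) ** nH,
    nL and nH the smaller and larger band counts, M the observed matches.
    """
    n_low, n_high = sorted((len(bands_a), len(bands_b)))
    m = count_matches(bands_a, bands_b, tolerance)
    p = (1.0 - 2.0 * tolerance / gel_length) ** n_high  # one band matches nothing
    return sum(
        comb(n_low, k) * (1.0 - p) ** k * p ** (n_low - k)
        for k in range(m, n_low + 1)
    )
```

A lower score means the overlap is less likely to be coincidental, so two
clones with many shared bands score close to zero.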

The only issue I see with the Bio::FPC namespace is that it refers to a
specific implementation and data storage format (FPC) rather than an abstract
data model. Perhaps something like Bio::Fingerprint or Bio::FP would be
better, with ::FPC included as an I/O layer. You should be able to create a
Fingerprint object, for example, from a .cor file, from a fasta file (through
in-silico digestion), or even from a .sizes/.bands file produced by Image.
Here we convert all our FPC maps into MySQL to use with iCE (Internet Contig
Explorer).
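
The separation of data model and I/O could look roughly like the sketch
below: a format-independent Fingerprint that several sources can build. The
class and method names are hypothetical, and only two constructors are shown
(a plain list of sizes and an in-silico digest of a sequence). Readers for
.cor or .sizes/.bands files are left out because their layouts are not
described here. The default site is HindIII (A^AGCTT), chosen only as an
example, and the sequence is treated as linear.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprint:
    """A fingerprint reduced to its essentials: sorted fragment sizes in bp."""
    sizes: tuple

    @classmethod
    def from_sizes(cls, sizes):
        """Build from any iterable of fragment sizes, whatever file it came from."""
        return cls(tuple(sorted(int(s) for s in sizes)))

    @classmethod
    def from_sequence(cls, sequence, site="AAGCTT", cut_offset=1):
        """In-silico digest of a linear sequence at every occurrence of `site`.

        cut_offset is the number of bases into the site where the cut falls
        (1 for HindIII, which cuts A^AGCTT).
        """
        seq = sequence.upper()
        cuts = []
        pos = seq.find(site)
        while pos != -1:
            cuts.append(pos + cut_offset)
            pos = seq.find(site, pos + 1)
        edges = [0] + cuts + [len(seq)]
        # Fragment sizes are the gaps between consecutive cut positions.
        return cls.from_sizes(b - a for a, b in zip(edges, edges[1:]) if b > a)
```

A format-specific layer (FPC, Image output, fasta) would then only need to
produce sizes or a sequence and hand them to one of these constructors.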

> from the fpc file.  The attributes would include type (Clone, BAC, PAC)
> name, bands[], sizes[] (if available), a few dates (creation,
> modification), remarks (normal and fpc remarks), contig (and range),

The clone itself doesn't necessarily need to be associated with any
restriction fragments. This is why I think it would be good for the FPC
module set to interact with a middle fingerprint/clone layer.
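
In code, this amounts to making the fragment data optional on the clone. A
minimal sketch, with hypothetical class, field, and clone names:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Clone:
    """A clone that may or may not have been fingerprinted."""
    name: str
    clone_type: str = "Clone"                          # e.g. Clone, BAC, PAC
    fragment_sizes: Optional[Tuple[int, ...]] = None   # None: no fingerprint

    def is_fingerprinted(self):
        """True only when fragment data is attached to this clone."""
        return self.fragment_sizes is not None
```

An FPC reader would fill in fragment_sizes when a matching .cor entry
exists and leave it as None otherwise.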

Best regards,

Martin


Martin Krzywinski
Genome Sciences Centre
600 W 10th Avenue
Vancouver BC V5Z 4E6
Canada
tel 604.877.6086
fax 604.877.6085
http://www.bcgsc.ca

----- Original Message -----
From: "Jamie Hatfield" <jamie@genome.arizona.edu>
To: <bioperl-l@bioperl.org>
Sent: Friday, November 15, 2002 8:15 AM
Subject: [Bioperl-l] Bio::FPC


> Hello all, I need some advice.
>
> I work at the Arizona Genomics Institute under Dr. Cari Soderlund (if
> you don't know her, she used to work at the Sanger Centre, where she
> developed FPC - FingerPrinted Contigs - probably the most used software
> for physical map construction.  She's here in Tucson, AZ after a short
> hiatus in Clemson, SC)  Anyway, I've re-introduced our group to Bioperl
> and we are starting to take advantage of it wherever possible.  Cari
> had seen Bioperl before, but that was pre 1.0 days, when things weren't
> stable enough (in her opinion) for a production environment, after which
> point, she never got around to looking into it again.
>
> I noticed in some document from a presentation given by one of the
> Bioperl bigwigs (might have been LStein), that an FPC parser was a common
> request.  If that's true, we know fpc probably as well as anybody else
> so it would make sense for us to develop/maintain it.
>
> So now we would like to make a contribution.  Don't get too excited
> yet... It's not programmed yet.  But we have found that in many, many
> different areas we need to read a .fpc file (and corresponding .cor
> file) and Do Something(c) with it.  At the same time, I want to get more
> familiar with Bioperl.  I've done fairly simple things, like reading in
> fasta/genbank/swissprot format files and working with alignments (as you
> all saw in my EST Alignment questions).
>
> The advice I want is as follows:
> 1) Where are the standards/guidelines for writing Bioperl modules?
> 2) Any ideas on what features/functionality Bio::FPC should have?
> 3) Any ideas on what (if any besides Bio::Root) I should inherit from?
> 4) Should this be an interface and separate implementation or just an
> implementation?
>    (i.e., are there other file formats/programs for physical maps?)
> 5) What Bioperl objects should I use in construction?
>
> These are the ideas I have so far (after all of a day of thinking about
> it, so feel free to laugh/scorn/suggest better implementations)
> (all these classes should be prefixed with Bio::FPC)
>
> 1) ::Project
>   This would be the main class.  It would contain the information parsed
> from the top 8 or so lines of the .fpc file.  It would also contain the
> rest of these objects.
>
> 2) ::Clone
>   Obviously, this is the clone (or more properly - fingerprinted clone)
> from the fpc file.  The attributes would include type (Clone, BAC, PAC)
> name, bands[], sizes[] (if available), a few dates (creation,
> modification), remarks (normal and fpc remarks), contig (and range),
> matching clones (parents and children; approximate, exact, and pseudo),
> markers, etc.  Basically anything you might find as the /^(\w+)/ of the
> line in a .fpc file.
>
>   In typing that out, it seems that maybe the contig and range that a
> clone hits would best be implemented as a type of RangeI class, which is
> more apparent now that I typed that sentence.  Moving on then...
>
> 3) ::Contig
>   Contig number, datetime, status (Ok, NoCB, Avoid, NoAce, Dead), #Q's,
> description.
>
> 4) ::Marker
>   Type (STS, eMRK, whatever), date (create,mod), Global position (if
> anchored to framework)
>
> That's basically it for the objects.  Although the contigrange might
> need to be an object inherited from RangeI.
>
> So now I need some input, and we'll see if I can't get started coding
> this.
>
> Thanks!
>
> ----------------------------------------------------------------------
> Jamie Hatfield                              Room 541H, Marley Building
> Systems Programmer                          University of Arizona
> Arizona Genomics Computational              Tucson, AZ  85721
>   Laboratory (AGCoL)                        (520) 626-9598
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
>