[Biojava-dev] Biojava.util package?

Paolo Pavan paolo.pavan at gmail.com
Thu Mar 29 17:37:10 UTC 2012


Just to add my two cents: I believe BioPerl's intent is to use the
Abstract Factory design pattern.
http://en.wikipedia.org/wiki/Abstract_factory_pattern

Have a look at the example at the link above: the GUIFactory class
corresponds to BioPerl's SeqIO, and WinFactory and OSXFactory could be, for
example, a FastaParser and a GenBankParser that read or create a FASTA or a
GenBank file, respectively.
The advantage of this approach is that BioPerl (the library) is only aware of
the abstract classes/interfaces Bio::Seq::SeqFactory (WinFactory, OSXFactory)
and PrimarySeqI (Button), so anyone can implement a parser as a new
Bio::Seq::SeqFactory.
The end user only uses SeqIO (GUIFactory), which encapsulates all the rest
and can be configured with any kind of SeqFactory object. (BioPerl can also
configure itself by choosing the appropriate factory based on the file
extension, but this is not compulsory.)
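
To make the mapping concrete, here is a rough Java sketch of that
arrangement. All names in it (SimpleSequence, SequenceFactory, FastaFactory,
SequenceReader) are made up for illustration and are not BioJava or BioPerl
classes. The FASTA handling is deliberately minimal and assumes plain '>'
header lines followed by residue lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; none of these are BioJava or BioPerl classes.

/** Abstract product: the only sequence type the library core knows (the "Button" role). */
interface SimpleSequence {
    String id();
    String residues();
}

/** Abstract factory: one implementation per file format (the "WinFactory"/"OSXFactory" role). */
interface SequenceFactory {
    List<SimpleSequence> read(BufferedReader in) throws IOException;
}

/** Concrete factory for a minimal FASTA dialect: '>' header lines, then residue lines. */
class FastaFactory implements SequenceFactory {
    @Override
    public List<SimpleSequence> read(BufferedReader in) throws IOException {
        List<SimpleSequence> result = new ArrayList<>();
        String id = null;
        StringBuilder residues = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith(">")) {
                // A new header closes the previous record, if there was one.
                if (id != null) {
                    result.add(record(id, residues.toString()));
                }
                id = line.substring(1).trim();
                residues.setLength(0);
            } else if (id != null) {
                residues.append(line);
            }
        }
        if (id != null) {
            result.add(record(id, residues.toString()));
        }
        return result;
    }

    private static SimpleSequence record(final String id, final String residues) {
        return new SimpleSequence() {
            @Override public String id() { return id; }
            @Override public String residues() { return residues; }
        };
    }
}

/** User-facing entry point (the "SeqIO" role): depends only on the two interfaces. */
class SequenceReader {
    private final SequenceFactory factory;

    SequenceReader(SequenceFactory factory) {
        this.factory = factory;
    }

    /** Optional convenience: choose a factory from the file extension. */
    static SequenceReader forFileName(String name) {
        if (name.endsWith(".fasta") || name.endsWith(".fa")) {
            return new SequenceReader(new FastaFactory());
        }
        throw new IllegalArgumentException("No factory registered for " + name);
    }

    List<SimpleSequence> read(BufferedReader in) throws IOException {
        return factory.read(in);
    }
}
```

Supporting another format would then mean writing one more SequenceFactory
implementation. SequenceReader and the code that calls it would not need to
change.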

But others here who surely know more than I do could add further
details.

Best regards,
Paolo




2012/3/29 Michael Heuer <heuermh at gmail.com>

> David Felty <davfelty at gmail.com> wrote:
>
> > I've actually been working on something like this for my GSoC proposal,
> > here's what I came up with:
> >
> > public class SeqIO {
> >    public static final int FASTA = 0;
> >    public static final int FASTQ = 1;
> >    public static final Class<DNASequence> DNA = DNASequence.class;
> >    public static final Class<ProteinSequence> PROTEIN = ProteinSequence.class;
> >
> >    @SuppressWarnings("unchecked")
> >    public static <S extends Sequence> Iterable<S> parse(InputStream is, int fileFormat, Class<S> seqType) throws Exception {
> >        switch (fileFormat) {
> >            case FASTA:
> >                if (seqType == DNA) {
> >                    // readFastaDNASequence returns a map keyed by header, so iterate its values
> >                    return (Iterable<S>) FastaReaderHelper.readFastaDNASequence(is).values();
> >                } else if (seqType == PROTEIN) {
> >                    // etc...
> >                }
> >                break;
> >            case FASTQ:
> >                // etc...
> >        }
> >        // Reached only when no branch above returned a result
> >        throw new IllegalArgumentException("Unsupported file format or sequence type");
> >    }
> > }
> >
> > It would be used like so:
> >
> > InputStream is = ...
> > Iterable<DNASequence> seqs = SeqIO.parse(is, SeqIO.FASTA, SeqIO.DNA);
> > for (DNASequence s : seqs) {
> >   // do something
> > }
> >
> > Obviously it's not the prettiest and a lot could be changed, but that's my
> > initial design. I tried to base it off BioPython's SeqIO, but static typing
> > and the variety of Sequence types forced me to put in some nasty generics.
> > Any tips would be appreciated!
>
> Hello David,
>
> You might also want to look at the cookbook examples for FASTQ parsing
>
> http://biojava.org/wiki/BioJava:Cookbook:SeqIO:FASTQ
>
> http://biojava.org/wiki/BioJava:CookBook3:FASTQ
>
> These APIs are different from the biojava-legacy FASTA parsing APIs
> and from the biojava3 FASTA parsing APIs for various historical
> reasons.  Perhaps there might be something good to come from pulling
> the best together from all of these.
>
>   michael